Interpretable Data-Driven Prediction, Optimization, and Decision-Making for Coking Coal Flotation

Wang, Ying; Cui, Deqian

doi:10.3390/pr14081289

Open Access Article

by Ying Wang and Deqian Cui *

School of Management, China University of Mining and Technology-Beijing, Beijing 100083, China

* Author to whom correspondence should be addressed.

Processes 2026, 14(8), 1289; https://doi.org/10.3390/pr14081289

Submission received: 17 March 2026 / Revised: 15 April 2026 / Accepted: 16 April 2026 / Published: 17 April 2026

(This article belongs to the Special Issue Mineral Processing Equipments and Cross-Disciplinary Approaches)


Abstract

Coking coal flotation is a typical nonlinear, multi-variable, and multi-objective process in which concentrate quality and combustible matter recovery must be balanced under fluctuating feed and operating conditions. To improve both predictive reliability and decision support, this study proposes an integrated data-driven framework that combines particle swarm optimization-back propagation (PSO-BP) prediction, SHapley Additive exPlanations (SHAP) based interpretation, Non-dominated Sorting Genetic Algorithm II (NSGA-II) optimization, and entropy-weighted Technique for Order Preference by Similarity to Ideal Solution (Entropy-TOPSIS) decision-making. After three-sigma outlier screening, 2000 valid distributed control system (DCS) samples were retained for model development and temporal holdout evaluation, and an additional 200 later-period industrial samples were used for independent validation. The data were partitioned chronologically, with months 1–4, month 5, and month 6 used for training, validation, and temporal holdout testing, respectively, while the months 7–8 dataset was reserved for later-period validation. The results show that PSO-BP consistently outperformed conventional BP under both temporal holdout and later-period validation. SHAP analysis identified raw coal ash and collector dosage as the dominant factors for product-quality prediction, while collector dosage and frother dosage contributed most strongly to tailing heat of combustion. NSGA-II further revealed the trade-off among clean coal ash, clean coal sulfur, and tailing heat of combustion, and Entropy-TOPSIS converted the Pareto-optimal candidate set into a practically balanced operating recommendation. Sensitivity and robustness analyses indicated acceptable stability of both the optimization process and the final decision result. 
Overall, the proposed framework provides an interpretable prediction–optimization–decision workflow for coking coal flotation and offers a practical basis for future DCS-assisted intelligent regulation.

Keywords: coking coal flotation; PSO-BP neural network; SHAP interpretability; NSGA-II; Entropy-TOPSIS; data-driven optimization


1. Introduction

Coking coal is a key raw material for blast-furnace ironmaking and steel production, and its efficient utilization remains of great industrial importance. Among the available upgrading technologies, froth flotation is widely used for the separation of fine coal because of its ability to remove ash-forming mineral matter and improve product quality [1]. However, the coking coal flotation process is inherently complex due to the strong coupling among feed properties, reagent regime, hydrodynamic conditions, and separation responses [2,3]. In industrial practice, operators must continuously balance multiple conflicting objectives, especially the trade-off between clean coal quality and resource recovery. This makes flotation control and optimization a typical nonlinear, multi-variable, and multi-objective problem [4].

The flotation performance of coking coal is jointly affected by both feed characteristics and operating variables. Previous studies have shown that raw coal ash, particle size distribution, collector dosage [5,6,7], frother dosage, pulp solids mass fraction, and air volumetric flow rate all influence combustible recovery, product ash, and separation selectivity to different degrees [8,9]. Compared with approaches relying solely on froth images, process-parameter-based modeling has the advantage of being more directly linked to operational adjustment, making it more suitable for process optimization and engineering decision support [10,11]. Therefore, establishing a reliable quantitative relationship between key process variables and flotation outcomes is essential for improving operational stability and supporting data-driven setpoint adjustment.

With the rapid development of artificial intelligence and machine learning, data-driven modeling has become an important tool for flotation prediction. Early studies demonstrated the feasibility of using Back Propagation (BP) neural networks to predict flotation performance from process or image features [12,13]. Subsequently, Radial Basis Function (RBF) networks, Long Short-Term Memory (LSTM) models [14,15], and a variety of deep learning architectures further improved predictive accuracy in different flotation scenarios. In recent years, Convolutional Neural Networks (CNN), Convolutional Neural Network-Back Propagation (CNN-BP) hybrid models, Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) frameworks [16,17], and multi-scale networks have been increasingly applied to froth image recognition and quality prediction [18,19]. These studies have significantly advanced flotation monitoring and soft sensing. Nevertheless, image-based deep learning models often depend strongly on imaging quality, illumination conditions, and hardware stability [20,21,22]. For the present task, which focuses on structured industrial process variables and a limited number of plant samples, a compact surrogate model with stable training behavior, interpretable post hoc analysis, and a relatively smooth response surface is more suitable for subsequent optimization [23]. In this context, particle swarm optimization-back propagation (PSO-BP) provides a practical compromise between predictive capability and engineering applicability.

Despite these advances, two important gaps remain in current research. First, many prediction studies focus primarily on improving accuracy, while providing limited analysis of how model responses relate to known flotation behavior [24,25]. For industrial deployment, it is not sufficient for a model to be accurate; its predictions should also be interpretable in terms of variable importance, nonlinear response regions, and process-relevant trends. Second, existing optimization studies often stop at the level of obtaining Pareto-optimal solution sets. Although evolutionary algorithms such as Non-dominated Sorting Genetic Algorithm II (NSGA-II) can effectively characterize trade-offs among conflicting objectives [26], plant engineers still face the practical challenge of selecting a single operating condition from multiple mathematically feasible candidates [27,28]. Without a transparent decision-making layer, it remains difficult to translate optimization results into actionable operating recommendations.

To address these issues, this study develops an interpretable data-driven framework for coking coal flotation that integrates prediction, interpretation, optimization, and decision-making [29,30]. First, a PSO-BP model is established to describe the nonlinear relationships between key process variables and flotation indicators [31]. Second, SHapley Additive exPlanations (SHAP) and partial dependence analysis are introduced to quantify feature contributions and reveal major nonlinear response patterns, thereby improving the interpretability of the predictive model [32]. Third, NSGA-II is employed to generate Pareto-optimal operating solutions under multiple objectives, and an entropy-weighted Technique for Order Preference by Similarity to Ideal Solution (Entropy-TOPSIS) method is further used to rank the Pareto set and recommend a practically preferable operating point [33,34]. The novelty of this work lies not in proposing entirely new standalone algorithms, but in constructing a coherent prediction–interpretation–optimization–decision workflow tailored to the operational characteristics of coking coal flotation.

2. Methodology

2.1. Process Description and Dataset Construction

2.1.1. Variable System Definition

Coking coal flotation is a typical gas–liquid–solid multiphase separation process in which reagent chemistry, feed properties, and hydrodynamic conditions jointly determine separation performance. In order to establish a compact yet operationally meaningful modeling framework, the input variables were selected according to two principles: process relevance and controllability [28,29]. Specifically, collector dosage ($D_{c}$), frother dosage ($D_{f}$), pulp solids mass fraction ($w_{Spulp}$), and air volumetric flow rate ($q_{V}$) were treated as manipulated variables because they can be directly adjusted during plant operation and have well-recognized influences on bubble–particle attachment, froth stability, and separation selectivity [12,30]. Raw coal ash mass fraction ($w_{Araw}$) was introduced as a state variable to represent feed quality fluctuations, which are not directly controllable but substantially affect flotation responses. The schematic diagram of the flotation principle and process is illustrated in Figure 1, where the main manipulated variables and representative hydrodynamic descriptors, including bubble diameter and slurry residence time, are highlighted to improve the physical interpretability of the process description [35].

The output vector was defined by three process-performance indicators, namely clean coal ash mass fraction ($w_{Aclean}$), clean coal sulfur mass fraction ($w_{Sclean}$), and tailing heat of combustion ($e_{tail}$). These variables were selected because they jointly reflect product quality, environmental compliance, and combustible matter loss in tailings. Among them, clean coal ash represents the primary quality index, clean coal sulfur is closely related to coke-making suitability and emission requirements, and tailing heat of combustion serves as an engineering proxy for resource loss. The resulting input–output structure therefore provides a physically meaningful and decision-oriented basis for subsequent prediction, interpretation, and multi-objective optimization. The variable definitions and operating ranges are summarized in Table 1.

2.1.2. Data Preprocessing and Multidimensional Sampling Strategy

The core dataset used for model development was obtained from the Distributed Control System (DCS) of a coking coal preparation plant in Shanxi Province, China, covering the first six months of relatively stable industrial operation. Initially, the raw DCS dataset contained 2025 samples. After outlier screening based on the three-sigma rule, 25 abnormal samples were removed, yielding 2000 valid samples for model development and temporal holdout evaluation. These 2000 samples were partitioned chronologically into a training subset from months 1–4 (n = 1500), a validation subset from month 5 (n = 300), and a temporal holdout test subset from month 6 (n = 200). In addition, a later-period industrial dataset collected during months 7–8 (n = 200) was introduced as an independent validation set to further examine the chronological generalizability of the proposed framework. The statistical characteristics of the 2000-sample development-and-holdout dataset are summarized in Table 2. The standard deviation values are expressed in the same units as the corresponding variables. As shown in the table, the retained samples cover a sufficiently broad operating range for all input and output variables, which is suitable for nonlinear relationship modeling in coking coal flotation.
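The screening and chronological partitioning described above can be sketched as follows. This is a minimal illustration in Python using only the standard library (the study itself reports a MATLAB implementation); the function names, the per-column application of the three-sigma rule, and the toy records are assumptions made for illustration, not the authors' code or data.

```python
import statistics

def three_sigma_keep(rows):
    """Flag each row True when every column lies within mean +/- 3 standard deviations."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        all(abs(v - m) <= 3.0 * s for v, m, s in zip(row, means, stds))
        for row in rows
    ]

def chronological_split(records):
    """Split (month, values) records: months 1-4 train, month 5 validation, month 6 holdout."""
    train = [v for month, v in records if 1 <= month <= 4]
    valid = [v for month, v in records if month == 5]
    holdout = [v for month, v in records if month == 6]
    return train, valid, holdout

# Toy records (hypothetical numbers): 24 ordinary samples spread over months 1-6,
# plus one gross outlier in the second column.
records = [(1 + i % 6, (10.0 + 0.1 * i, 5.0 + 0.05 * (i % 4))) for i in range(24)]
records.append((3, (11.0, 50.0)))

keep = three_sigma_keep([values for _, values in records])
screened = [rec for rec, ok in zip(records, keep) if ok]
train, valid, holdout = chronological_split(screened)
print(len(records), len(screened), len(train), len(valid), len(holdout))
```

Splitting by calendar month rather than at random keeps every validation and holdout sample later in time than the training samples, which is the property the temporal holdout evaluation relies on.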

Before model training, all input and output variables were normalized using min–max scaling. To avoid data leakage, the normalization parameters were estimated only from the training subset and then applied to the validation, holdout-testing, and independent-validation subsets. This treatment ensures that the reported model performance reflects prediction on unseen data rather than information leaked from the full dataset.
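A minimal Python sketch of leakage-free min–max scaling is given below. The helper names and the numbers are hypothetical; the only point illustrated is that the minimum and maximum are estimated from the training subset and then reused unchanged for later data, so later samples may fall outside the interval [0, 1].

```python
def fit_min_max(train_rows):
    """Estimate per-column minimum and maximum from the training subset only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lo, hi):
    """Scale rows with previously fitted bounds; constant columns map to 0."""
    return [
        tuple((v - a) / (b - a) if b > a else 0.0 for v, a, b in zip(row, lo, hi))
        for row in rows
    ]

# Hypothetical two-variable samples.
train = [(100.0, 20.0), (200.0, 30.0), (150.0, 25.0)]
later = [(250.0, 18.0)]  # lies outside the training range on both variables

lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
later_scaled = apply_min_max(later, lo, hi)
print(train_scaled)
print(later_scaled)
```

Values outside [0, 1] in the later subset are expected under this protocol and signal that the sample lies outside the training operating range.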

The sulfur-content range in the present dataset is comparatively narrow because the sampled period corresponds to relatively stable industrial operation under a relatively consistent raw-coal blending regime. Accordingly, the sulfur indicator in this study mainly reflects the variation range encountered under the current production conditions, rather than deliberately expanded extreme or abnormal sulfur scenarios.

2.2. PSO-BP Model Development

To establish a reliable nonlinear surrogate model for coking coal flotation, a hybrid PSO-BP neural network was developed in this study. The BP network was used to characterize the nonlinear mapping between process variables and flotation indicators, while Particle Swarm Optimization (PSO) was introduced to improve the initialization of network parameters before gradient-based supervised learning. The overall workflow of the PSO-BP modeling procedure is illustrated in Figure 2.

2.2.1. Network Architecture and Topology Selection

According to the variable system defined in Section 2.1, the network consisted of five input neurons corresponding to collector dosage ($D_{c}$), frother dosage ($D_{f}$), pulp solids mass fraction ($w_{Spulp}$), air volumetric flow rate ($q_{V}$), and raw coal ash mass fraction ($w_{Araw}$), and three output neurons corresponding to clean coal ash mass fraction ($w_{Aclean}$), clean coal sulfur mass fraction ($w_{Sclean}$), and tailing heat of combustion ($e_{tail}$). A single-hidden-layer feedforward architecture was adopted because of its ability to approximate nonlinear mappings with relatively low structural complexity, which is suitable for the present task involving structured industrial process variables and limited plant samples. The hidden layer used the logistic sigmoid transfer function, whereas the output layer used a linear transfer function. Accordingly, the network architecture can be denoted as 5-$n_{h}$-3, where $n_{h}$ is the number of hidden neurons.

The number of hidden neurons is a key hyperparameter affecting the trade-off between predictive accuracy and computational efficiency. Therefore, a quantitative topology-sensitivity analysis was performed by varying the hidden-layer size from 5 to 25. For each candidate topology, 10 independent training trials were conducted under a fixed random-seed setting, and the average RMSE, its standard deviation, and the corresponding training time were jointly examined. As shown in Figure 3, the topology with 11 hidden neurons provided the most favorable balance between prediction accuracy, stability, and computational cost. Accordingly, the hidden-layer size was set to 11 in the subsequent PSO-BP and BP modeling procedures. This treatment reduces the risk of selecting a topology based on a single stochastic training outcome and improves the reproducibility of the architecture selection process.

The forward propagation of the network can be expressed as

h = f(W_{1} x + b_{1}), \quad \hat{y} = g(W_{2} h + b_{2})

(1)

where $x$ is the input vector, $h$ is the hidden-layer output, and $\hat{y}$ is the predicted output vector; $W_{1}$ and $W_{2}$ denote the connection weight matrices, $b_{1}$ and $b_{2}$ denote the corresponding bias vectors, and $f$ and $g$ are the hidden-layer (logistic sigmoid) and output-layer (linear) transfer functions, respectively.
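As an illustration of Equation (1), the following Python sketch evaluates one forward pass of a 5-11-3 network with a logistic sigmoid hidden layer and a linear output layer. The weights are random placeholders and the input is a hypothetical normalized sample; nothing here reproduces the trained model.

```python
import math
import random

def logsig(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Return hidden activations h = f(W1 x + b1) and outputs y = W2 h + b2."""
    h = [logsig(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y

n_in, n_hidden, n_out = 5, 11, 3
rng = random.Random(0)  # placeholder weights, not fitted values
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]

x = [0.4, 0.6, 0.5, 0.3, 0.7]  # hypothetical min-max scaled inputs
h, y_hat = forward(x, W1, b1, W2, b2)
print(len(h), len(y_hat))
```

Because the output layer is linear, the predictions are not confined to (0, 1); only the hidden activations are.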

2.2.2. PSO-Based Parameter Initialization and Supervised Training

To reduce the sensitivity of BP training to random initialization, PSO was used to optimize the initial weights and biases of the network. In the PSO stage, each particle represented a candidate parameter vector containing all weights and bias terms of the 5-11-3 network. Particle updating followed the standard PSO equations:

v_{id}(t + 1) = \omega v_{id}(t) + c_{1} r_{1} (P_{best,id} - x_{id}(t)) + c_{2} r_{2} (G_{best,d} - x_{id}(t))

(2)

x_{id}(t + 1) = x_{id}(t) + v_{id}(t + 1)

(3)

where $x_{i}$ and $v_{i}$ denote the position and velocity of the $i$-th particle at iteration $t$ (with $d$ indexing the dimension), $P_{best}$ denotes the individual best position, $G_{best}$ denotes the global best position, $\omega$ is the inertia weight, $c_{1}$ and $c_{2}$ are learning factors, and $r_{1}$ and $r_{2}$ are random numbers uniformly distributed in $[0, 1]$. The fitness function is defined as the Mean Squared Error (MSE) on the training set.
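The update rules in Equations (2) and (3) can be sketched in Python as below. For a 5-11-3 network each particle would hold 5×11 + 11 + 11×3 + 3 = 102 parameters. The inertia weight and learning factors used here (0.7, 1.5, 1.5), the initialization range, and the sphere function standing in for the training-set MSE are all assumptions made for illustration; the source does not report these values.

```python
import random

def pso_minimize(fitness, dim, n_particles=30, n_iter=20,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO; returns the best position, its fitness, and the best-fitness history."""
    rng = random.Random(seed)
    x = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]
    p_fit = [fitness(xi) for xi in x]
    g = min(range(n_particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[g][:], p_fit[g]
    history = [g_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Equation (2): inertia + cognitive pull + social pull
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Equation (3): position update
                x[i][d] += v[i][d]
            f = fitness(x[i])
            if f < p_fit[i]:
                p_best[i], p_fit[i] = x[i][:], f
                if f < g_fit:
                    g_best, g_fit = x[i][:], f
        history.append(g_fit)
    return g_best, g_fit, history

def sphere(params):
    """Stand-in fitness; in the described workflow this would be the training MSE."""
    return sum(p * p for p in params)

n_params = 5 * 11 + 11 + 11 * 3 + 3  # weights and biases of a 5-11-3 network
best, best_fit, history = pso_minimize(sphere, n_params)
print(n_params, history[0], best_fit)
```

In the described procedure the best particle would then be reshaped into the weight matrices and bias vectors and handed to gradient-based training as the starting point.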

In the implemented procedure, the PSO population size was set to 30 and the maximum number of iterations was set to 20. After PSO convergence, the optimized parameter vector was assigned to the BP network as the initial weights and biases. The initialized network was then further trained using the Levenberg–Marquardt algorithm. To ensure a fair comparison, the same supervised training configuration was adopted for both PSO-BP and conventional BP, including the maximum number of epochs (1000), learning rate (0.1), training goal ($1 \times 10^{-5}$), momentum coefficient (0.01), minimum performance gradient ($1 \times 10^{-6}$), and maximum validation failure count (6). The implemented PSO-BP model adopted trainlm for supervised training, with logsig and purelin used in the hidden and output layers, respectively. All neural network training, data processing, and optimization procedures were implemented in MATLAB R2022b (MathWorks, Natick, MA, USA).

To reduce the influence of stochastic fluctuations, repeated runs with different random initializations were conducted for the neural-network-based models, and the averaged performance is reported in Section 3.

2.2.3. Data Partitioning and Normalization

The PSO-BP model was developed using the chronological partitioning protocol described in Section 2.1. During model development, the training subset (months 1–4) was used for parameter fitting, the validation subset (month 5) was used for model selection and early stopping, the temporal holdout subset (month 6) was used for final testing within the development period, and the later-period industrial dataset (months 7–8) was used for independent validation. This design enables the evaluation of both temporal generalizability and later-period industrial applicability.

For benchmark comparison, a conventional BP neural network was implemented using the same data partitioning and preprocessing protocol. In particular, the BP baseline adopted the same 5-11-3 architecture and the same supervised training configuration as the PSO-BP model, except that its initial weights and biases were not optimized by PSO. Model performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE).
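For reference, the three evaluation metrics can be computed as in the following Python sketch. The observed and predicted values are made-up numbers chosen so the arithmetic is easy to follow; they are not plant data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and MAE for one output variable."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }

y_true = [1.0, 2.0, 3.0, 4.0]   # hypothetical observations
y_pred = [1.1, 1.9, 3.2, 3.8]   # hypothetical predictions
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Here the residual sum of squares is 0.10 and the total sum of squares is 5.0, so R² is 0.98, RMSE is the square root of 0.025, and MAE is 0.15. RMSE and MAE carry the units of the output, so they are not comparable across outputs with different scales, whereas R² is.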

2.3. Interpretability Analysis Based on SHAP

To improve the interpretability of the PSO-BP model, SHapley Additive exPlanations (SHAP) were introduced to quantify the contribution of input variables to the model predictions. SHAP decomposes the prediction into a baseline term and the marginal contributions of individual features. For feature $i$, the SHAP value can be expressed as

\phi_{i}(f, x) = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \, (|F| - |S| - 1)!}{|F|!} \left[ f(S \cup \{i\}) - f(S) \right]

(4)

where $F$ denotes the full feature set, $S$ represents a subset of features excluding feature $i$, $f$ denotes the trained prediction model, and $f(S)$ is the model output when only the features in $S$ are known.
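With only five inputs, Equation (4) can be evaluated exactly by enumerating all subsets. The Python sketch below does this for a toy model. It assumes that "unknown" features are replaced by a fixed baseline value, which is one common simplification and not necessarily the estimator used in the study; the toy model and the numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside the coalition take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (|F| - |S| - 1)! / |F|! from Equation (4)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical response with one interaction term and one unused input."""
    return 2.0 * z[0] + z[1] * z[2] - z[3]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
baseline = [0.0] * 5
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, the interaction term is shared equally between its two inputs, and the unused input receives zero.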

In this study, SHAP was used in two ways. First, global feature importance was evaluated using the mean absolute SHAP values to rank the contributions of the input variables to each flotation indicator. Second, SHAP dependence analysis and partial dependence analysis were used to examine nonlinear response patterns of the outputs to individual operating variables within the observed data range [26]. In addition, representative two-variable response surfaces were used to visualize coupled effects between selected operating variables.
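A one-dimensional partial dependence curve can be sketched in Python as follows: the feature of interest is set to each grid value in turn while the remaining features keep their observed values, and the model outputs are averaged over the samples. The toy model, samples, and grid are hypothetical.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model output over rows with one feature forced to each grid value."""
    curve = []
    for g in grid:
        preds = [
            model([g if j == feature else v for j, v in enumerate(row)])
            for row in rows
        ]
        curve.append(sum(preds) / len(preds))
    return curve

def toy_model(z):
    """Hypothetical response with a minimum at z[0] = 1."""
    return (z[0] - 1.0) ** 2 + z[1]

rows = [(0.0, 1.0), (2.0, 3.0)]
grid = [0.0, 1.0, 2.0]
curve = partial_dependence(toy_model, rows, feature=0, grid=grid)
print(curve)
```

The resulting curve falls and then rises, which is the kind of non-monotonic pattern a partial dependence plot is meant to expose. The grid should stay within the observed data range, since the surrogate is not validated outside it.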

It should be noted that the SHAP and partial dependence plot (PDP) based analyses in this study were intended for model interpretation rather than direct physicochemical verification. Therefore, the identified nonlinear response regions should be understood as data-driven patterns learned by the surrogate model, rather than direct proof of adsorption thresholds or other mechanism-level boundaries.

2.4. Multi-Objective Optimization and Decision Making

2.4.1. Multi-Objective Optimization via NSGA-II

Because coking coal flotation involves trade-offs among product quality, sulfur control, and combustible matter recovery, the optimization task was formulated as a multi-objective problem. The decision variables were defined as collector dosage ($D_{c}$), frother dosage ($D_{f}$), pulp solids mass fraction ($w_{Spulp}$), and air volumetric flow rate ($q_{V}$), whereas raw coal ash mass fraction ($w_{Araw}$) was treated as a state/scenario variable representing feed-condition fluctuations. The multi-objective optimization problem can be written as

F(x) = \left\{ \begin{matrix} \mathrm{PSO\text{-}BP}_{A}(x, A_{raw}) \\ \mathrm{PSO\text{-}BP}_{S}(x, A_{raw}) \\ \mathrm{PSO\text{-}BP}_{H}(x, A_{raw}) \end{matrix} \right.

(5)

where $x = (D_{c}, D_{f}, w_{Spulp}, q_{V})$ is the decision vector, $A_{raw}$ is the raw coal ash mass fraction ($w_{Araw}$), and the subscripts $A$, $S$, and $H$ denote the surrogate predictions of clean coal ash, clean coal sulfur, and tailing heat of combustion, respectively, all of which are to be minimized.

As shown in Figure 4, the trained PSO-BP model was used as the surrogate model for objective evaluation during optimization. NSGA-II was adopted to obtain Pareto-optimal solutions because of its ability to preserve solution diversity while handling conflicting objectives. In the implemented procedure, the population size was set to 100, the maximum number of generations was set to 20, the crossover probability was set to 0.8, and the mutation rate was set to 0.05. These settings are consistent with the optimization code used in the study.

The output of NSGA-II was a Pareto-optimal solution set rather than a single operating condition. Therefore, an additional decision-making layer was required to convert the Pareto set into a practically selectable recommendation.
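The notion of a Pareto-optimal set can be made concrete with a short Python sketch of non-dominated sorting, the ranking step at the core of NSGA-II. This is a plain quadratic-time version written for clarity, not the fast bookkeeping variant of the original algorithm, and it omits crowding distance, crossover, and mutation. The candidate objective vectors (ash, sulfur, tailing heat; all minimized) are hypothetical.

```python
def dominates(a, b):
    """True when a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(points):
    """Peel off successive non-dominated fronts; returns lists of point indices."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# Hypothetical objective vectors: (clean coal ash, clean coal sulfur, tailing heat)
candidates = [
    (9.0, 0.60, 120.0),
    (9.5, 0.58, 110.0),
    (10.0, 0.62, 130.0),
    (8.8, 0.65, 140.0),
    (10.5, 0.66, 145.0),
]
fronts = non_dominated_fronts(candidates)
print(fronts)
```

The first front is the Pareto set: each of its members is better than the others in at least one objective, so none of them can be discarded without a preference rule.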

2.4.2. Decision Making Based on Entropy-TOPSIS

Although NSGA-II provides a Pareto frontier, plant operation typically requires a practically selectable operating condition. Therefore, Entropy Weighting and TOPSIS were used to rank the Pareto solutions and identify a preferred operating point.

First, the Pareto objective matrix was normalized, and the entropy weight of each objective was calculated as

e_{j} = -k \sum_{i=1}^{m} p_{ij} \ln p_{ij}

(6)

\omega_{j} = \frac{1 - e_{j}}{\sum_{j=1}^{n} (1 - e_{j})}

(7)

where $p_{ij}$ is the normalized proportion of solution $i$ under objective $j$, $m$ is the number of Pareto solutions, $k$ is a normalization constant, and $n$ is the number of objectives.
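Equations (6) and (7) can be sketched in Python as below. The sketch assumes the common choice k = 1/ln(m), which bounds each entropy in [0, 1], and takes the proportions as each column divided by its column sum; the source does not state either detail, so both are assumptions. The matrix is a hypothetical, already normalized and strictly positive objective matrix.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for the columns of a positive matrix (rows = candidate solutions)."""
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / math.log(m)  # assumed normalization constant
    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        p = [v / total for v in column]
        # Equation (6); terms with p = 0 contribute nothing
        entropies.append(-k * sum(pi * math.log(pi) for pi in p if pi > 0))
    divergence = [1.0 - e for e in entropies]
    s = sum(divergence)
    # Equation (7)
    return [d / s for d in divergence]

# Hypothetical normalized objective matrix: 3 solutions x 3 objectives.
matrix = [
    [0.2, 0.5, 0.1],
    [0.4, 0.5, 0.3],
    [0.9, 0.5, 0.8],
]
weights = entropy_weights(matrix)
print(weights)
```

An objective whose values are identical across all candidates has maximal entropy and receives essentially zero weight, whereas the objective that varies most across the Pareto set receives the largest weight.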

Then, TOPSIS was used to evaluate the relative closeness of each candidate solution to the positive and negative ideal solutions. The Euclidean distances of candidate solution $i$ to the positive ideal solution ($D_{i}^{+}$) and the negative ideal solution ($D_{i}^{-}$) were calculated as

D_{i}^{+} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_{j}^{+})^{2}}, \quad D_{i}^{-} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_{j}^{-})^{2}}

(8)

where $v_{ij}$ is the weighted normalized value of solution $i$ under objective $j$, and $v_{j}^{+}$ and $v_{j}^{-}$ are the corresponding values of the positive and negative ideal solutions. The relative closeness coefficient was defined as

C_{i} = \frac{D_{i}^{-}}{D_{i}^{+} + D_{i}^{-}}

(9)

A larger $C_{i}$ indicates that the corresponding Pareto solution is closer to the positive ideal solution and farther from the negative ideal solution. Therefore, the candidate with the highest relative closeness was selected as the preferred engineering operating point.
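Equations (8) and (9) can be sketched in Python as follows. The sketch assumes vector (Euclidean-norm) normalization of each objective column and treats all three objectives as cost-type, so the positive ideal takes the column minimum and the negative ideal the column maximum; these choices, the equal weights, and the candidate values are illustrative assumptions rather than the study's settings.

```python
import math

def topsis_cost(matrix, weights):
    """Relative closeness C_i for cost-type (to be minimized) objectives."""
    n = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    # Weighted normalized matrix v_ij
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    v_pos = [min(r[j] for r in v) for j in range(n)]  # positive ideal: smallest cost
    v_neg = [max(r[j] for r in v) for j in range(n)]  # negative ideal: largest cost
    scores = []
    for r in v:
        d_pos = math.sqrt(sum((r[j] - v_pos[j]) ** 2 for j in range(n)))  # Equation (8)
        d_neg = math.sqrt(sum((r[j] - v_neg[j]) ** 2 for j in range(n)))
        scores.append(d_neg / (d_pos + d_neg))                            # Equation (9)
    return scores

# Hypothetical Pareto candidates: (clean coal ash, clean coal sulfur, tailing heat)
pareto = [
    (9.0, 0.60, 120.0),
    (9.5, 0.58, 110.0),
    (8.8, 0.65, 140.0),
]
equal_weights = [1.0 / 3.0] * 3  # placeholder for the entropy weights
scores = topsis_cost(pareto, equal_weights)
best_index = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_index)
```

The candidate with the largest closeness coefficient would be the one reported as the recommended operating point.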

3. Results and Analysis

3.1. Predictive Performance of the PSO-BP Model

3.1.1. Temporal Holdout Validation

To evaluate the temporal generalization ability of the predictive model, a chronological data-partition strategy was adopted. Specifically, samples collected during the first four months were used for model training, the fifth-month dataset was used for validation [36], and the dataset collected in the sixth month was reserved as the temporal holdout set. The repeated-run results on the sixth-month holdout set are summarized in Table 3. For all three outputs, PSO-BP produced lower average prediction errors and higher average $R^{2}$ values than the conventional BP model. For Output 1, the average RMSE decreased from 0.033322 for BP to 0.008585 for PSO-BP, while the average $R^{2}$ increased from 0.711394 to 0.981116. For Output 2, the average RMSE decreased from 0.002661 to 0.001754, and the average $R^{2}$ increased from 0.511524 to 0.789748. For Output 3, the advantage of PSO-BP was even more pronounced, with the average RMSE reduced from 5.817046 to 1.244303 and the average $R^{2}$ improved from 0.726766 to 0.988214. These results indicate that PSO-based initialization substantially improved the predictive performance of the BP network under temporal holdout conditions.

The dynamic comparison between observed values and model predictions on the sixth-month holdout set is shown in Figure 5, and the corresponding residual comparison is given in Figure 6. As can be seen, both models reproduce the main fluctuation trend of the process variables, but the PSO-BP predictions remain consistently closer to the observed trajectories, especially in regions with stronger local variation. The residual plots further show that PSO-BP errors are more concentrated around zero, whereas the BP residuals exhibit visibly larger oscillation amplitudes. These graphical observations are consistent with the repeated-run statistics in Table 3 and support the conclusion that PSO-BP achieved better fitting consistency and stronger temporal holdout stability on the unseen sixth-month data.

3.1.2. Later-Period Industrial Validation

To further assess model robustness beyond the development period, the dataset collected during months 7–8 was introduced as an independent later-period industrial validation set. The repeated-run results are summarized in Table 4. The overall performance trend remained consistent with that observed in the sixth-month holdout test. For Output 1, PSO-BP achieved an average RMSE of 0.013448 and an average $R^{2}$ of 0.959662, whereas BP yielded 0.035418 and 0.715919, respectively. For Output 2, the average RMSE decreased from 0.004273 for BP to 0.003444 for PSO-BP, while the average $R^{2}$ increased from 0.315582 to 0.555325. For Output 3, PSO-BP again showed a marked advantage, reducing the average RMSE from 6.705057 to 1.946917 and increasing the average $R^{2}$ from 0.684925 to 0.974274. Therefore, the predictive superiority of PSO-BP was maintained not only on the sixth-month temporal holdout set, but also on the later-period industrial validation data.

The observed-versus-predicted curves for the months 7–8 validation set are shown in Figure 7, while the residual comparison is presented in Figure 8. As with the temporal holdout results, the PSO-BP curves follow the observed variation pattern more closely than the BP curves for all three outputs. The residual comparison also shows that the PSO-BP residual band is generally narrower, whereas the BP residuals are more dispersed across the sample range. This indicates that the prediction improvement associated with PSO-based initialization was not restricted to a single test period, but remained effective under chronological extrapolation to later industrial operation.

3.2. SHAP-Based Model Interpretation

To improve the interpretability of the PSO-BP model, a SHAP-based analysis framework was introduced to quantify the contribution of each input variable and to characterize the nonlinear response behavior learned by the surrogate model. In this study, the interpretability analysis was performed using the development dataset constructed from the first five months, whereas the sixth month and the months 7–8 datasets were reserved for out-of-sample validation. Accordingly, the purpose of this section is not to provide direct physicochemical proof of flotation mechanisms, but rather to examine whether the learned response patterns are consistent with engineering expectations and sufficiently informative for subsequent optimization.

3.2.1. Global Feature Importance

The global feature importance ranking based on mean absolute SHAP values is shown in Figure 9. For clean coal ash and clean coal sulfur, raw coal ash and collector dosage exhibit the highest contributions, indicating that feed property and reagent regime are the dominant factors controlling product quality in the present prediction framework. For tailing heat of combustion, collector dosage and frother dosage show the largest SHAP contributions, suggesting that reagent conditions play a more direct role in determining combustible matter loss in tailings [37]. Overall, the SHAP ranking reveals a clear distinction between feed-driven and operation-driven effects: raw coal ash mainly governs the baseline difficulty of separation, whereas collector and frother dosages constitute the primary controllable levers for process adjustment. This result provides an interpretable basis for selecting key decision variables in the subsequent optimization analysis.

3.2.2. Nonlinear Response Analysis

To further investigate the response behavior of the PSO-BP surrogate, partial dependence analysis was conducted for representative controllable variables, and the results are shown in Figure 10. The dependence curves indicate that the relationship between operating parameters and predicted clean coal ash is clearly nonlinear and, for the reagent dosages, non-monotonic. In particular, the collector dosage curve presents an initial mild decline followed by a noticeable upward trend at higher dosage levels, implying that the influence of collector dosage is characterized by a limited favorable region and a subsequent deterioration tendency under excessive addition. The frother dosage curve exhibits a similar nonlinear pattern, although the variation range is smaller [38]. In addition, the pulp solids mass fraction curve shows an overall increasing tendency within the investigated operating interval, suggesting that further increase in pulp solids mass fraction may weaken selectivity under the current process conditions. These results should be interpreted as model-identified nonlinear response intervals or transition regions, rather than as direct proof of physicochemical thresholds. Nevertheless, the overall trends are qualitatively consistent with flotation practice, in which reagent overdosage or excessive aeration may reduce separation selectivity.

3.2.3. Interaction Analysis

Because flotation performance is governed by coupled operating effects, single-variable interpretation alone is insufficient. Therefore, interaction surfaces between collector dosage and frother dosage were further examined, as shown in Figure 11. For clean coal ash, the response surface exhibits a distinct valley-like region, indicating that lower ash values are achieved only within a limited combination of collector and frother dosage rather than through monotonic increase in either variable alone. For tailing heat of combustion, the response surface shows a pronounced gradient and curvature, revealing that combustible matter loss is highly sensitive to coordinated reagent adjustment. These interaction patterns indicate that the flotation system contains evident coupling and trade-off behavior among key operating variables, and they help explain why empirical single-factor tuning often fails to achieve stable global improvement. More importantly, this interaction structure provides the interpretive foundation for the subsequent multi-objective optimization, because it demonstrates that improvement of one flotation objective may be accompanied by deterioration of another unless the operating variables are adjusted in a coordinated manner.

3.3. Multi-Objective Optimization via NSGA-II

To quantitatively characterize the trade-off among flotation objectives, multi-objective optimization was carried out using the NSGA-II algorithm based on the final PSO-BP surrogate model trained on the development dataset collected during the first five months. To preserve the independence of the chronological evaluation protocol, the sixth month and the months 7–8 dataset were reserved exclusively for temporal holdout and later-period validation, respectively, and were therefore not used in the optimization stage. The decision-variable bounds were defined according to the observed industrial operating domain of the development dataset, thereby constraining the search within practically reachable conditions rather than extrapolated regions. The population size, maximum generation number, crossover probability, mutation probability, and mutation step fraction were set to 100, 20, 0.8, 0.05, and 0.1, respectively. These settings provided a stable Pareto solution set in the present case.
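
To make the role of these settings concrete, the sketch below shows one possible form of the bounded initialization and variation operators. It is not the implementation used in this study: the blend crossover, the Gaussian mutation scaled by the step fraction, and the use of the Table 2 minima and maxima of the four controllable inputs as bounds are all assumptions made for illustration, and the sorting and selection stages of NSGA-II are omitted.

```python
import numpy as np

# Settings quoted in the text.
POP_SIZE, N_GEN = 100, 20
P_CROSS, P_MUT, STEP_FRAC = 0.8, 0.05, 0.1

# Assumed bounds: Table 2 min/max of the four controllable inputs
# (collector dosage, frother dosage, pulp solids mass fraction, air flow rate).
LOWER = np.array([0.28, 0.04, 8.01, 180.12])
UPPER = np.array([1.58, 0.26, 16.99, 419.88])

def init_population(rng, n=POP_SIZE):
    """Uniform random population inside the operating bounds."""
    return rng.uniform(LOWER, UPPER, size=(n, LOWER.size))

def crossover(rng, p1, p2):
    """Blend crossover applied with probability P_CROSS (assumed operator)."""
    if rng.random() >= P_CROSS:
        return p1.copy(), p2.copy()
    alpha = rng.random(p1.size)
    return alpha * p1 + (1 - alpha) * p2, alpha * p2 + (1 - alpha) * p1

def mutate(rng, x):
    """Per-variable Gaussian step of STEP_FRAC * range with probability P_MUT, clipped to bounds."""
    mask = rng.random(x.size) < P_MUT
    step = rng.normal(0.0, STEP_FRAC * (UPPER - LOWER))
    return np.clip(x + mask * step, LOWER, UPPER)

rng = np.random.default_rng(1)
population = init_population(rng)
child_a, child_b = crossover(rng, population[0], population[1])
child_a = mutate(rng, child_a)
print(population.shape, child_a)
```

Clipping after mutation is what keeps every candidate inside the observed operating domain, which is the property emphasized in the text.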

3.3.1. Pareto Frontier Characteristics

The final Pareto-optimal set generated by NSGA-II is shown in Figure 12. The non-dominated solutions form a continuous frontier in the three-objective space defined by clean coal ash, clean coal sulfur, and tailing heat of combustion, rather than collapsing to a single isolated optimum. This indicates that the flotation problem is governed by coupled and conflicting objectives, such that no operating condition can simultaneously minimize all three responses. In addition, the Pareto solutions are distributed over a broad objective range, which suggests that the optimization maintained satisfactory diversity while converging toward feasible non-dominated regions. From an engineering perspective, this is important because the optimization does not simply return one mathematically extreme point, but instead provides a candidate set of feasible operating regimes spanning different compromises between concentrate quality and recovery-related performance.

Because all decision variables were bounded by the industrial data range of the first five months, the resulting Pareto solutions remain inside the observed process envelope. This means that the optimization results correspond to operationally reachable candidate conditions rather than unsupported extrapolated optima, which improves their relevance for subsequent engineering screening and decision-making.
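
The notion of a non-dominated solution used here can be illustrated with a short filter. In the sketch below all three objectives are minimized, and the objective vectors in `F_demo` are invented for illustration; they are not results of this study.

```python
import numpy as np

def non_dominated_mask(F):
    """True for rows of F (all objectives minimised) that no other row dominates."""
    F = np.asarray(F, dtype=float)
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        # Row j dominates row i if it is no worse everywhere and strictly better somewhere.
        no_worse = np.all(F <= F[i], axis=1)
        better = np.any(F < F[i], axis=1)
        mask[i] = not np.any(no_worse & better)
    return mask

# Invented objective vectors: (clean coal ash %, clean coal sulfur %, tailing heat kJ/kg).
F_demo = np.array([
    [7.9, 0.47, 710.0],
    [8.4, 0.49, 650.0],
    [9.0, 0.52, 600.0],
    [9.1, 0.53, 660.0],  # dominated by row 1 (worse in all three objectives)
    [8.4, 0.49, 700.0],  # dominated by row 1 (same quality, higher tailing heat)
])
print(non_dominated_mask(F_demo))
```

The first three rows trade lower ash against higher tailing heat and therefore all survive, which mirrors the frontier structure described above.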

3.3.2. Trade-Off Relationships Among Objectives

To further reveal the structure of the Pareto set, two-dimensional projections of the non-dominated solutions are shown in Figure 13. As shown in Figure 13a, clean coal ash and tailing heat of combustion exhibit a clear nonlinear trade-off. When clean coal ash is reduced toward the lower end of the Pareto frontier, tailing heat of combustion rises markedly, indicating that further product-quality improvement is accompanied by increasingly severe combustible matter loss. By contrast, when the ash target is moderately relaxed, the trade-off curve becomes flatter, suggesting the existence of a more balanced operating region in which quality and recovery can be jointly improved to a greater extent. Therefore, under the current process conditions, blindly pursuing extremely low ash is not necessarily favorable from the perspective of resource recovery.

Figure 13b shows the projection between clean coal ash and clean coal sulfur. The Pareto points display an evident co-varying tendency, indicating that ash reduction and sulfur reduction are not fully independent in the present flotation system. In practical terms, this means that operating conditions beneficial for ash control also tend to move sulfur in a consistent direction, whereas the more pronounced conflict is concentrated in the relationship between concentrate quality and tailing heat of combustion. This observation is useful because it clarifies that the central optimization difficulty in the present case lies not in balancing all quality indicators against one another, but in balancing quality improvement against recovery loss.

A one-at-a-time sensitivity analysis of the NSGA-II settings is provided in Figure 14. Within the tested range, the overall Pareto quality remained stable without collapse of the non-dominated search. The optimization results were more sensitive to population size and generation number than to crossover and mutation probabilities, whereas the latter two showed comparatively limited influence around the adopted baseline settings. These observations support the use of the selected NSGA-II parameter regime as a practical balance between search stability and computational cost.
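
Figure 14 reports an approximate hypervolume as the aggregate indicator of Pareto-front quality. How that approximation was computed is not stated here, so the following Monte Carlo estimate is only one plausible sketch; the reference point, the sample count, and the toy two-objective front are assumptions.

```python
import numpy as np

def hypervolume_mc(F, ref, n_samples=20000, seed=0):
    """Monte Carlo estimate of the volume dominated by F (minimisation) up to the point ref."""
    F = np.asarray(F, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ideal = F.min(axis=0)
    rng = np.random.default_rng(seed)
    samples = rng.uniform(ideal, ref, size=(n_samples, F.shape[1]))
    # A sample is dominated if some front point is <= it in every objective.
    dominated = np.any(np.all(F[None, :, :] <= samples[:, None, :], axis=2), axis=1)
    return dominated.mean() * np.prod(ref - ideal)

# Toy two-objective front; the exact dominated area up to (4, 4) is 6.
front = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
hv = hypervolume_mc(front, ref=[4.0, 4.0])
print(round(hv, 2))
```

A larger hypervolume means the front pushes further toward low objective values or spreads more widely, so comparing it across parameter settings gives a single-number view of the sensitivity described above.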

3.4. Entropy-TOPSIS-Based Decision Making

Although NSGA-II provides a set of Pareto-optimal solutions, practical flotation control requires a unique and implementable operating recommendation. Therefore, an Entropy-TOPSIS decision-making strategy was introduced to convert the non-dominated solution set into a ranked candidate list. In this study, the three optimization objectives—clean coal ash, clean coal sulfur, and tailing heat of combustion—were all treated as cost-type indicators, because lower values are preferable for each of them within the present process context. Entropy weighting was used to determine the relative importance of the three objectives according to their dispersion within the Pareto set, thereby avoiding subjective assignment of weights. Subsequently, TOPSIS was employed to calculate the relative closeness of each Pareto solution to the ideal point, and all candidate solutions were ranked accordingly. This procedure establishes a deterministic link between the mathematical Pareto set and the final engineering recommendation.

3.4.1. Objective Weight Determination

The entropy-based objective weights are shown in Figure 15. Under the current Pareto solution set, the weights of clean coal ash, clean coal sulfur, and tailing heat of combustion are approximately 0.4579, 0.3095, and 0.2327, respectively. This result indicates that clean coal ash makes the largest contribution to differentiating candidate Pareto solutions, followed by clean coal sulfur, while tailing heat of combustion shows a lower but still non-negligible contribution. From an engineering perspective, this distribution is reasonable because ash remains the most critical quality indicator in the present flotation task, whereas sulfur serves as an additional product-quality constraint and tailing heat of combustion reflects recovery-related loss. Therefore, the entropy weighting result supports a decision logic in which product quality is prioritized, while recovery is still explicitly retained in the evaluation framework rather than ignored.
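
The entropy weighting step can be sketched as follows. The column-share normalization used below is the common textbook form and is assumed here; the decision matrix `F_demo` is invented, so the printed weights are not expected to match the values reported in Figure 15.

```python
import numpy as np

def entropy_weights(F):
    """Entropy weights for the columns of a positive decision matrix F (alternatives x criteria)."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    P = F / F.sum(axis=0)                          # column-wise share of each alternative
    # Treat 0 * log(0) as 0 so that zero entries do not produce NaN.
    logP = np.log(P, where=P > 0, out=np.zeros_like(P))
    entropy = -(P * logP).sum(axis=0) / np.log(n)
    divergence = 1.0 - entropy                     # more dispersion -> larger weight
    return divergence / divergence.sum()

# Invented candidates: (clean coal ash %, clean coal sulfur %, tailing heat kJ/kg).
F_demo = np.array([
    [7.9, 0.47, 710.0],
    [8.4, 0.49, 650.0],
    [9.0, 0.52, 600.0],
])
weights = entropy_weights(F_demo)
print(weights.round(4))
```

An objective whose values barely differ across the candidate set receives a weight close to zero, which is why the weights reflect how strongly each objective differentiates the Pareto solutions rather than any externally assigned priority.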

3.4.2. Composite Ranking of Pareto Solutions

Based on the derived entropy weights, a weighted normalized decision matrix was constructed and the TOPSIS relative closeness coefficient was calculated for each Pareto solution. The ranking results for the top candidate solutions are shown in Figure 16. The distribution indicates that the best solutions are not isolated numerical outliers, but belong to a relatively concentrated high-benefit region in the ranked solution space. Under the current Pareto set and decision model, the top-ranked candidate corresponds to Solution 50 with a relative closeness of approximately 0.8087. The predicted outputs of this candidate are about 7.8724% clean coal ash, 0.4724% clean coal sulfur, and 708.01 kJ kg⁻¹ tailing heat of combustion. This result suggests that the selected solution does not minimize any single objective absolutely, but instead achieves a more balanced compromise among product quality, sulfur control, and combustible matter loss than the surrounding alternatives.
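
A compact sketch of the relative closeness calculation is given below for the case where all criteria are cost-type, as in this study. Vector normalization is assumed, the candidate rows are invented, and only the weights are taken from the text, so the printed ranking is illustrative and does not reproduce Figure 16.

```python
import numpy as np

def topsis_closeness(F, weights):
    """Relative closeness of each row of F when every criterion is cost-type (lower is better)."""
    F = np.asarray(F, dtype=float)
    w = np.asarray(weights, dtype=float)
    V = w * F / np.linalg.norm(F, axis=0)       # vector-normalised, weighted matrix
    ideal, anti = V.min(axis=0), V.max(axis=0)  # cost criteria: best = min, worst = max
    d_pos = np.linalg.norm(V - ideal, axis=1)   # distance to the ideal point
    d_neg = np.linalg.norm(V - anti, axis=1)    # distance to the anti-ideal point
    return d_neg / (d_pos + d_neg)

# Entropy weights reported in the text (ash, sulfur, tailing heat).
w_paper = [0.4579, 0.3095, 0.2327]

# Invented candidates: (clean coal ash %, clean coal sulfur %, tailing heat kJ/kg).
F_demo = np.array([
    [7.9, 0.47, 710.0],
    [8.4, 0.49, 650.0],
    [9.0, 0.52, 600.0],
])
closeness = topsis_closeness(F_demo, w_paper)
print(closeness.round(4), int(closeness.argmax()))
```

A closeness of 1 would mean a candidate coincides with the ideal point in every objective, which cannot occur on a genuine trade-off front; values below 1, such as the 0.8087 reported above, are therefore expected.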

3.4.3. Engineering Interpretation of the Recommended Solution

The role of Entropy-TOPSIS in the present framework is not to replace the Pareto frontier, but to provide a rational screening mechanism within it. NSGA-II defines the feasible compromise space, whereas Entropy-TOPSIS identifies the solution that is most balanced under the adopted multi-criteria decision logic. In this sense, the final recommended operating condition should be understood as a decision-supported engineering setpoint rather than as an absolute global optimum. This distinction is important because flotation control in industrial practice is inherently conditional on current production priorities. Under the weighting pattern obtained in this study, the selected solution reflects a “quality-priority with explicit recovery consideration” strategy, which is more practical than selecting a mathematically extreme point based on a single indicator alone. Accordingly, the Entropy-TOPSIS stage provides the final bridge from surrogate-based optimization to actionable operating recommendations.

3.5. Engineering Validation

3.5.1. Geometric Verification

To further examine the engineering rationality of the decision result, the top-ranked candidate identified by the Entropy-TOPSIS model was mapped back onto the Pareto-optimal frontier, as shown in Figure 17. The selected solution is in the low-objective region of the Pareto set and remains close to the curved knee-like zone rather than at an isolated extreme endpoint. This geometric position indicates that the recommendation is not simply the mathematical minimizer of one single objective, but a compromise solution balancing product quality and recovery-related loss within the non-dominated solution space. In this sense, the spatial consistency between the TOPSIS ranking result and the Pareto frontier structure supports the engineering rationality of the selected operating condition.

3.5.2. Historical DCS Benchmarking

To assess the practical significance of the recommended operating condition, it was benchmarked against the historical DCS average under conventional empirical control, as shown in Figure 18. At the current stage, this comparison should be regarded as an offline engineering benchmark rather than as a completed prospective field trial. Under historical operation, the tailing heat of combustion remained at approximately 780 kJ kg⁻¹, reflecting relatively conservative operating practice and incomplete recovery of combustible matter. In contrast, the recommended setpoint obtained from the integrated PSO-BP–NSGA-II–Entropy-TOPSIS framework yielded a substantially lower predicted tailing heat of combustion while maintaining clean coal ash within the target quality range. This result indicates that the proposed strategy has the potential to reduce combustible matter loss without sacrificing concentrate quality.

Because the current evidence is based on surrogate-assisted benchmarking rather than direct on-site intervention, the above improvement should be interpreted as a quantitatively supported engineering potential rather than as a completed industrial implementation outcome. Nevertheless, the comparison demonstrates that the proposed framework can convert a mathematical Pareto set into an operationally meaningful DCS recommendation, which is precisely the gap that conventional multi-objective optimization studies often fail to bridge.
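
For orientation, the size of the predicted reduction can be estimated from the two figures quoted in the text, namely the historical level of approximately 780 kJ kg⁻¹ and the predicted 708.01 kJ kg⁻¹ of the top-ranked candidate. Because the historical value is approximate and the recommended value is a surrogate prediction, the result below is a rough indication only.

```latex
\Delta e_{\mathrm{tail}} \approx 780 - 708.01 = 71.99\ \mathrm{kJ\,kg^{-1}},
\qquad
\frac{\Delta e_{\mathrm{tail}}}{780} \approx 0.092 \quad (\text{about } 9.2\%)
```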

3.5.3. Practical Implications and Limitations

Taken together, the geometric verification and the historical DCS benchmark comparison indicate that the recommended operating condition is not only mathematically favorable, but also practically meaningful from the perspective of quality–recovery coordination. More importantly, this section shows how the proposed framework translates a sequence of modeling, interpretation, optimization, and decision-making steps into an engineering recommendation that can potentially be used for DCS-assisted process regulation.

At the same time, the present result still has a clear limitation: it is based on offline benchmarking rather than prospective plant validation. Therefore, the next step should be to conduct controlled on-site implementation of the recommended setpoint, monitor the resulting changes in clean coal ash, sulfur, and tailing heat of combustion over a defined operating window, and compare them against the historical DCS baseline under matched feed conditions. Such prospective validation would provide a stronger basis for assessing the real industrial effectiveness of the proposed strategy.

In addition, the robustness of the Entropy-TOPSIS decision result was examined by perturbing the entropy-derived weights. As shown in Figure 19, the top-ranked recommendation remained concentrated within a highly stable candidate region under moderate weight perturbation, indicating that the final decision was not excessively sensitive to small changes in objective weighting. This result improves the credibility of the recommended operating condition under realistic multi-criteria uncertainty.
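
One way to carry out such a perturbation test is sketched below. The perturbation scheme (independent uniform changes of up to ±10% per weight, followed by renormalization), the number of trials, and the candidate matrix are assumptions for illustration; the scheme behind Figure 19 may differ.

```python
import numpy as np

def topsis_closeness(F, weights):
    """Relative closeness for cost-type criteria (lower is better)."""
    F = np.asarray(F, dtype=float)
    V = np.asarray(weights, dtype=float) * F / np.linalg.norm(F, axis=0)
    d_pos = np.linalg.norm(V - V.min(axis=0), axis=1)
    d_neg = np.linalg.norm(V - V.max(axis=0), axis=1)
    return d_neg / (d_pos + d_neg)

def top_rank_frequency(F, base_weights, rel_noise=0.1, n_trials=1000, seed=0):
    """Share of perturbed-weight trials in which each alternative ranks first."""
    rng = np.random.default_rng(seed)
    base = np.asarray(base_weights, dtype=float)
    counts = np.zeros(len(F), dtype=int)
    for _ in range(n_trials):
        w = base * (1.0 + rng.uniform(-rel_noise, rel_noise, size=base.size))
        w /= w.sum()                                  # renormalise so the weights sum to 1
        counts[np.argmax(topsis_closeness(F, w))] += 1
    return counts / n_trials

# Invented candidates: (clean coal ash %, clean coal sulfur %, tailing heat kJ/kg).
F_demo = np.array([
    [7.9, 0.47, 710.0],
    [8.4, 0.49, 650.0],
    [9.0, 0.52, 600.0],
])
freq = top_rank_frequency(F_demo, [0.4579, 0.3095, 0.2327])
print(freq)
```

If one candidate, or a small group of neighboring candidates, collects nearly all first-place counts, the recommendation can be regarded as insensitive to moderate weight changes in the sense described above.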

4. Conclusions

This study developed an interpretable data-driven framework for coking coal flotation by integrating PSO-BP prediction, SHAP-based interpretation, NSGA-II multi-objective optimization, and Entropy-TOPSIS decision-making.

The results show that the PSO-BP model consistently outperformed the conventional BP model under both temporal holdout testing and later-period industrial validation, indicating improved predictive reliability and temporal generalization under the adopted chronological evaluation protocol. SHAP-based analysis further enhanced model transparency by identifying the dominant variables, nonlinear response regions, and coupled operating effects, thereby providing an interpretable basis for subsequent optimization without overclaiming direct physicochemical verification.

Based on the final PSO-BP surrogate, NSGA-II successfully quantified the trade-off among clean coal ash, clean coal sulfur, and tailing heat of combustion within the observed industrial operating domain, while Entropy-TOPSIS converted the Pareto-optimal solution set into a practically selectable operating recommendation. The additional sensitivity and robustness analyses further indicated acceptable stability of both the optimization process and the final decision result.

Compared with more complex deep-learning strategies, the present framework is more compact and interpretable for structured industrial process variables and limited plant datasets, and it shows practical potential for generating data-driven operating recommendations under the current production conditions. However, its effectiveness still depends on the representativeness of the available DCS dataset and has not yet been verified through prospective on-site implementation. Future work should therefore focus on controlled plant validation under matched feed conditions to further assess the industrial effectiveness of the proposed framework.

Author Contributions

Y.W.: Conceptualization, Methodology, Writing—Original Draft. D.C.: Software; Validation; Formal analysis; Data curation; Visualization; Writing—review & editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ren, S.; Jiao, X.; Zheng, D.; Zhang, Y.; Xie, H.; Zhang, R. Impact of carbon neutrality goals on China’s coal industry: Mechanisms and evidence. Energies 2025, 18, 1672.
2. Chandran, G.; Ashish, S.; Selvaraj, T. Selection criteria for mine tailings as SCM: A comprehensive review of types, properties and performance. Miner. Eng. 2026, 235, 109822.
3. McCoy, J.T.; Auret, L. Machine learning applications in minerals processing: A review. Miner. Eng. 2019, 132, 95–109.
4. Lartey, C.; Asamoah, R.K.; Greet, C.; Zanin, M.; Liu, J. An interpretable and generalised machine learning model for predicting flotation performance. Miner. Eng. 2025, 232, 109492.
5. Tan, J.; Liang, L.; Peng, Y.; Xie, G. Challenges of using froth features to predict clean coal ash content in coal flotation. Int. J. Coal Prep. Util. 2022, 42, 1991–2027.
6. Kong, L.; Bai, J.; Li, H.; Chen, X.; Wang, J.; Bai, Z.; Guo, Z.; Li, W. The mineral evolution during coal washing and its effect on ash fusion characteristics of Shanxi high ash coals. Fuel 2018, 212, 268–273.
7. Ali, D.; Hayat, M.B.; Alagha, L.; Molatlhegi, O.K. An evaluation of machine learning and artificial intelligence models for predicting the flotation behavior of fine high-ash coal. Adv. Powder Technol. 2018, 29, 3493–3506.
8. Zhang, W.; Ping, A.; Peng, Y.; Bilal, M.; Hassan, F.U.; Ni, C.; Dai, Z. Exploration of innovative pathways for the preparation of ultra-clean coal from low-rank coal: A study on multi-stage flotation processes and mechanisms based on reagent synergy. Int. J. Coal Prep. Util. 2026, 46, 423–450.
9. Ji, Y.; Wang, L.; Ren, G.; Xu, T.; Li, X.; Chen, X.; Bu, X.; Sha, J. Evaluation of parametric effects of coal flotation based on boosting modeling method. Physicochem. Probl. Miner. Process. 2024, 60, 196385.
10. Melo, F.; Laskowski, J. Fundamental properties of flotation frothers and their effect on flotation. Miner. Eng. 2006, 19, 766–773.
11. Cho, Y.-S.; Laskowski, J. Effect of flotation frothers on bubble size and foam stability. Int. J. Miner. Process. 2002, 64, 69–80.
12. Li, J.; Zhang, D.; Xia, Y.; Avid, B.; Long, L.; Ping, Y.; Piao, Y.; Xing, Y.; Gui, X. Machine learning-based analysis and prediction of coal flotation process parameters mechanisms. Int. J. Coal Prep. Util. 2025, 1–19.
13. Pawliszak, P.; Bradshaw-Hajek, B.H.; Skinner, W.; Beattie, D.A.; Krasowska, M. Frothers in flotation: A review of performance and function in the context of chemical classification. Miner. Eng. 2024, 207, 108567.
14. Nakhaei, F.; Irannajad, M. Application and comparison of RNN, RBFNN and MNLR approaches on prediction of flotation column performance. Int. J. Min. Sci. Technol. 2015, 25, 983–990.
15. Pu, Y.; Szmigiel, A.; Chen, J.; Apel, D.B. FlotationNet: A hierarchical deep learning network for froth flotation recovery prediction. Powder Technol. 2020, 375, 317–326.
16. Li, Y.; Liu, H.; Lu, F. Research on prediction of ash content in flotation-recovered clean coal based on NRBO-CNN-LSTM. Minerals 2024, 14, 894.
17. Liao, L.; Huang, X.; Zhang, H.; Shang, H.; Cao, Z.; Zhang, J. FEPNet: A feature extraction-prediction network for coal flotation froth image segmentation. Signal Image Video Process. 2025, 19, 549.
18. Wen, Z.; Zhou, C.; Pan, J.; Nie, T.; Zhou, C.; Lu, Z. Deep learning-based ash content prediction of coal flotation concentrate using convolutional neural network. Miner. Eng. 2021, 174, 107251.
19. Liu, Z.; Li, L.; Zeng, J.; Wang, Y.; Yang, J.; Liu, X. Predicting the ash content in coal flotation concentrate based on convolutional neural network. Int. J. Coal Prep. Util. 2024, 44, 2080–2096.
20. Yang, X.; Zhang, K.; Thé, J.; Tan, Z.; Yu, H. Multi-scale neural network for accurate determination of the ash content of coal flotation concentrate using froth images. Expert Syst. Appl. 2025, 262, 125614.
21. Liu, Q.; Wang, L.; Xing, Y.; Han, Y.; Liu, J.; Dai, S.; Yu, G.; Gui, X. Prediction of ash content in coal slime flotation based on CNN-BP method with residual estimation. Int. J. Coal Prep. Util. 2025, 45, 97–111.
22. Lu, F.; Liu, H.; Lv, W. Deep correlation and precise prediction between static features of froth images and clean coal ash content in coal flotation: An investigation based on deep learning and maximum likelihood estimation. Measurement 2024, 224, 113843.
23. Yang, T.; Zhang, Z.; Chen, P.; Gui, D.; Li, Y. Study on the prediction of cleaning coal ash content based on principal component analysis and machine learning. Int. J. Coal Prep. Util. 2025, 1–17.
24. Shahbazi, B.; Chelgani, S.C.; Matin, S. Prediction of froth flotation responses based on various conditioning parameters by Random Forest method. Colloids Surf. A Physicochem. Eng. Asp. 2017, 529, 936–941.
25. Zhang, W.; Yuan, Q.; Jia, S.; Li, Z.; Yin, X. Multi-objective optimization of forth flotation process: An application in gold ore. Sustainability 2021, 13, 8314.
26. Liang, Y.; He, D.; Wang, Q.; Lu, X. Fuzzy distributional chance-constrained programming for handling stochastic and epistemic uncertainties during flotation processes. Chem. Eng. Res. Des. 2020, 164, 248–260.
27. Fan, Y.; Lv, Z.; Chen, S.; Cui, Y.; Wu, Y.; Zhao, X.; Xu, Z.; Wang, W. Condition recognition based on multi-source heterogeneous data and residual temporal network in coal flotation process. Meas. Sci. Technol. 2025, 36, 036002.
28. Wang, Z.; Zhang, X.; He, D. Dynamic global feature extraction and importance-correlation selection for the prediction of concentrate copper grade and recovery rate. Can. J. Chem. Eng. 2023, 101, 2598–2610.
29. Cao, W.; Wang, R.; Fan, M.; Fu, X.; Wang, Y.; Guo, Z.; Fan, F. Froth image clustering with feature semi-supervision through selection and label information. Int. J. Mach. Learn. Cybern. 2021, 12, 2499–2516.
30. Bai, L.; Song, W.; Bu, X. Evaluation of input parameters of feed properties and operation variables in coal flotation using machine learning model and SHAP analysis. Int. J. Coal Prep. Util. 2025, 1–12.
31. Zhang, H.; Niu, F.; Zhang, J.; Yu, X. Prediction of three-dimensional fractal dimension of hematite flocs based on particle swarm optimization optimized back propagation neural network. Min. Metall. Explor. 2022, 39, 2503–2515.
32. Zhou, X.; Li, M.; Du, Y.; Yang, C.; Wen, S. Interpretable multiobjective feature selection via visualization in froth flotation process. IEEE Trans. Ind. Inform. 2024, 21, 2530–2539.
33. Yu, G.; Chai, T.; Luo, X. Multiobjective production planning optimization using hybrid evolutionary algorithms for mineral processing. IEEE Trans. Evol. Comput. 2011, 15, 487–514.
34. Wang, L.; Liu, J.; Sun, Z.; Wang, H.; Nan, J.; Gui, X.; Dai, W. Intelligent real-time ash content detection for coal flotation concentrate using multi-source data fusion. Fuel 2026, 406, 137132.
35. Szmigiel, A.; Apel, D.B.; Skrzypkowski, K.; Wojtecki, L.; Pu, Y. Advancements in machine learning for optimal performance in flotation processes: A review. Minerals 2024, 14, 331.
36. Tang, J.; Hou, B.; Li, Z.; Liu, J.; Wang, Z.; Shu, J.; Ren, B.; Wang, C.; Deng, R.; Kuang, Y. Adsorption-catalytic synergistic Fenton degradation of potassium butyl xanthate in flotation tailing wastewater by renewable iron-loaded sludge: Performance, kinetics and mechanism. Sep. Purif. Technol. 2025, 359, 130533.
37. Nazari, S.; Hassanzadeh, A.; He, Y.; Khoshdast, H.; Kowalczuk, P.B. Recent developments in generation, detection and application of nanobubbles in flotation. Minerals 2022, 12, 462.
38. Chelgani, S.C.; Shahbazi, B.; Hadavandi, E. Support vector regression modeling of coal flotation based on variable importance measurements by mutual information method. Measurement 2018, 114, 102–108.

Figure 1. Schematic diagram of the flotation principle and process, indicating the main manipulated variables, the feed-related state variable (raw coal ash), and representative hydrodynamic descriptors including bubble diameter and slurry residence time.

Figure 2. Flowchart of PSO optimizing BP.

Figure 3. PSO-BP network architecture and topology-sensitivity analysis. The left panel shows the adopted 5-11-3 neural-network structure, and the right panel shows the sensitivity analysis of hidden-layer neurons. Based on the joint evaluation of root mean square error (RMSE), standard deviation, and training time, 11 hidden neurons were selected for subsequent modeling.

Figure 4. Flowchart of NSGA-II optimization.

Figure 5. Comparison between observed values and model predictions on the Month 6 temporal holdout set: (a) clean coal ash, (b) clean coal sulfur, and (c) tailing heat of combustion.

Figure 6. Residual comparison between PSO-BP and BP on the Month 6 temporal holdout set: (a) clean coal ash, (b) clean coal sulfur, and (c) tailing heat of combustion.

Figure 7. Comparison between observed values and model predictions on the independent later-period validation set collected during months 7–8: (a) clean coal ash, (b) clean coal sulfur, and (c) tailing heat of combustion.

Figure 8. Residual comparison between PSO-BP and BP on the independent later-period validation set collected during months 7–8: (a) clean coal ash, (b) clean coal sulfur, and (c) tailing heat of combustion.

Figure 9. Global feature importance ranking based on mean absolute SHAP values for the PSO-BP model: (a) clean coal ash, (b) clean coal sulfur, and (c) tailing heat of combustion.

Figure 10. Partial dependence plots showing the model-estimated marginal response of clean coal ash to key controllable variables: (a) collector dosage, (b) frother dosage, and (c) pulp solids mass fraction.

Figure 11. Interaction surfaces showing the coupled effects of collector dosage and frother dosage on (a) clean coal ash and (b) tailing heat of combustion.

Figure 12. Pareto-optimal frontier in the three-objective space of clean coal ash, clean coal sulfur, and tailing heat of combustion, showing the continuous non-dominated solution set generated by NSGA-II.

Figure 13. Two-dimensional projection analysis of the Pareto frontier: (a) relationship between clean coal ash and tailing heat of combustion, showing the trade-off between product quality and combustible matter loss; (b) relationship between clean coal ash and clean coal sulfur, indicating their co-varying tendency under the present flotation conditions.

Figure 14. One-at-a-time sensitivity analysis of the NSGA-II parameter settings: (a) population size, (b) maximum generation number, (c) crossover probability, and (d) mutation probability. The vertical bars indicate the variation range across repeated runs, and the approximate hypervolume is used as an aggregate indicator of Pareto-front quality.

Figure 15. Entropy-based objective weights of clean coal ash, clean coal sulfur, and tailing heat of combustion calculated from the Pareto-optimal solution set.

Figure 16. TOPSIS ranking of the leading Pareto-optimal candidates based on the entropy-weighted relative closeness coefficient. The highlighted bar denotes the top-ranked solution selected as the recommended operating condition.

Figure 17. Geometric verification of the selected operating condition.

Figure 18. Engineering benchmark comparison between the recommended operating condition and historical DCS operation: (a) selected quality indicators relative to the target quality range; (b) comparison of tailing heat of combustion, showing the predicted reduction in combustible matter loss under the recommended setpoint.

Figure 19. Robustness analysis of the Entropy-TOPSIS decision result under entropy-weight perturbation. The distribution shows that the top-ranked recommendation remained concentrated within a stable candidate region under repeated perturbations of the objective weights.

Table 1. Variable system and operational constraints.

Category	Type	Variable	Unit	Selection Criteria
Input	Independent variable	collector dosage	g kg⁻¹	Essential for hydrophobicity. Low dosage reduces yield; excess increases cost and slime.
		frother dosage	g kg⁻¹	Controls bubble size and foam stability.
		pulp solids mass fraction	%	Affects collision efficiency and viscosity.
		air volumetric flow rate	L h⁻¹	Influences bubble surface area and fluid dynamics.
	State variable	raw coal ash mass fraction	%	Key feed-forward variable. Major source of process fluctuation.
Output		clean coal ash mass fraction	%	Core quality indicator for product grade.
		clean coal sulfur mass fraction	%	Key for coke quality and environmental compliance.
		tailing heat of combustion	kJ kg⁻¹	Indicates combustible material loss in tailings.

Table 2. Statistical description of the dataset used for modeling.

Variable Category	Variable Name	Symbol	Min	Max	Mean	Std. Dev.
Inputs	collector dosage	$D_{c}$	0.28	1.58	0.93	0.2
	frother dosage	$D_{f}$	0.04	0.26	0.15	0.03
	pulp solids mass fraction	$w_{S,\mathrm{pulp}}$	8.01	16.99	12.51	1.25
	air volumetric flow rate	$q_{V}$	180.12	419.88	300.24	34.62
	raw coal ash mass fraction	$w_{A,\mathrm{raw}}$	16.03	29.98	23.01	2.01
Outputs	clean coal ash mass fraction	$w_{A,\mathrm{clean}}$	7.74	11.1	9.38	0.82
	clean coal sulfur mass fraction	$w_{S,\mathrm{clean}}$	0.4	0.76	0.53	0.05
	tailing heat of combustion	$e_{\mathrm{tail}}$	512.35	1 005.2	625.4	128.65

Table 3. Month 6 repeated results (MAE: mean absolute error; RMSE: root mean square error; $R^{2}$: coefficient of determination).

Output	Model	MAE_Mean	MAE_Std	RMSE_Mean	RMSE_Std	$R^{2}$_Mean	$R^{2}$_Std
Output 1	PSO-BP	0.000 474 260	0.000 013 803	0.008 585 148	0.000 207 235	0.981 115 788	0.000 916 568
Output 1	BP	0.001 921 633	0.000 276 847	0.033 322 195	0.004 300 635	0.711 394 120	0.075 177 226
Output 2	PSO-BP	0.000 097 025	0.000 002 820	0.001 754 287	0.000 062 930	0.789 747 596	0.015 075 175
Output 2	BP	0.000 152 248	0.000 018 844	0.002 660 838	0.000 294 748	0.511 523 634	0.109 931 004
Output 3	PSO-BP	0.067 529 587	0.002 239 208	1.244 302 549	0.041 141 024	0.988 214 031	0.000 785 947
Output 3	BP	0.341 528 579	0.096 839 655	5.817 046 349	1.524 352 480	0.726 765 752	0.145 879 074

Table 4. Months 7–8 repeated results.

Output	Model	MAE_Mean	MAE_Std	RMSE_Mean	RMSE_Std	$R^{2}$_Mean	$R^{2}$_Std
Output 1	PSO-BP	0.000 767 093	0.000 012 708	0.013 448 062	0.000 197 033	0.959 661 750	0.001 176 528
Output 1	BP	0.002 048 054	0.000 310 522	0.035 418 499	0.004 643 976	0.715 918 941	0.073 841 187
Output 2	PSO-BP	0.000 191 802	0.000 007 953	0.003 444 482	0.000 141 304	0.555 324 661	0.036 758 415
Output 2	BP	0.000 246 027	0.000 013 198	0.004 273 455	0.000 170 932	0.315 582 351	0.054 673 178
Output 3	PSO-BP	0.108 747 848	0.003 686 913	1.946 917 250	0.072 472 469	0.974 273 898	0.001 926 444
Output 3	BP	0.400 538 256	0.086 632 958	6.705 056 799	1.300 978 729	0.684 925 016	0.125 834 089

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

