Article

Intelligent Diagnosis Method for Constrained Primary Frequency Regulation Capacity of Coal-Fired Units Based on ISO-MLRF

1
School of Energy, Power and Mechanical Engineering, North China Electric Power University, Beijing 102206, China
2
State Grid Zhejiang Electric Power Research Institute, Hangzhou 310014, China
3
State Grid Zhejiang Electric Power Co., Ltd., Hangzhou 310007, China
4
School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
5
E.Energy Technology Co., Ltd., Hangzhou 310018, China
*
Authors to whom correspondence should be addressed.
Processes 2026, 14(10), 1658; https://doi.org/10.3390/pr14101658
Submission received: 21 April 2026 / Revised: 12 May 2026 / Accepted: 15 May 2026 / Published: 20 May 2026
(This article belongs to the Special Issue Design and Optimization of Heat Engines and Thermal Power Plants)

Abstract

To address the low diagnostic accuracy for constrained primary frequency regulation (PFR) capacity of coal-fired units, which stems from complex and strongly coupled restricting factors, an intelligent diagnosis method based on an improved snake optimizer-based multi-label random forest (ISO-MLRF) classification algorithm is proposed. By analyzing the factors restricting PFR capability, a set of characterization parameters and constraint factors for unit regulating capacity is established. The snake optimizer is enhanced by introducing dynamic update mechanisms and novel search strategies to improve its convergence speed and accuracy. The improved algorithm is then applied to optimize the hyperparameters of the multi-label random forest algorithm, enabling online diagnosis of PFR capacity limitations. Simulation results demonstrate that the proposed algorithm exhibits superior convergence performance, with lower medians of false alarm rate and missed alarm rate across all labels, coupled with reduced result dispersion compared to alternative algorithms. Tests on real operational data show an average false alarm rate of 0.029% and an average missed alarm rate of 0.053 for all labels. The results indicate that the proposed method is feasible and effective, enabling accurate online diagnosis of constrained PFR capacity of coal-fired units.

1. Introduction

The increasing integration of high proportions of renewable energy and the widespread application of power electronic devices have led to reduced system inertia, weakened disturbance resistance, and accelerated frequency change rates, making the maintenance of system frequency stability an unprecedented challenge [1]. In grid frequency regulation tasks, PFR provides an automatic and rapid response to frequency fluctuations caused by load changes. It is the most fundamental and fastest automatic mechanism for maintaining real-time power balance and frequency stability in power systems. As the proportion of renewable energy sources further increases in the new electricity system, coal-fired power generation units are transitioning from the main power source to supporting and regulating sources, and undertaking grid frequency regulation tasks more frequently [2].
In actual operation, due to complex factors such as boiler inertia, regulation system performance, operating conditions, and equipment health status, many units cannot meet grid requirements for PFR in terms of response time, droop coefficient, actual action integral electricity, and other regulation capabilities, thereby affecting grid frequency stability [1,3]. Additionally, units with constrained regulation capabilities are subject to assessments under the power grid’s “Two Detailed Rules,” leading to economic losses for power plants.
Currently, there is limited research on comprehensive diagnostic analysis of the constrained regulation capabilities of coal-fired units in PFR. Existing studies mainly focus on modeling the PFR process for coal-fired units and analyzing some constraining factors. In terms of mechanistic models, ref. [4] analyzes the impact of main steam pressure fluctuations on PFR, incorporating a boiler dynamic model that includes heat exchange processes into a generic unit PFR model, thereby improving model accuracy. Reference [5] establishes a mathematical model for extraction condensing steam turbines applicable to various PFR technologies and operating condition changes, analyzing the effects of different frequency regulation technologies and condition variations on unit PFR capabilities. Reference [6] examines the limiting mechanisms and influencing factors of unit PFR capabilities from three aspects (main steam valve opening, main steam pressure, and power limits) and builds an assessment model for PFR capabilities of deep peak-shaving units. Reference [7] establishes a unit PFR capability assessment model based on system identification technology, analyzing the decoupling of main steam pressure from unit power response. Reference [8] studies how boiler heat storage coefficients and main steam pressure influence unit PFR capabilities, establishing a PFR mechanistic model based on observer PID. In terms of data-driven models, ref. [9] develops a data-driven model for predicting PFR capabilities, with inputs including load commands, actual power, main steam pressure, valve opening, and rotational speed, aiding in the analysis of limiting factors for unit PFR capabilities. Reference [10] creates a fully data-driven model for online estimation of PFR capabilities in deep peak-shaving thermal power units using an LSTM neural network, enabling the assessment of unit PFR capabilities.
While the aforementioned models for coal-fired unit PFR can identify deviations in unit PFR capabilities and analyze some limiting factors, they cannot achieve a comprehensive diagnosis of constrained PFR capabilities.
Thermal power units are highly complex nonlinear systems with strongly coupled dynamic variables, making it difficult to comprehensively analyze the reasons for constrained PFR capabilities through mechanistic modeling. Compared to mechanistic modeling approaches, data-driven methods can achieve comprehensive analysis of unit regulation capability constraints without prior knowledge [11]. Due to the strong correlations among unit equipment and the multiple, highly coupled causes of constrained PFR capabilities, each reason for constrained regulation can be treated as a label. Thus, the problem of diagnosing constrained regulation capabilities in coal-fired units can be transformed into a multi-label classification problem. Currently, methods for solving multi-label classification problems are mainly divided into two categories: problem transformation and algorithm adaptation. Mainstream algorithms in problem transformation include binary relevance, classifier chains [12], and label powerset [13]. Algorithm adaptation methods mainly include K-nearest neighbors (KNN) [14], random forests (RF) [15], gradient boosting, support vector machines (SVM), neural networks, etc. Among these, random forest is an ensemble classification model based on decision trees, with few hyperparameters and no need for extensive parameter tuning. Therefore, random forest is adopted as the diagnostic classification algorithm in this paper. Random forest is an ensemble of decision trees, and the number of decision trees, their depth, and other hyperparameters affect the final classification diagnostic results. Optimizing the hyperparameters of a random forest can improve the accuracy of classification diagnosis. Common algorithms for hyperparameter optimization include particle swarm optimization (PSO) [16], gray wolf optimizer (GWO) [17], genetic algorithm (GA) [18], snake optimizer (SO) [19], and Bayesian algorithm (BYS) [20].
To benchmark performance against state-of-the-art advancements, the snake optimizer (SO) proposed by Hussien [21] is incorporated. This selection is predicated on SO’s distinctive dynamic exploration-exploitation mechanism, which epitomizes the latest evolutionary trends in swarm intelligence. However, despite its efficacy, SO often faces challenges in convergence speed and stability when tackling high-dimensional feature selection for diagnostic tasks. Therefore, to address these limitations and further enhance model transparency, this study proposes a novel hybrid framework, ISOMLRF. By integrating the strengths of the improved snake optimizer with random forest, our method aims to achieve a superior balance between diagnostic accuracy and computational efficiency.
In summary, high renewable integration intensifies the demand for precise PFR, yet coal-fired units often fail to meet grid codes due to complex, coupled constraints. Existing diagnostic approaches face limitations: mechanistic models struggle with system nonlinearity, while current data-driven methods lack comprehensive multi-factor analysis. To bridge this gap, this study reframes PFR diagnosis as a multi-label classification task. The main contributions of this paper are as follows:
(1) A comprehensive set of parameters and constraint factors is systematically analyzed and established to address the strong coupling effects in coal-fired units.
(2) An improved snake optimizer featuring dynamic update mechanisms and novel search strategies is proposed to optimize the hyperparameters of a multi-label random forest, enhancing convergence speed and diagnostic accuracy.
(3) The proposed method is validated using real-world operational data, demonstrating a low false alarm rate (FAR) and missed alarm rate (MAR) and providing online diagnosis for constrained PFR capacity.
The remainder of this paper is organized as follows. Section 2 presents the mechanism analysis of constrained PFR capacity. Section 3 provides a detailed description of the proposed fault diagnosis method. Section 4 presents the simulation analysis of the proposed method. Section 5 concludes the paper and presents the limitations of the proposed study.

2. Analysis of Constrained PFR Capabilities

2.1. The Constrained Factors of PFR Capabilities

Based on refs. [22,23,24] and on-site expert experience, the main reasons for the constrained PFR capability of coal-fired units are illustrated in Table 1.
The reasons can be summarized into the following categories: operational condition limitations, boiler heat storage capacity limitations, poor governor system performance, improper control system parameter settings, limitations of auxiliary equipment and supporting systems (for units adopting new PFR performance enhancement technologies), equipment aging issues, and others.

2.2. Characterization Parameters Selection of PFR Capability Diagnosis

The PFR capability of a coal-fired unit primarily encompasses three aspects: speed, capacity, and stability [25,26]. Speed refers to the unit’s responsiveness to grid frequency changes, characterized by parameters such as response delay time (Td), time to reach 75% of the target load response (T75), and time to reach 90% of the target load response (T90). Capacity refers to the regulation capacity during the unit’s PFR process, characterized by parameters including: PFR load response (ΔP), speed deadband (i), and speed governing droop (Kp). Stability is primarily characterized by parameters such as settling time (Ts), contribution energy (W), and overshoot (σ). To meet the needs of diagnosing constrained frequency regulation capability, operational state characterization parameters during the PFR process are also introduced here, including valve actuation rate, main steam pressure change rate, and intermediate point temperature change rate. The main characterization parameters for a unit’s PFR capability are shown in Table 2.
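As an illustration of the speed and stability parameters described above, the following minimal sketch computes T75, T90, and overshoot from a sampled power curve. The function names, the use of the first sample as the pre-event power, and the definition of overshoot relative to the target load response are assumptions made for this example and are not taken from the paper.

```python
def time_to_fraction(power, target_delta, fraction, dt=1.0):
    """First time (s) at which the load response reaches `fraction` of the target.

    power        : sampled unit power (MW); power[0] is taken as the pre-event value (assumption)
    target_delta : target PFR load response (MW), signed
    fraction     : e.g. 0.75 for T75 or 0.90 for T90
    dt           : sampling interval in seconds
    """
    p0 = power[0]
    sign = 1.0 if target_delta >= 0 else -1.0
    threshold = fraction * abs(target_delta)
    for k, p in enumerate(power):
        if sign * (p - p0) >= threshold:
            return k * dt
    return None  # the response never reached the requested fraction


def overshoot(power, target_delta):
    """Relative overshoot beyond the target load response (assumed definition)."""
    p0 = power[0]
    sign = 1.0 if target_delta >= 0 else -1.0
    peak = max(sign * (p - p0) for p in power)
    return max(0.0, (peak - abs(target_delta)) / abs(target_delta))


# Hypothetical ramp-up event: power rises 1 MW per second from 600 MW to 610 MW.
ramp = [600.0 + k for k in range(11)]
t75 = time_to_fraction(ramp, target_delta=10.0, fraction=0.75)
t90 = time_to_fraction(ramp, target_delta=10.0, fraction=0.90)
```

With a 1 s sampling interval the returned times are quantized to whole seconds; interpolating between samples would be a natural refinement.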

3. Methodology

3.1. Snake Optimizer Algorithm

In 2022, Professor Abdelazim G. Hussien drew inspiration from the predatory, combat, and mating behaviors of snakes in nature to pioneer the snake optimizer (SO) algorithm. According to [21], SO equally divides the snake population into two groups: male and female, as shown in Equation (1). Females only fight or mate with males when the temperature (Temp) is low, and the food quantity (Q) is sufficient. Temp and Q are given by Equations (2) and (3), respectively.
$$N_m \approx N/2, \quad N_f = N - N_m \tag{1}$$
$$Temp = \exp\left(-\frac{t}{T}\right) \tag{2}$$
$$Q = c_1 \exp\left(\frac{t - T}{T}\right) \tag{3}$$
Among them, N represents the total number of individuals, while Nm and Nf denote the number of male and female individuals, respectively. T and t denote the maximum number of iterations and the current iteration number, respectively, and c1 takes the value of 0.5. Then, the positions of male snakes can be updated as follows [21].
$$X_{i,m}(t+1) = X_{rand,m}(t) \pm c_2 \exp\left(-\frac{f_{rand,m}}{f_{i,m}}\right) \left( (X_{max} - X_{min}) \times rand + X_{min} \right) \tag{4}$$
Here, Xi,m and Xrand,m represent the positions of male snakes, with their fitness values denoted as fi,m and frand,m. Xmax and Xmin represent the upper and lower bounds of the positions, rand is a random number between 0 and 1, and c2 is set to 0.05. The position update method for female snakes is the same as that for male snakes.
If the food quantity is sufficient (Q > 0.25), the snake population enters the exploitation stage. If the temperature is high (Temp > 0.6), the whole population moves toward the food according to Equation (5) [21].
$$X_{i,j}(t+1) = X_{food} \pm c_3 \times Temp \times rand \times \left( X_{food} - X_{i,j}(t) \right) \tag{5}$$
where Xi,j and Xfood represent the position of the snake (male or female) and the position of the best individual, respectively, and c3 is set to 2.
If the environmental temperature is low (Temp < 0.6), the snakes enter either combat or mating mode. The combat mode for males is described by Equation (6). Conversely, the mating mode is expressed by Equation (7). Finally, the worst-performing males and females are updated using Equations (8) and (9), and the algorithm continues to run until the termination conditions are met [21].
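$$X_{i,m}(t+1) = X_{i,m}(t) + c_3 \exp\left(-\frac{f_{best,f}}{f_i}\right) \times rand \times \left( Q \times X_{best,f} - X_{i,m}(t) \right) \tag{6}$$
$$X_{i,m}(t+1) = X_{i,m}(t) + c_3 \exp\left(-\frac{f_{i,f}}{f_{i,m}}\right) \times rand \times \left( Q \times X_{i,f}(t) - X_{i,m}(t) \right) \tag{7}$$
$$X_{worst,m} = X_{min} + rand \times (X_{max} - X_{min}) \tag{8}$$
$$X_{worst,f} = X_{min} + rand \times (X_{max} - X_{min}) \tag{9}$$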
where Xbest,f represents the position of the best female snake. fbest,f and fi represent the fitness values of the best female snake and the snake population, respectively. The update process for female snakes is the same as that for male snakes. Xworst,m and Xworst,f correspond to the positions of the worst-performing male and female snakes, respectively.
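The stage selection logic of SO can be sketched in a few lines. The sketch below assumes the convention of the original SO [21], which is also the one used in Section 3.2: Equation (4) is the exploration update applied while food is scarce (Q < 0.25), and Equations (5)–(7) are the exploitation updates. The function names and the string labels are hypothetical and serve only as illustration.

```python
import math

C1 = 0.5              # value of c1 given in the text
Q_THRESHOLD = 0.25    # food quantity threshold
TEMP_THRESHOLD = 0.6  # temperature threshold


def temperature(t, T):
    """Equation (2): Temp = exp(-t / T)."""
    return math.exp(-t / T)


def food_quantity(t, T, c1=C1):
    """Equation (3): Q = c1 * exp((t - T) / T)."""
    return c1 * math.exp((t - T) / T)


def so_phase(t, T):
    """Return which update rule applies at iteration t of T."""
    if food_quantity(t, T) < Q_THRESHOLD:
        return "exploration"       # Equation (4)
    if temperature(t, T) > TEMP_THRESHOLD:
        return "move_to_food"      # Equation (5)
    return "fight_or_mate"         # Equations (6) and (7)


# Phase at a few iterations of a 100-iteration run.
phases = [so_phase(t, 100) for t in (1, 40, 60)]
```

Under these thresholds, exploration lasts until roughly 31% of the iteration budget (Q reaches 0.25 at t/T = 1 + ln 0.5), and the fight or mating mode begins at roughly 51% (Temp falls to 0.6 at t/T = -ln 0.6).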

3.2. Improved Snake Optimizer Algorithm

Although the snake optimizer (SO) algorithm has made significant progress compared to previous algorithms, it still faces considerable challenges when dealing with complex and high-dimensional optimization problems. Due to insufficient population diversity in the SO algorithm, it encounters issues such as slow convergence speed and low convergence accuracy when solving complex optimization problems in engineering applications that involve different dimensions and multiple nonlinear constraints. To address these issues, this paper introduces dynamic update and search mechanisms to enhance the performance of the snake optimization algorithm.
(1)
Dynamic Update Mechanism
In the SO algorithm, the food quantity is crucial for determining whether the algorithm is in the exploration stage or the exploitation stage. Based on ref. [27], a disturbance factor is introduced into Equation (3) to achieve a new dynamic update for c1. The new c1 is calculated according to Equation (10).
$$c_1^{new} = c_1 + \frac{1}{10} \times \cos\left( r_1^{4} \times \frac{\pi}{2} \right) \tag{10}$$
Here, r1 is a random number between 0 and 1.
During the exploration stage, the position updates for male and female snakes as they search for food are calculated using Equation (4). In this paper, a disturbance factor is added to achieve a new dynamic update for c2. The new c2 is calculated according to Equation (11).
$$c_2^{new} = c_2 + \frac{1}{1000} \times \cos\left( r_2^{4} \times \frac{\pi}{2} \right) \tag{11}$$
Here, r2 is a random number between 0 and 1.
During the exploitation stage, the positions of male and female snakes are calculated by Equations (5)–(7). A sine factor is introduced here to improve the algorithm’s convergence speed, and the new c3 is calculated according to Equation (12).
$$c_3^{new} = c_3 - \sin\left( \left( \frac{t}{T} \right)^{4} \times \frac{\pi}{2} \right) \tag{12}$$
(2)
BPED Search Mechanism
To enhance population diversity in the SO algorithm and improve its optimization capability, an innovative strategy called Bidirectional Population Evolution Dynamics (BPED) is introduced, which replaces the random search during egg hatching in the original SO.
First, based on the widely observed Pareto principle in nature, the top 20% of individuals with the highest fitness in the population are retained and allowed to undergo natural variation. For this high-quality population composed of the top 20% of individuals, new mutated individuals, denoted as X′, are obtained according to Equations (13) and (14) based on [28].
$$X_{good,new}^{t} = X_q + w \times \left( X_{best} - \phi \times X_k \right) \tag{13}$$
$$X_{good,new}^{t+1} = \begin{cases} X_{good,new}, & \text{if } f_{good,new} < f_i \\ X_i, & \text{else} \end{cases} \tag{14}$$
Here, Xq and Xk represent two high-quality individuals distinct from Xi, both belonging to the top 20% of the population; Xbest denotes the fittest individual in the population, while w represents a mutation factor based on a sine function, with its exact expression given by Equation (15).
$$w = \sin\left( \frac{2 \pi t + \pi \cdot Dim}{Dim} \right) \times \frac{t + T}{2T} \tag{15}$$
Here, Dim represents the dimension of the data, T denotes the total number of iterations, and t indicates the current iteration count of the algorithm.
The second component of the BPED strategy involves mutation, elimination, and population migration, applying the PED strategy to the remaining 80% of individuals. This process generates new individuals, denoted as X′. The latter 80% of individuals are randomly divided into two groups. Individuals randomly assigned to the first group undergo variation around the best individual in the population, as shown in Equation (16).
$$X_{bad,new}(t) = X_{best} + \mathrm{sign}(rand - 0.5) \times \left( lb_{iap} + rand \times (ub_{iap} - lb_{iap}) \right) \tag{16}$$
Among the remaining 80% of individuals, the other part will migrate according to Equation (17), relocating themselves near their initial positions to conduct further exploration.
$$X_{bad,new}^{t+1} = X_{bad,new}^{t} - 2 \times \mathrm{sign}(rand - 0.5) \times \left( lb + rand \times (ub - lb) \right) \tag{17}$$
Together, the dynamic update mechanism and the BPED search mechanism constitute the improved snake optimizer (ISO). Their effect on convergence speed, accuracy, and stability is evaluated in Section 3.3.
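The dynamic coefficients of the first mechanism can be sketched as follows. This is one reading of Equations (10)–(12), not a verified implementation: it assumes the random number and the iteration ratio are raised to the fourth power inside the trigonometric functions, and that the sine term in Equation (12) is subtracted from c3. The function names and the `rng` argument are hypothetical.

```python
import math
import random


def c1_new(c1=0.5, rng=random):
    """Assumed reading of Equation (10): c1 + (1/10) * cos(r1^4 * pi/2)."""
    r1 = rng.random()
    return c1 + 0.1 * math.cos(r1 ** 4 * math.pi / 2)


def c2_new(c2=0.05, rng=random):
    """Assumed reading of Equation (11): c2 + (1/1000) * cos(r2^4 * pi/2)."""
    r2 = rng.random()
    return c2 + 0.001 * math.cos(r2 ** 4 * math.pi / 2)


def c3_new(t, T, c3=2.0):
    """Assumed reading of Equation (12): c3 - sin((t/T)^4 * pi/2)."""
    return c3 - math.sin((t / T) ** 4 * math.pi / 2)


# Draw the perturbed coefficients with a seeded generator for repeatability.
rng = random.Random(0)
samples = [(c1_new(rng=rng), c2_new(rng=rng)) for _ in range(5)]
schedule = [c3_new(t, 100) for t in (0, 50, 100)]
```

Under this reading, c1 is perturbed within [0.5, 0.6] and c2 within [0.05, 0.051], while c3 decays from 2 to 1 over the run, with most of the decay occurring late because of the fourth power.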

3.3. Algorithm Validation

In this study, four test functions (F1–F4) from the CEC2017 benchmark set are selected to validate the convergence performance of the ISO algorithm. To ensure a comprehensive evaluation, the performance of the proposed ISO is compared against four widely recognized benchmark optimizers: the standard snake optimizer (SO), whale optimization algorithm (WOA), gray wolf optimizer (GWO), and particle swarm optimization (PSO).
(1)
The WOA is a nature-inspired metaheuristic algorithm proposed by Mirjalili and Lewis [29]. It mimics the social behavior and unique hunting strategy of humpback whales, specifically the bubble-net feeding technique. The algorithm mathematically models three main phases: encircling prey, spiral bubble-net attacking, and searching for prey. The WOA updates the position of a search agent (whale) using the following key equations:
Whales circle the prey during the hunt. This behavior is formulated as
$$D = \left| C \cdot X^{*}(t) - X(t) \right|$$
$$X(t+1) = X^{*}(t) - A \cdot D$$
where $X^{*}$ is the position of the best solution obtained so far (prey), and $X$ is the position of the current whale.
To simulate the spiral bubble-net attacking motion, a helix-shaped movement is formulated as
$$X(t+1) = D' \cdot e^{bl} \cdot \cos(2 \pi l) + X^{*}(t)$$
where $D' = \left| X^{*}(t) - X(t) \right|$ represents the distance between the whale and the prey, b is a constant defining the shape of the logarithmic spiral, and l is a random number in [−1, 1].
(2)
The GWO algorithm is a population-based metaheuristic algorithm proposed by Mirjalili [30]. It mimics the social hierarchy and cooperative hunting mechanisms of gray wolves in nature. The algorithm mathematically models four types of agents: Alpha (α, the leader), Beta (β, the subordinates), Delta (δ, the followers), and Omega (ω, the lowest ranking). The optimization process relies on three main phases: encircling prey, hunting, and attacking. The GWO algorithm updates the position of a search agent (wolf) using the following key equations:
Wolves circle the prey during the hunt. This behavior is formulated as
$$D = \left| C \cdot X_p(t) - X(t) \right|$$
$$X(t+1) = X_p(t) - A \cdot D$$
where $X_p$ is the position of the prey (optimal solution), and $X$ is the position of the current gray wolf.
Since the Alpha, Beta, and Delta wolves have better knowledge of the prey’s location, they guide the Omega wolves. The position update is calculated using these three leaders:
$$X(t+1) = \frac{X_1 + X_2 + X_3}{3}$$
where X1, X2, and X3 are the updated positions influenced by Alpha, Beta, and Delta, respectively, calculated using the encircling equations.
(3)
The PSO algorithm was originally proposed by Kennedy and Eberhart [31]. The canonical PSO algorithm updates the velocity and position of each particle i in a D-dimensional search space using the following two equations.
The velocity of particle i at iteration t + 1 is calculated as
$$v_{i,d}^{t+1} = w \, v_{i,d}^{t} + c_1 r_1 \left( pbest_{i,d} - x_{i,d}^{t} \right) + c_2 r_2 \left( gbest_{d} - x_{i,d}^{t} \right)$$
The new position of particle i is determined by adding the updated velocity to its current position:
$$x_{i,d}^{t+1} = x_{i,d}^{t} + v_{i,d}^{t+1}$$
Here, $x_{i,d}^{t}$ and $v_{i,d}^{t}$ are the position and velocity of particle i in dimension d at iteration t; w is the inertial weight; c1 and c2 are acceleration coefficients; r1 and r2 are random numbers uniformly distributed in [0, 1]; $pbest_{i,d}$ is the personal best position achieved by particle i so far; $gbest_{d}$ is the global best position found by the entire swarm so far.
These three algorithms were selected because they represent the evolutionary trajectory of swarm intelligence algorithms and are commonly applied in hyperparameter tuning tasks for machine learning models, providing a robust baseline for comparison.
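As a concrete instance of the benchmark update rules, the canonical PSO step described above can be written as a short function. The default values of w, c1, and c2 below are common textbook choices and are assumptions; the settings used in the paper's experiments are not stated in this section.

```python
import random


def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO update for a single particle.

    x, v   : current position and velocity (lists of floats, one entry per dimension)
    pbest  : personal best position of this particle
    gbest  : global best position of the swarm
    w      : inertial weight (assumed default)
    c1, c2 : acceleration coefficients (assumed defaults)
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        vd = w * v[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)  # position update uses the new velocity
    return new_x, new_v


# A particle sitting on both bests with zero velocity does not move.
x1, v1 = pso_step([1.0, 2.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0])
```

The same loop structure applies to the WOA and GWO updates, with the velocity term replaced by their respective encircling or spiral expressions.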
The convergence curves of each optimization algorithm are shown in Figure 1.
It can be seen that compared to the SO algorithm and other classical optimization algorithms, the ISO algorithm has achieved significant improvements in convergence speed, stability, and accuracy. Therefore, this paper applies the ISO algorithm to the hyperparameter optimization of the multi-label random forest.

3.4. Hyperparameter Optimization of Multi-Label Random Forest Based on ISO

While the standard snake optimizer (SO) demonstrates promising performance, it may encounter limitations such as premature convergence when handling complex feature spaces. To address this, we propose an improved snake optimizer (ISO). Subsequently, we integrate this enhanced optimizer with a multi-label random forest classifier, resulting in the ISOMLRF hybrid model, which serves as the primary experimental framework for this study.
The hyperparameter optimization process of the MLRF based on the improved snake optimizer (ISO) algorithm is illustrated in Figure 2. First, a feature parameter dataset and a multi-label dataset for model training are constructed. Next, the hyperparameters of the MLRF (including the number of trees, maximum depth, minimum leaf size, and feature selection ratio) are introduced from the iterative ISO algorithm, and multi-label decision trees are built accordingly. Finally, the MLRF classification model is trained using the dataset, and the training error (Hamming Loss) is computed as the fitness to iteratively optimize the hyperparameters.
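The coupling between the optimizer and the classifier can be sketched as a fitness function that decodes a candidate position into the four MLRF hyperparameters and returns the Hamming Loss. The search ranges in `BOUNDS`, the normalization of positions to [0, 1], and the `train_and_predict` callback are hypothetical placeholders; the paper's implementation is in MATLAB and its actual ranges are those reported in Table 8.

```python
# Hypothetical search ranges: (lower bound, upper bound, type).
BOUNDS = {
    "n_trees": (10, 500, int),
    "max_depth": (2, 30, int),
    "min_leaf": (1, 20, int),
    "feature_ratio": (0.1, 1.0, float),
}


def decode(position):
    """Map a position in [0, 1]^4 to a hyperparameter dictionary."""
    params = {}
    for u, (name, (lo, hi, kind)) in zip(position, BOUNDS.items()):
        u = min(1.0, max(0.0, u))  # clip to the unit interval
        value = lo + u * (hi - lo)
        params[name] = int(round(value)) if kind is int else value
    return params


def hamming_loss(y_true, y_pred):
    """Fraction of label entries that differ between truth and prediction."""
    n, L = len(y_true), len(y_true[0])
    wrong = sum(a != b for yt, yp in zip(y_true, y_pred) for a, b in zip(yt, yp))
    return wrong / (n * L)


def fitness(position, train_and_predict, X, Y):
    """Fitness of one candidate: training-set Hamming Loss of the resulting model.

    train_and_predict(params, X, Y) is a user-supplied callback (hypothetical)
    that trains a multi-label random forest and returns its predictions for X.
    """
    params = decode(position)
    return hamming_loss(Y, train_and_predict(params, X, Y))
```

Any optimizer that minimizes a scalar function over a box, including SO and ISO, can then be applied to `fitness` without knowing anything about the classifier.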

3.5. Diagnostic Procedure for Constrained PFR Capability in Units

The diagnostic procedure for constrained PFR capability in coal-fired units, based on the ISOMLRF, is illustrated in Figure 3.
In practical applications, online diagnosis of constrained PFR capability can be achieved by identifying the constrained regulation process online, extracting the characterization parameters of the PFR capability, and utilizing the ISO-MLRF diagnostic model to obtain predicted classification labels.

3.6. Evaluation of Model Effectiveness

Multi-label classification algorithms are typically evaluated using the following metrics [12]:
(1)
Hamming Loss
$$F = \frac{1}{nL} \sum_{i=1}^{n} \sum_{j=1}^{L} \left| y^{predict}_{ij} - y^{test}_{ij} \right|$$
where n is the number of samples in the test set; L is the number of labels; $y^{predict}_{ij}$ is the predicted value of label j for sample i; $y^{test}_{ij}$ is the corresponding true value.
(2)
Macro-averaging F1 Score
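$$F_1 = \frac{1}{L} \sum_{j=1}^{L} \frac{2 \times P_j \times R_j}{P_j + R_j}, \quad P_j = \frac{TP_j}{TP_j + FP_j}, \quad R_j = \frac{TP_j}{TP_j + FN_j}$$
where $P_j$ and $R_j$ are the precision and recall of label j.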
(3)
Alarm Rates
False Alarm Rate (FAR):
$$FAR = \frac{FP}{TN + FP}$$
Missed Alarm Rate (MAR):
$$MAR = \frac{FN}{TP + FN}$$
where TN is the number of normal samples correctly predicted as normal (True Negatives); TP is the number of abnormal samples correctly predicted as abnormal (True Positives); FN is the number of abnormal samples incorrectly predicted as normal (False Negatives); FP is the number of normal samples incorrectly predicted as abnormal (False Positives).
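The metrics above can be computed directly from binary label matrices. The following is a minimal sketch with hypothetical function names; the per-label F1 is written as 2TP / (2TP + FP + FN), which is algebraically the harmonic mean of precision and recall, and rates with an empty denominator are reported as 0.0 by assumption.

```python
def confusion_counts(y_true, y_pred, j):
    """Return (TP, FP, FN, TN) for label j."""
    tp = fp = fn = tn = 0
    for yt, yp in zip(y_true, y_pred):
        if yt[j] == 1 and yp[j] == 1:
            tp += 1
        elif yt[j] == 0 and yp[j] == 1:
            fp += 1
        elif yt[j] == 1 and yp[j] == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def hamming_loss(y_true, y_pred):
    """Fraction of label entries predicted incorrectly."""
    n, L = len(y_true), len(y_true[0])
    return sum(a != b for yt, yp in zip(y_true, y_pred) for a, b in zip(yt, yp)) / (n * L)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    L = len(y_true[0])
    total = 0.0
    for j in range(L):
        tp, fp, fn, _ = confusion_counts(y_true, y_pred, j)
        denom = 2 * tp + fp + fn
        total += (2 * tp / denom) if denom else 0.0
    return total / L


def alarm_rates(y_true, y_pred, j):
    """Return (FAR, MAR) for label j."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, j)
    far = fp / (tn + fp) if (tn + fp) else 0.0
    mar = fn / (tp + fn) if (tp + fn) else 0.0
    return far, mar


# Toy example: four samples, two labels.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 0], [1, 1], [1, 0]]
loss = hamming_loss(y_true, y_pred)
rates_label_0 = alarm_rates(y_true, y_pred, 0)
```

In the toy example, two of the eight label entries are wrong, label 0 has one false alarm among two normal samples, and label 1 has one missed alarm among two abnormal samples.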

3.7. Experimental Platform and Implementation Details

To ensure computational efficiency, all experiments were performed on a workstation configured with an Intel Core i7-14700KF processor (up to 5.6 GHz), an NVIDIA RTX 4090 graphics card (24 GB GDDR6X), and 32 GB of DDR5 RAM, operating on Windows 11. The proposed methodology was developed within the MATLAB R2024a environment. Specifically, we leveraged the Parallel Computing Toolbox to accelerate computations and the Statistics and Machine Learning Toolbox for data analysis.

4. Experimental Result and Analysis

4.1. Simulation Data

4.1.1. Data Generation

To rigorously evaluate the performance of the proposed ISO-MLRF algorithm, two sets of multi-label synthetic datasets were generated following the standard procedures established in [32]. The generation process was implemented using MATLAB R2024a and consisted of the following steps:
Initially, the algorithm defines a D-dimensional space and randomly generates K hypercubes, where K is the number of labels. Each hypercube is defined by its center coordinates and a side length, determining its boundaries. To create a synthetic data point, the algorithm randomly selects a subset of these hypercubes. It then samples a point within the geometric intersection or proximity of the selected hypercubes.
The core mechanism relies on geometric overlap: if a point falls inside a specific hypercube’s boundaries, it is assigned the corresponding label (value 1); otherwise, it is 0. By controlling the centers and sizes of the hypercubes, the method regulates label co-occurrence and correlation. For instance, overlapping hypercubes generate positive examples for multiple labels simultaneously, effectively simulating real-world multi-label scenarios where features trigger multiple outcomes.
Finally, Gaussian noise is added to the points to increase data realism and complexity. The output is a dataset of feature vectors paired with a binary label vector indicating the presence or absence of each label. This method ensures a balanced and controllable generation of complex label dependencies.
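The generation steps above can be sketched as follows. This is a simplified illustration rather than the procedure of [32]: the unit feature domain, the side-length range, and the rule of sampling each point inside one randomly chosen hypercube from the selected subset (instead of their intersection or proximity) are assumptions, and all names are hypothetical.

```python
import random


def make_hypercubes(n_labels, dim, rng, side_range=(0.3, 0.6)):
    """Create one axis-aligned hypercube per label inside the unit domain [0, 1]^dim."""
    cubes = []
    for _ in range(n_labels):
        side = rng.uniform(*side_range)
        center = [rng.uniform(side / 2, 1 - side / 2) for _ in range(dim)]
        cubes.append((center, side))
    return cubes


def labels_for(point, cubes):
    """Label j is 1 if the point lies inside hypercube j, otherwise 0."""
    return [
        int(all(abs(p - c) <= side / 2 for p, c in zip(point, center)))
        for center, side in cubes
    ]


def generate(n_samples, dim, n_labels, noise_std=0.05, seed=0):
    """Return (X, Y, cubes): noisy feature vectors, binary label vectors, and the hypercubes."""
    rng = random.Random(seed)
    cubes = make_hypercubes(n_labels, dim, rng)
    X, Y = [], []
    for _ in range(n_samples):
        chosen = rng.sample(cubes, rng.randint(1, n_labels))  # random subset of hypercubes
        center, side = rng.choice(chosen)                     # simplification: sample inside one of them
        point = [rng.uniform(c - side / 2, c + side / 2) for c in center]
        Y.append(labels_for(point, cubes))                    # labels come from the clean point
        X.append([p + rng.gauss(0.0, noise_std) for p in point])  # noise is added afterwards
    return X, Y, cubes


X, Y, cubes = generate(n_samples=200, dim=3, n_labels=2, seed=1)
```

Because labels are assigned before the noise is added, points near a hypercube boundary can end up on the wrong side of it in feature space, which is what makes the classification task non-trivial. In high-dimensional settings such as the 20- and 30-feature datasets used here, full hypercube overlap becomes rare under this simplified scheme, so a faithful generator would need the overlap control described in [32].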
To validate the classification performance of the ISO-MLRF algorithm, this paper generates two sets of simulation datasets. Each dataset contains 5000 samples: dataset 1 includes 20 feature parameters and three labels, and dataset 2 includes 30 feature parameters and 10 labels.

4.1.2. Simulation Analysis

To evaluate the effectiveness of the proposed ISOMLRF algorithm, we conducted a comparative study against one baseline and three benchmark algorithms. Specifically, the comparison includes the standard snake optimizer algorithm (SOMLRF) as the baseline, and the whale optimization algorithm (WOAMLRF), the gray wolf optimizer algorithm (GWOMLRF), and the particle swarm optimization algorithm (PSOMLRF) as benchmarks. For dataset 1 and dataset 2, the iterative convergence curves of the five hyperparameter optimization algorithms on the training set are shown in Figure 4.
It can be seen that for dataset 1, the iteration count and Hamming Loss at convergence for the ISOMLRF algorithm proposed in this paper are both lower than those of the other algorithms, indicating that this algorithm offers superior performance in optimizing random forest hyperparameters. The convergence patterns of the various algorithms on the training set of dataset 2 are similar to those of dataset 1.
The five trained optimized classification diagnostic models and the unoptimized MLRF model were simultaneously applied to the test datasets. The Hamming Loss of the six multi-label classification diagnostic algorithms on the test set is shown in Table 3, while the macro-averaging F1 Score and the average model diagnosis time are presented in Table 4 and Table 5, respectively.
It can be observed that for dataset 1 and dataset 2, which have different numbers of features and labels, the ISO-MLRF model achieves the smallest Hamming Loss on the test sets, with values of 0.141 and 0.160, respectively. It also yields the highest macro-averaging F1 Scores, at 0.861 and 0.841, respectively. The average diagnosis times are 0.746 ms and 0.760 ms, respectively. This demonstrates that the ISO-MLRF model has superior classification and diagnostic performance.
To better compare the classification effectiveness of the different diagnostic algorithms, Figure 5a,c present the FAR and MAR for the classification diagnosis of dataset 1. Similarly, Figure 5b,d show the FAR and MAR for the classification diagnosis of dataset 2.
It can be seen that, for both dataset 1 and dataset 2, the ISOMLRF algorithm exhibits lower and more stable false alarm rates and missed alarm rates compared to other algorithms. Consequently, it can more effectively achieve the classification and diagnosis of the simulated datasets.

4.2. Plant Data

4.2.1. Data Acquisition

This study utilizes historical DCS monitoring data from a 660 MW ultra-supercritical coal-fired unit, with a 1 s sampling interval. Directly sampled features include parameters such as power, grid frequency, rotational speed, main steam pressure, main steam temperature, intermediate point pressure, intermediate point temperature, valve opening, and AGC command.
The sliding window method is used to extract data segments (60 s) corresponding to PFR events. For each PFR segment, process features are calculated, including response delay time, load response time, pressure change rate, intermediate point temperature change rate, power change rate, and valve actuation rate. Figure 6 shows the curves for ramp-up and ramp-down segments of extracted PFR process data.
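A minimal sketch of the segment extraction step is given below. The paper specifies only the 60 s window length and the 1 s sampling interval; the event trigger used here (frequency deviation exceeding a deadband), the 50 Hz nominal frequency, and the ±0.033 Hz deadband are assumptions for illustration, as is the function name.

```python
def extract_pfr_segments(freq, nominal=50.0, deadband=0.033, window=60):
    """Return (start, end) index pairs of fixed-length windows that begin at a PFR event.

    freq     : grid frequency samples in Hz at a 1 s interval
    nominal  : nominal grid frequency (assumed 50 Hz)
    deadband : frequency deviation that triggers an event (assumed value)
    window   : segment length in samples (60 s at 1 s sampling)
    """
    segments = []
    k = 0
    while k + window <= len(freq):
        if abs(freq[k] - nominal) > deadband:
            segments.append((k, k + window))
            k += window  # skip past this event so that segments do not overlap
        else:
            k += 1       # slide the window start by one sample
    return segments


# Hypothetical record: a 30 s under-frequency event starting at t = 100 s.
freq = [50.0] * 100 + [49.9] * 30 + [50.0] * 100
segments = extract_pfr_segments(freq)
```

Each returned index pair can then be used to slice the other DCS signals (power, main steam pressure, valve opening) so that the process features are computed over the same 60 s interval.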
Using PFR performance indicators (response delay time, load response time, regulation accuracy, and contribution rate), PFR data segments with constrained capability are selected. Corresponding datasets for characterizing the unit’s PFR capability and a multi-label dataset for constrained regulation capability are constructed separately. The label data represent the reasons for constrained PFR capability of the unit, with each label value being “0” or “1”. A value of “0” indicates that the specific reason for constrained capability corresponding to that label did not occur, while “1” indicates that the label is a cause of the unit’s constrained PFR capability.
When collected data deviate significantly from the normal range or contain gaps, data cleaning and gap filling are required. Because the boiler's thermal inertia makes process variables change slowly between consecutive samples, this study fills missing data gaps using forward fill to ensure data integrity. Methods for detecting outliers in the data include the median absolute deviation (MAD) method, the standard deviation method (3σ), and the percentile method. Given sufficient data volume and an adequate number of feature types, this study employs the standard deviation method to detect anomalous data.
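The two cleaning steps can be sketched as follows. The representation of missing samples as `None` and the use of the population standard deviation over the whole series are assumptions made for this example; the function names are hypothetical.

```python
import statistics


def forward_fill(values):
    """Replace each missing sample (None) with the most recent valid sample."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last  # stays None if the series starts with a gap
        filled.append(v)
        last = v
    return filled


def three_sigma_outliers(values):
    """Return the indices of samples more than three standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [k for k, v in enumerate(values) if abs(v - mean) > 3 * std]


# Hypothetical main steam pressure samples with two gaps.
pressure = forward_fill([25.1, None, None, 25.3, None])
spikes = three_sigma_outliers([10.0] * 20 + [100.0])
```

Note that a single large outlier inflates the standard deviation itself, so the 3σ rule is only reliable when the series is long relative to the number of anomalies, which is consistent with the stated condition of sufficient data volume.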

4.2.2. Multi-Label Classification Dataset

Based on the analysis in Section 2.1 and Section 2.2, a dataset for constrained PFR capability in coal-fired units is constructed. The dataset consists of 1000 samples, containing 18 feature parameters and 15 labels for constrained PFR capability. The generated characterization dataset for the unit’s PFR capability is shown in Table 6, and the multi-label dataset for constrained PFR capability is shown in Table 7.

4.2.3. Results Analysis

To ensure reproducibility and provide a clear overview of the optimization process, Table 8 summarizes the specific hyperparameters considered, their respective search ranges, and the final optimal values identified by the optimization process. These tuned parameters were subsequently used for all experiments reported in this study.
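For illustration, the search ranges listed in Table 8 can be written as a search space, together with one possible way of mapping an optimizer position in [0, 1] per dimension onto concrete hyperparameter values. The linear mapping, clipping, and integer rounding are assumptions made for this sketch; the source does not state how ISO encodes candidate solutions.

```python
from typing import Dict, List, Tuple, Union

# Search ranges from Table 8: (lower bound, upper bound, is_integer)
SEARCH_SPACE: Dict[str, Tuple[float, float, bool]] = {
    "numTrees": (10, 200, True),
    "maxDepth": (10, 30, True),
    "minLeafSize": (1, 30, True),
    "numFeatures": (0.1, 0.8, False),
}


def decode(position: List[float]) -> Dict[str, Union[int, float]]:
    """Map a position vector with entries in [0, 1] onto hyperparameter values."""
    if len(position) != len(SEARCH_SPACE):
        raise ValueError("position must have one entry per hyperparameter")
    params: Dict[str, Union[int, float]] = {}
    for u, (name, (lo, hi, is_int)) in zip(position, SEARCH_SPACE.items()):
        u = min(max(u, 0.0), 1.0)        # clip candidates that left the unit box
        value = lo + u * (hi - lo)       # linear scaling into the search range
        params[name] = int(round(value)) if is_int else value
    return params


print(decode([0.0, 0.5, 1.0, 0.25]))
```

An optimizer would evaluate each decoded candidate by training the random forest with those settings and scoring it on held-out data.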
To illustrate the predictive accuracy of the classification for each label in the test set, the confusion matrices are shown in Figure 7. In total, one sample of the “Negative class” is predicted as the “Positive class” (false positive), and 28 samples of the “Positive class” are predicted as the “Negative class” (false negatives). This indicates that the classification algorithm has high accuracy. The low number of false positives is attributed to the relatively small number of “Positive class” samples for each label.
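The link between per-label confusion counts and the alarm rates reported later can be sketched as below. The definitions used, FAR = FP / (FP + TN) and MAR = FN / (FN + TP), are the conventional ones and are assumed here rather than taken from this section; the function name is illustrative.

```python
from typing import List, Tuple


def far_mar_per_label(y_true: List[List[int]], y_pred: List[List[int]]) -> List[Tuple[float, float]]:
    """Return (FAR, MAR) for each label column of a 0/1 multi-label matrix.
    Assumed definitions: FAR = FP / (FP + TN), MAR = FN / (FN + TP)."""
    n_labels = len(y_true[0])
    rates = []
    for j in range(n_labels):
        tp = fp = tn = fn = 0
        for truth, pred in zip(y_true, y_pred):
            if truth[j] == 1 and pred[j] == 1:
                tp += 1
            elif truth[j] == 0 and pred[j] == 1:
                fp += 1  # false alarm
            elif truth[j] == 0 and pred[j] == 0:
                tn += 1
            else:
                fn += 1  # missed alarm
        far = fp / (fp + tn) if (fp + tn) else 0.0
        mar = fn / (fn + tp) if (fn + tp) else 0.0
        rates.append((far, mar))
    return rates


y_true = [[1, 0], [1, 0], [0, 0], [0, 1]]
y_pred = [[1, 0], [0, 0], [0, 1], [0, 1]]
rates = far_mar_per_label(y_true, y_pred)
avg_far = sum(r[0] for r in rates) / len(rates)
avg_mar = sum(r[1] for r in rates) / len(rates)
print(rates, avg_far, avg_mar)
```

With few positive samples per label, the FAR denominator (FP + TN) is large while the MAR denominator (FN + TP) is small, so a handful of false negatives moves the MAR far more than a single false positive moves the FAR.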
To compare the multi-label classification prediction performance of various algorithms, Table 9 presents the false alarm rates, missed alarm rates, training time, and testing time of the six algorithms for the classification diagnosis of the unit’s field dataset.
As shown in Table 9, the primary advantage of ISO-MLRF lies in its ability to balance precision and robustness. Compared to SO-MLRF, our method reduces the average missed alarm rate by 27% (from 0.073 to 0.053). This improvement is attributed to the dynamic update mechanisms in ISO, which help prevent the search from becoming trapped in local optima, a common issue in traditional swarm intelligence algorithms.
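The quoted reduction follows directly from the two MAR values in Table 9, as the short worked calculation below shows (the subscripts are shorthand introduced here for the SO- and ISO-optimized models):

```latex
\[
\frac{\mathrm{MAR}_{\mathrm{SO}} - \mathrm{MAR}_{\mathrm{ISO}}}{\mathrm{MAR}_{\mathrm{SO}}}
= \frac{0.073 - 0.053}{0.073}
= \frac{0.020}{0.073}
\approx 0.274 \approx 27\%
\]
```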
Although the training time of ISO-MLRF is higher than that of the static MLRF because of hyperparameter optimization (12.5 s vs. 5.5 s), it remains substantially lower than that of SO-MLRF and PSO-MLRF (12.5 s vs. 48.3 s and 41.5 s). This indicates that the proposed algorithm converges more efficiently, making it more suitable for practical engineering applications where both diagnostic accuracy and computational cost are critical.
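Expressed as ratios of the training times in Table 9, the comparison works out as follows (the symbol T with a subscript is notation introduced here for the training time of each variant):

```latex
\[
1 - \frac{T_{\mathrm{ISO}}}{T_{\mathrm{SO}}} = 1 - \frac{12.5}{48.3} \approx 0.741,
\qquad
1 - \frac{T_{\mathrm{ISO}}}{T_{\mathrm{PSO}}} = 1 - \frac{12.5}{41.5} \approx 0.699,
\qquad
\frac{T_{\mathrm{ISO}}}{T_{\mathrm{MLRF}}} = \frac{12.5}{5.5} \approx 2.27
\]
```

That is, training takes roughly 74% and 70% less time than the SO- and PSO-optimized variants, and about 2.3 times as long as the untuned MLRF.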
More importantly, in the context of practical application, the average FAR of 0.029% achieved by ISO-MLRF is substantially lower than the 1–5% range typically reported in the recent literature for similar coal-fired unit fault diagnosis tasks [33]. This suggests that our method not only optimizes hyperparameters effectively but also enhances the model’s robustness against noisy operational data.
Furthermore, previous studies [6,8] primarily relied on physical modeling, which often fails to capture the complex interdependencies among multiple constraints. In contrast, our ISO-MLRF algorithm, by utilizing a multi-label random forest optimized by an improved snake optimizer, provides a more holistic diagnosis.
Despite the superior diagnostic accuracy demonstrated by the proposed ISO-MLRF algorithm, several limitations must be acknowledged. First, its computational burden during the training phase is higher than that of static machine learning models (MLRF, SVM) because of the iterative optimization of hyperparameters. Second, as an ensemble learning model, the random forest has an inherent ‘black-box’ nature that limits the interpretability of the diagnosis results for field engineers. Future work will focus on optimizing the algorithm’s computational efficiency and exploring hybrid approaches to enhance model transparency.
To provide deeper insights into the model’s decision-making logic, Figure 8 illustrates the calculated feature importance scores for each specific fault label. The bar chart clearly highlights the primary driving parameters behind every diagnostic category, making the complex internal correlations within the ISO-MLRF model transparent. This granular visualization not only validates the physical consistency of the algorithm but also empowers plant operators by identifying the exact key indicators to monitor for specific PFR constraints, thereby significantly enhancing the interpretability and operational transparency of the diagnostic system.
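As a small illustration of how per-label importance scores translate into a list of indicators to monitor, the sketch below ranks features for each label. The feature names are taken from Table 2, but the scores are invented for the example and are not read from Figure 8; the function name `top_features` is hypothetical.

```python
from typing import Dict, List


def top_features(importances: Dict[str, List[float]],
                 feature_names: List[str],
                 k: int = 2) -> Dict[str, List[str]]:
    """For each label, return the k feature names with the highest importance."""
    ranked = {}
    for label, scores in importances.items():
        if len(scores) != len(feature_names):
            raise ValueError(f"score count mismatch for label {label!r}")
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        ranked[label] = [feature_names[i] for i in order[:k]]
    return ranked


FEATURES = ["Response delay time", "Valve actuation rate", "Deadband", "Settling time"]
SCORES = {  # illustrative numbers only, not values from Figure 8
    "Governor response delay": [0.55, 0.25, 0.05, 0.15],
    "Excessive dead zone": [0.15, 0.05, 0.70, 0.10],
}

for label, names in top_features(SCORES, FEATURES).items():
    print(label, "->", names)
```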

5. Conclusions

(1) The underlying constraints limiting the PFR capability of coal-fired units were analyzed. By integrating current standards with DCS measurement points, a set of quantitative characterization parameters was established, including response time, settling time, and 90% target time. Furthermore, a comprehensive set of limiting factors affecting PFR capability was identified.
(2) The snake optimizer algorithm was enhanced by introducing a dynamic update mechanism and a novel search strategy, effectively improving its convergence speed and accuracy. Subsequently, a multi-label random forest classification algorithm based on the improved snake optimizer (ISO-MLRF) was proposed. This algorithm optimizes the hyperparameters of the random forest, thereby significantly improving classification accuracy.
(3) The model was validated using both synthetic multi-label simulation datasets and actual operational data from a 660 MW ultra-supercritical unit. Results on the simulation datasets demonstrate that, compared to other benchmark algorithms, the ISO-MLRF algorithm achieves the highest convergence accuracy with superior convergence speed. Results on the actual unit dataset show that ISO-MLRF attains higher multi-label classification accuracy, with an average false alarm rate of 0.029% and an average missed alarm rate of 5.3% per label, both lower than or equal to those of the comparative algorithms. Additionally, the average diagnosis time for a single sample is 9.4 ms, satisfying the requirements for online diagnosis.
(4) Deploying the proposed multi-label random forest classification algorithm on the PFR capability monitoring and diagnosis platform enables real-time online diagnosis of constrained PFR capability. This provides crucial technical support for enhancing the regulation capability of source-side units and optimizing grid-side dispatch.
(5) Several directions remain for future work. First, although ISO-MLRF performs well, its computational burden during offline training is relatively high; future work will focus on developing lightweight versions for edge computing devices. Second, the current model relies heavily on historical data quality; integrating physics-informed constraints into the data-driven framework could enhance its generalization under extreme or unseen conditions. Third, the application of this diagnostic framework will be extended to other types of power generation units, such as gas turbines and energy storage systems, to verify its universality.

Author Contributions

Conceptualization, Y.D. and H.L.; methodology, Y.D. and H.W.; software, J.Y.; validation, Z.L., J.L., and D.H.; formal analysis, Y.Z.; investigation, Z.L.; resources, J.Y.; data curation, Y.Z.; writing—original draft preparation, J.Y.; writing—review and editing, Y.D.; visualization, Y.Z.; supervision, Y.D. and H.W.; project administration, H.L. and H.W.; funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Zhejiang Electric Power Co., Ltd. (No. B311DS25Z012).

Data Availability Statement

The data supporting this study can be obtained upon request from the corresponding author. However, due to privacy considerations and the presence of undisclosed intellectual property, these data are not accessible to the public.

Conflicts of Interest

Authors Hongkun Lv and Zhenya Lai were employed by the State Grid Zhejiang Electric Power Company. Author Huahua Wu was employed by State Grid Zhejiang Electric Power Co., Ltd. Authors Jing Li and Dongyu Hua were employed by E.Energy Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ISO: Improved snake optimizer
MLRF: Multi-label random forest
PFR: Primary frequency regulation
LSTM: Long short-term memory
PSO: Particle swarm optimization
GWO: Gray wolf optimizer
GA: Genetic algorithm
SO: Snake optimizer
BYS: Bayesian algorithm
FAR: False alarm rate
MAR: Missed alarm rate

References

  1. Hao, L.; Chen, L.; Huang, Y. Challenges and prospects of primary frequency regulation of coal-fired thermal power units under new power systems. Power Syst. Autom. 2024, 48, 14–29.
  2. Chen, G.; Dong, Y.; Liang, Z. Analysis and reflection on high-quality development of new energy with Chinese characteristics in energy transition. Proc. CSEE 2020, 40, 5493–5506.
  3. GB/T 40595-2021; Grid-Connected Power Source Primary Frequency Regulation Technical Specifications and Test Guidelines. National Standardization Management Committee: Beijing, China, 2021.
  4. Guo, Y.; Xu, F.; Hao, L. Boiler modeling and online determination of parameters in primary frequency regulation. Chin. J. Electr. Eng. 2023, 43, 6551–6562.
  5. Huang, Y.; Hao, L.; Chen, L. Mathematical modeling of extracted condensing steam turbine for multiple PFR technologies and changing operating conditions. Chin. J. Electr. Eng. 2024, 44, 6065–6078.
  6. Zhang, W.; Fang, F.; Dong, Y.; Liu, J. Assessment of primary frequency regulation capability of coal-fired power units for deep peak shaving. Proc. CSEE 2025.
  7. Wang, J.; Su, J.; Zhao, Y. Performance assessment of primary frequency control responses for thermal power generation units using system identification techniques. Int. J. Electr. Power Energy Syst. 2018, 100, 81–90.
  8. Pang, D.; Qin, T.; Du, M.; Niu, Y. Stability improvement of boiler heat storage utilization by observer based on PID. J. Chin. Soc. Power Eng. 2025, 45, 1905–1913.
  9. Jin, F.; Hao, X.; Wang, B. Modeling of PFR capability of thermal power units based on QPSO-LSTM network. Therm. Power Eng. 2023, 38, 80–87.
  10. Zhang, X.; Wang, Z.; Xia, D. On-line estimation of primary frequency regulation capability of deep peak regulation thermal power unit based on LSTM neural network. Therm. Power Gener. 2023, 52, 172–178.
  11. Tang, M.; Liang, Z.; Ji, D. Inadequate load output diagnosis of ultra-supercritical thermal power units based on MIWOA multi-label random forest. Appl. Therm. Eng. 2023, 227, 120386.
  12. Wang, R.; Ye, S.; Li, K.; Kwong, S. Bayesian network based label correlation analysis for multi-label classifier chain. Inf. Sci. 2021, 554, 256–275.
  13. Shan, J.; Hou, C.; Tao, H. Co-learning binary classifiers for LP-based multi-label classification. Cogn. Syst. Res. 2019, 55, 146–152.
  14. Li, P.; Liu, Z.; Anduv, B. Diagnosis for multiple faults of chiller using ELM-KNN model enhanced by multi-label learning and specific feature combinations. Build. Environ. 2022, 214, 108904.
  15. Liu, Y.; Liu, C.; Shen, Y. Non-intrusive energy estimation using random forest based multi-label classification and integer linear programming. Energy Rep. 2021, 7, 283–291.
  16. Luo, J.; Wang, L.; Gao, W.; Jiang, H. Prediction of ventilation air methane explosion in regenerative thermal oxidation based on hyperparameter-optimized random forest algorithm. J. Loss Prev. Process Ind. 2025, 98, 105757.
  17. Ruba, A.; Cenk, B.; Idil, C. Integrating metaheuristic optimization algorithms with random forest to predict waste generation in construction and demolition projects. Autom. Constr. 2026, 182, 106732.
  18. Antanios, K.; Ali, B.; Bassel, S. Enhancing CNN-based network intrusion detection through hyperparameter optimization. Intell. Syst. Appl. 2025, 26, 200528.
  19. Zahraa, A.; Alhussan, D.; Sami, K. A snake optimization algorithm-based feature selection framework for rapid detection of cardiovascular disease in its early stages. Biomed. Signal Process. Control 2025, 102, 107417.
  20. Sun, D.; Wen, H.; Wang, D. A random forest model of landslide susceptibility mapping based on hyperparameter optimization using Bayes algorithm. Geomorphology 2020, 362, 107201.
  21. Hashim, F.; Hussien, A. Snake Optimizer: A novel meta-heuristic optimization algorithm. Knowl.-Based Syst. 2022, 242, 108320.
  22. Wang, G.; Hao, T.; Zhang, J.; Liu, K. Analysis on influencing factors of passing rate of primary frequency regulation of thermal power units. Electr. Power 2014, 47, 23–26.
  23. Tang, Y.; Wan, J.; Guo, W. The diagnosis and solution of an unstable fault of turbine primary frequency modulation performance. Turbine Technol. 2018, 60, 371–374.
  24. Zhang, X.; Wang, J.; Guo, H. Analysis and treatment of load fluctuation caused by abnormal steam turbine governing system. Power Syst. Eng. 2023, 39, 57–59.
  25. GB/T 40590-2021; Guide for Technology and Test on Primary Frequency Control of Grid-Connected Power Resource. National Standardization Management Committee: Beijing, China, 2021.
  26. GB/T 30370-2022; Guide for Primary Frequency Control Test and Performance Acceptance for Thermal Power Generating Units. National Standardization Management Committee: Beijing, China, 2022.
  27. Hu, G.; Yang, R.; Abbas, M. Multi-strategy boosted snake-inspired optimizer for engineering applications. J. Bionic Eng. 2023, 20, 1791–1827.
  28. Zhu, Y.; Huang, H.; Wei, J. ISO: An improved snake optimizer with multi-strategy enhancement for engineering optimization. Expert Syst. Appl. 2025, 281, 127660.
  29. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
  30. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
  31. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of ICNN’95—International Conference on Neural Networks; IEEE: Piscataway, NJ, USA, 1995; Volume 4, pp. 1942–1948.
  32. Jimena, T.; Newton, S. A framework to generate synthetic multi-label datasets. Electron. Notes Theor. Comput. Sci. 2014, 302, 155–176.
  33. Xue, F. DE-FE0031763: Deep Analysis Net with Causal Embedding for Coal Fired Power Plant Fault Detection and Diagnosis (DANCE4CFDD); National Energy Technology Laboratory: Morgantown, WV, USA, 2020.
Figure 1. Convergence curves of optimization algorithms.
Figure 2. Hyperparameter optimization process of MLRF based on the ISO algorithm.
Figure 3. Diagnosis flowchart of constrained PFR capability for coal-fired units based on ISO-MLRF.
Figure 4. Iteration convergence curve for datasets.
Figure 5. FARs and MARs for the two datasets.
Figure 6. PFR process curves.
Figure 7. Confusion matrix of each label.
Figure 8. Feature importance of each label.
Table 1. The main reasons for the constrained PFR capability of the unit.
| Main Categories | Main Reasons |
|---|---|
| Operational condition | High load rate or low load rate |
| | Insufficient equipment reserve capacity |
| Boiler heat storage capacity | High thermal inertia of the boiler |
| | Steam-water system delay |
| | Improper coordinated control mode |
| Poor regulation system performance | Delayed governor response |
| | Low control valve accuracy |
| | Logic conflict between CCS and DEH |
| Improper control system parameter settings | Improper frequency regulation coefficient |
| | Stringent limiter settings |
| | Excessive deadband setting |
| Auxiliary equipment and systems | Slow response of the condensate pump |
| | Slow response of the feedwater or heating regulation |
| Equipment aging | Actuator aging |
| | Sensor accuracy degradation |
Table 2. Characteristic parameters of PFR capability [6,25,26].
| No. | Characterization Parameters | Unit | Threshold |
|---|---|---|---|
| 1 | Response delay time | s | <3 |
| 2 | Time to reach 90% of the target load response | s | <30 |
| 3 | Deadband | r/min | ±2 |
| 4 | Speed governing droop | % | 4~5 |
| 5 | Load regulation quantity | % | 6~10 |
| 6 | Settling time | s | <60 |
| 7 | Contribution energy | % | >90 |
| 8 | Valve actuation rate | %/s | >5 |
| 9 | Main steam pressure change rate | MPa/s | <0.8 |
| 10 | Intermediate point temperature change rate | °C/s | <2 |
| 11 | Power adjustment rate | %/s | >1 |
| 12 | Initial load rate | % | - |
| 13 | Initial main steam pressure | MPa | - |
| 14 | Initial frequency deviation | Hz | - |
| 15 | Reserve capacity | % | >2 |
| 16 | Frequency regulation power limit | % | >±5 |
| 17 | Deviation between valve opening and command | % | <±2 |
| 18 | Condensate flow regulation rate | s | <30 |
Table 3. Hamming loss.
| Test Data | ISO | SO | WOA | GWO | PSO | MLRF |
|---|---|---|---|---|---|---|
| data 1 | 0.141 | 0.153 | 0.157 | 0.159 | 0.155 | 0.234 |
| data 2 | 0.160 | 0.177 | 0.171 | 0.170 | 0.170 | 0.254 |
Table 4. Macro-average F1 score.
| Test Data | ISO | SO | WOA | GWO | PSO | MLRF |
|---|---|---|---|---|---|---|
| data 1 | 0.861 | 0.846 | 0.841 | 0.843 | 0.845 | 0.767 |
| data 2 | 0.841 | 0.825 | 0.826 | 0.832 | 0.831 | 0.746 |
Table 5. Average diagnosis time of the model.
| Test Data | ISO | SO | WOA | GWO | PSO | MLRF |
|---|---|---|---|---|---|---|
| data 1 | 0.746 | 0.761 | 0.783 | 0.743 | 0.739 | 0.767 |
| data 2 | 0.760 | 0.771 | 0.792 | 0.832 | 0.775 | 0.787 |
Table 6. Characterization parameters dataset of PFR capability for coal-fired unit.
| No. | Response Delay Time | Settling Time | 90% Target Load Response Time | Valve Opening Deviation | Delay Time |
|---|---|---|---|---|---|
| 1 | 3.35 | 47.46 | 34.41 | 0.039 | 12.62 |
| 2 | 2.22 | 46.88 | 39.83 | 0.040 | 19.18 |
| 3 | 3.94 | 66.19 | 33.82 | 0.004 | 42.99 |
| 4 | 2.13 | 52.52 | 41.43 | 0.041 | 45.42 |
| 1000 | 3.94 | 30.57 | 24.87 | 0.010 | 43.69 |
Table 7. Multiple-label dataset of constrained PFR capability.
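| No. | Governor Response Delay (PV) | Governor Response Delay (AV) | Poor Valve Control Accuracy (PV) | Poor Valve Control Accuracy (AV) | Excessive Dead Zone (PV) | Excessive Dead Zone (AV) | Actuator Aging (PV) | Actuator Aging (AV) |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| 3 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| 1000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Note: PV = predicted value, AV = actual value.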
Table 8. The specific hyperparameters.
| Hyperparameter | Description | Search Range | Optimal Value |
|---|---|---|---|
| numTrees | Number of trees | [10, 200] | 182 |
| maxDepth | Maximum depth | [10, 30] | 20 |
| minLeafSize | Minimum leaf size | [1, 30] | 15 |
| numFeatures | Feature selection ratio | [0.1, 0.8] | 0.583 |
Table 9. The performance of the classification diagnosis algorithm for the coal-fired unit’s dataset.
| Metric | ISO | SO | WOA | GWO | PSO | MLRF |
|---|---|---|---|---|---|---|
| FAR | 0.029% | 0.029% | 0.055% | 0.055% | 0.029% | 0.071% |
| MAR | 0.053 | 0.073 | 0.053 | 0.062 | 0.061 | 0.083 |
| Testing Time (s) | 0.0094 | 0.0129 | 0.0119 | 0.0115 | 0.0112 | 0.0058 |
| Training Time (s) | 12.5 | 48.3 | 45.2 | 42.0 | 41.5 | 5.5 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dong, Y.; Lv, H.; Wu, H.; Yang, J.; Lai, Z.; Zhang, Y.; Li, J.; Hua, D. Intelligent Diagnosis Method for Constrained Primary Frequency Regulation Capacity of Coal-Fired Units Based on ISO-MLRF. Processes 2026, 14, 1658. https://doi.org/10.3390/pr14101658