Article

Permeability Index Modeling with Multiscale Time Delay Characteristics Excavation in Blast Furnace Ironmaking Process

State Key Laboratory of Industrial Control Technology (ICT), Zhejiang University, Hangzhou 310027, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(23), 4670; https://doi.org/10.3390/electronics14234670
Submission received: 30 October 2025 / Revised: 24 November 2025 / Accepted: 24 November 2025 / Published: 27 November 2025

Abstract

The permeability index (PI) is a key comprehensive indicator that reflects the smoothness of internal gas flow in pig iron production via blast furnace. Accurate prediction of the PI is essential for forecasting abnormal furnace conditions and preventing potential faults. However, developing an early prediction model for PI has been neglected in existing research, and such modeling faces major challenges due to the strong nonlinearity, undesirable nonstationarity, and significant multiscale time delays inherent in blast furnace data. To bridge this gap, a new modeling paradigm for PI is proposed to explore the inherent time delay characteristics among multiple variables. First, the data are progressively decomposed into multiple components using wavelet decomposition and spike separation. Then, a novel delay extraction method based on wavelet coherence analysis is developed to obtain accurate multiscale time delay knowledge. Furthermore, the integration of Orthonormal Subspace Analysis (OSA) and wavelet neural network (WNN) achieves comprehensive modeling across time and frequency domains, incorporating global and local features. A Gauss–Markov-based fusion framework is also utilized to reduce the output error variance, ultimately enabling the early prediction of PI. Mechanism analysis and a practical case study on blast furnace production verify the effectiveness of the proposed target-oriented prediction framework.

1. Introduction

Blast furnace production (BFP), which produces molten pig iron, is one of the most critical links in the entire steel manufacturing system, and its energy consumption accounts for more than 70% of that of the steel industry [1,2,3]. Uninterrupted heat transfer, momentum transfer, mass transfer, and various chemical reactions make the blast furnace a complicated system whose identification has so far remained elusive. The permeability index (PI) is an important operation parameter of BFP, reflecting whether the internal airflow of the blast furnace is smooth [4] and significantly impacting subsequent slag formation and desulfurization processes. This indicator is governed by a complex interplay between raw material properties and key operational parameters. The size, strength, and high-temperature behavior of the ferrous burden (sinter/pellets) and coke directly determine the voidage and resistance of the packed bed, forming the physical foundation for gas flow. Furthermore, the moisture content in raw materials can increase particle adhesion, alter burden distribution, and consume significant energy during vaporization, all of which adversely affect permeability. In addition, the endothermic decomposition of flux not only produces fine particles but also consumes energy, thereby affecting thermal equilibrium. Simultaneously, operation variables such as the blast volume, temperature, and oxygen enrichment rate critically influence the gas flow rate, combustion zone dynamics, and the shape of the cohesive zone, thereby altering permeability. If the operating systems of BFP (including the burden charging system, blast supply system, thermal system, and slag system, as shown in Figure 1a) are set unreasonably, faults are likely to occur inside the blast furnace, causing abnormal values of PI [5,6,7].
However, due to the strong nonlinearity and nonstationarity of the blast furnace process caused by the aforementioned factors, accurately predicting PI in advance remains challenging. Many existing studies have not fully addressed these intertwined challenges, especially the inherent time delay characteristics between variables, which inspired us to conduct this research.
The variation process of the PI has the following main characteristics:
  • Multiple Time Scales: Various parameters of BFP belong to different operating systems, so they have different time scales of influence on PI [8]. The burden charging operation on the upper part of blast furnace changes the PI primarily by affecting the burden distribution, which requires a long transition period of about 6–8 h. The oxygen enrichment rate changes the height of the soft melting zone and the size of the combustion zone by influencing the internal temperature distribution of the blast furnace, thereby affecting the PI. Since it mainly affects the lower burden of the blast furnace, the influence time scale is 15–30 min. The blast supply system directly affects the distribution of gas flow and pressure, thus completely altering PI in just a few minutes [9].
  • Nonstationarity: As illustrated in Figure 1b, due to the periodic equipment switching of the heated air subsystem, upward or downward spikes occur in the data, resulting in nonstationarity [10]. Such nonstationarity causes a mode-aliasing phenomenon when time series decomposition methods are used for sequence decomposition, contaminating the decomposed data. A detailed introduction to this issue is provided in Section 3.1.
  • Nonlinearity: Extremely complicated physical and chemical processes in BFP give temporal variables complex nonlinearity characteristics. This unknown-structured nonlinear relationship will pose challenges for modeling. The nonlinear relationship of PI is expressed as follows in Equation (1):
    $$\mathrm{PI}(t) = f\big(X_1(t - t_{X_1}), \theta_{X_1}, X_2(t - t_{X_2}), \theta_{X_2}, \ldots, X_n(t - t_{X_n}), \theta_{X_n}\big) \tag{1}$$
    where $f$ represents the unknown nonlinear relationship between PI and the other parameters, $X_i$ denotes an observation parameter related to PI, $\theta_{X_i}$ is the nonlinear order of the corresponding parameter, and $n$ is the total number of these parameters.
Aiming at such a modeling problem, mechanism-based and data-based methods are the two prevalent approaches for studying industrial processes [11]. There are two main types of mechanism-based methods: fundamental analysis and computational modeling. The former qualitatively or semi-quantitatively analyzes the research objectives based on theories of heat transfer, mass transfer, and so on, as well as expert knowledge of the blast furnace [8]. The latter simulates the internal state of the blast furnace using numerical methods such as computational fluid dynamics (CFD) [12,13] and the discrete element method (DEM) [14,15], based on various physical and chemical laws, to obtain the changes in the research targets. However, these mechanism-based methods have certain limitations. Fundamental analysis methods carry less credibility in tasks expecting specific numerical results, while computational modeling methods demand extensive time and substantial computational resources for modeling and calculation, making it challenging for mechanism-based methods to perform the task of predicting PI.
Data-based methods can effectively handle the nonlinear relationships between industrial variables and uncover the spatiotemporal characteristics within the data to achieve the goal of accurately modeling the target variables [16]. So in recent years, most of the literature has achieved the goal of PI prediction through the application of different intelligent algorithms. Dong et al. performed multi-layer wavelet decomposition on PI to analyze the multiscale characteristic, and combined it with relevant operation parameters to establish the model using the least square support vector machines (LS-SVMs) method for each layer of wavelet coefficients [17]. Su et al. proposed a model for forecasting the future value of PI based on multi-layer extreme learning machine (ML-ELM) and wavelet transform. They solved the high multicollinearity problem of the last hidden layer output of ML-ELM through the principal component analysis (PCA) method, enhancing the generalization performance and stability of the model [18]. Tan et al. conducted a coupling mechanism analysis between PI and airflow, as well as Spearman correlation analysis and maximum information coefficient (MIC) analysis to select key parameters, and ultimately established an intelligent prediction model for blast furnace PI based on the wavelet neural network (WNN) [19]. Liu et al. decomposed the PI according to the difference of frequency bands based on variational mode decomposition (VMD) to obtain multiple sub-modes. For each sub-mode, they constructed a back propagation neural network (BPNN) model optimized by particle swarm optimization (PSO), improving the prediction accuracy of the model [20]. However, research on the PI of blast furnaces is still scarce. Existing studies mostly lack the ability to perform multi-step ahead prediction in advance, and ignore the intrinsic time delay characteristics between different variables and the data nonstationarity caused by the switching of the hot blast stove.
To address the above issues, we propose a PI prediction method based on multiscale time delay characteristic analysis that is guided by a mechanistic understanding of the BFP. First, the nonstationary characteristics of the data caused by the spikes are eliminated by separating the spikes from the sequences. Second, the multiscale delay characteristics are extracted via wavelet decomposition and wavelet coherence analysis. To our knowledge, this is the first time that wavelet coherence analysis has been applied to time series prediction tasks in the industrial field. Using wavelet coherence analysis, the temporal advance of operational variables relative to PI at different wavelet decomposition scales is obtained. Then, Orthonormal Subspace Analysis (OSA) and WNN are carried out at each wavelet decomposition scale to describe the nonlinearity of the variables, and the submodels are fused via Gauss–Markov estimation. The input variables are selected to ensure that the key thermodynamic and physicochemical phenomena (such as those governing the soft melting zone, combustion zone, and gas flow distribution) are adequately represented by the operation parameters. The experimental results on actual datasets demonstrate the high accuracy and stability of the proposed model. A detailed comparison between our model and the models proposed in [17,18,19,20] is summarized in Table 1.
The main contributions of this article lie in the following three aspects:
  • To address the challenge of traditional correlation methods in extracting delays under complex noise conditions in blast furnaces, this article proposes a novel delay extraction method based on wavelet coherence analysis from a multi-resolution perspective and explores reliable multiscale delay knowledge of PI for the first time.
  • To comprehensively address data nonstationarity and nonlinearity, this paper proposes a fusion prediction method based on OSA and WNN, which extracts the spatial characteristics between variables from different perspectives in both time and frequency domains, as well as global and local scales, thereby expanding the representation capability of PI.
  • To achieve accurate early prediction performance of PI, a practical advance prediction framework is built by fusing extracted delay information and integrating spatiotemporal dimensions through the Gauss–Markov estimation method. We conduct experiments on the proposed method and other traditional methods using actual data. Extensive experimental results demonstrate that the proposed method can consistently maintain good predictive performance. With the increase in prediction time interval, the advantages of the proposed method over other methods are more notable.
The remainder of this article is organized as follows. Section 2 introduces the proposed time delay extraction method based on wavelet coherence analysis. The modeling framework of our proposed prediction method is described in Section 3. Then, Section 4 demonstrates the effectiveness of our method through actual BFP cases. Finally, Section 5 provides a summary of the content of this article.

2. Multiscale Time Delay Analysis

2.1. Time Series Multiscale Decomposition

Wavelet transform is a local transformation of time and frequency. Through the scaling and translation of wavelet, wavelet transform achieves the truncation transformation of variable length time windows, which can effectively perform local location and extract information from the signal [21]. Meanwhile, as a time series decomposition method, the multi-resolution analysis ability of wavelet transform endows it with the ability to handle nonstationary time series. By dividing nonstationary time series into sub-sequences with different time distribution characteristics, the nonstationary reduction of each sub-sequence makes it easier to handle, thereby reducing the overall processing difficulty [22].
Assume a signal $x(t) \in V_0$. Let $\phi(t) \in V_0$, and let its integer translations $\{\phi_{0,k}(t) \equiv \phi(t-k) : k \in \mathbb{Z}\}$ satisfy

$$V_0 \equiv \mathrm{span}\{\phi_{0,k}(t) : k \in \mathbb{Z}\} \tag{2}$$

where $\mathrm{span}\{S\}$ denotes the linear span, $\phi(t)$ is called the scaling function, and $V_0$ is called the approximation space with unit scale. When the scale is changed, we have

$$\phi_{j,k}(t) \equiv \frac{1}{\sqrt{2^{j}}}\,\phi_{0,k}\!\left(\frac{t}{2^{j}}\right) = \frac{1}{\sqrt{2^{j}}}\,\phi\!\left(\frac{t}{2^{j}} - k\right) \tag{3}$$

Thus, given $\{\phi_{0,k}(t) : k \in \mathbb{Z}\}$ as an orthonormal basis of the space $V_0$, the set $\{\phi_{j,k}(t) : k \in \mathbb{Z}\}$ constitutes an orthonormal basis of the space $V_j$, where $V_j \equiv \mathrm{span}\{\phi_{j,k}(t) : k \in \mathbb{Z}\}$. The subspace $V_j$ is called the approximation space of scale $\lambda_j = 2^{j}$.
If $\{V_j : j \in \mathbb{Z}\}$ forms a multi-resolution analysis (MRA), we can define

$$V_{j-1} = V_j \oplus W_j \tag{4}$$

Equation (4) means that any element in $V_{j-1}$ can be represented as the sum of two mutually orthogonal elements, one belonging to $V_j$ and the other to $W_j$. In general, $V_j \subset V_{j-1}$, and $W_j \subset V_{j-1}$ is the orthogonal complement of $V_j$ in $V_{j-1}$. The subspace $W_j$ is called the detail space of scale $\tau_j = 2^{j-1}$, where $\tau_j$ represents the characteristic period of the fluctuations captured in this subspace. Let $\psi(t) \in W_0$. Similar to $\{\phi_{0,k}(t) : k \in \mathbb{Z}\}$, we require $\{\psi_{0,k}(t) \equiv \psi(t-k) : k \in \mathbb{Z}\}$ to be an orthonormal basis of the subspace $W_0$; $\psi(t)$ is called the mother wavelet function. Then we have

$$\psi_{j,k}(t) \equiv \frac{1}{\sqrt{2^{j}}}\,\psi_{0,k}\!\left(\frac{t}{2^{j}}\right) = \frac{1}{\sqrt{2^{j}}}\,\psi\!\left(\frac{t}{2^{j}} - k\right) \tag{5}$$

and $\{\psi_{j,k}(t) : k \in \mathbb{Z}\}$ constitutes an orthonormal basis of the space $W_j$.
Now $x(t) \in V_0$ can be expressed as the sum of its projection components on $V_1$ and $W_1$ using MRA:

$$x(t) = s_1(t) + d_1(t) \tag{6}$$

where $s_1(t)$ and $d_1(t)$ are the projections of $x(t)$ onto the subspaces $V_1$ and $W_1$, given by Equations (7) and (8), respectively:

$$s_1(t) = \sum_{k=-\infty}^{\infty} v_{1,k}\,\phi_{1,k}(t) = \sum_{k=-\infty}^{\infty} \big\langle x(t), \phi_{1,k}(t)\big\rangle\,\phi_{1,k}(t) \tag{7}$$

$$d_1(t) = \sum_{k=-\infty}^{\infty} w_{1,k}\,\psi_{1,k}(t) = \sum_{k=-\infty}^{\infty} \big\langle x(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) \tag{8}$$

Here $s_1(t)$ is called the approximation of the signal $x(t)$, and $d_1(t)$ is the detail missed when $x(t)$ is approximated by $s_1(t)$; $v_{1,k}$ and $w_{1,k}$ are called the approximation coefficients of $s_1(t)$ and the detail coefficients of $d_1(t)$, respectively.
Similarly, since $s_{j-1}(t) \in V_{j-1} = V_j \oplus W_j$, we can divide it between the subspaces $V_j$ and $W_j$:

$$s_{j-1}(t) = s_j(t) + d_j(t) \tag{9}$$

where

$$s_j(t) = \sum_{k=-\infty}^{\infty} v_{j,k}\,\phi_{j,k}(t) = \sum_{k=-\infty}^{\infty} \big\langle s_{j-1}(t), \phi_{j,k}(t)\big\rangle\,\phi_{j,k}(t) \tag{10}$$

$$d_j(t) = \sum_{k=-\infty}^{\infty} w_{j,k}\,\psi_{j,k}(t) = \sum_{k=-\infty}^{\infty} \big\langle s_{j-1}(t), \psi_{j,k}(t)\big\rangle\,\psi_{j,k}(t) \tag{11}$$

Therefore, we can continue decomposing the approximation components at successive scales until the desired level is reached. Ultimately, the reconstruction of the signal $x(t)$ can be expressed as

$$x(t) = s_1(t) + d_1(t) = s_2(t) + d_2(t) + d_1(t) = \cdots = s_J(t) + d_J(t) + \cdots + d_1(t) \tag{12}$$

where $J$ represents the total number of decomposition levels in the wavelet decomposition. The detail of each decomposition layer $d_j(t)$ represents the fluctuation pattern at the corresponding time scale; as the number of decomposition layers increases, the fluctuation frequency gradually decreases. The approximation $s_J(t)$ is the gentlest part of the variation, representing the overall trend of the signal.
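To make the recursive decomposition and reconstruction above concrete, here is a minimal NumPy sketch using the Haar wavelet (chosen only because its filters are trivial; the decomposition wavelet actually used in this work may differ). It returns the approximation and detail components in the time domain so that their sum reconstructs the signal:

```python
import numpy as np

def haar_mra(x, J):
    """Multi-resolution analysis with the Haar wavelet.

    Decomposes x (length divisible by 2**J) into the approximation s_J
    and the details d_J, ..., d_1 as time-domain components that sum
    back to x, as in the multilevel reconstruction above.
    """
    x = np.asarray(x, dtype=float)
    assert len(x) % 2 ** J == 0
    # Forward transform: keep the coefficient arrays of every level.
    approx, details = x, []
    for _ in range(J):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))   # detail coefficients w_{j,k}
        approx = (even + odd) / np.sqrt(2)          # approximation coefficients v_{j,k}

    def inverse(v, ws):
        """Invert the transform from the deepest level back to level 0."""
        for w in reversed(ws):
            out = np.empty(2 * len(v))
            out[0::2] = (v + w) / np.sqrt(2)
            out[1::2] = (v - w) / np.sqrt(2)
            v = out
        return v

    # Reconstruct each component with all other coefficients zeroed.
    s_J = inverse(approx, [np.zeros_like(w) for w in details])
    d_comps = []
    for j, w in enumerate(details):                 # j = 0 corresponds to d_1
        ws = [np.zeros_like(v) for v in details]
        ws[j] = w
        d_comps.append(inverse(np.zeros_like(approx), ws))
    return s_J, d_comps
```

With `J = 3`, `haar_mra(x, 3)` returns `s_3` and `[d_1, d_2, d_3]`, and `s_3 + d_3 + d_2 + d_1` reproduces `x` up to floating-point error.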
Wavelet decomposition and EMD-like methods are commonly used and effective multiscale time series decomposition methods [23]. Compared with classical decomposition, seasonal and trend decomposition using Loess (STL), and other methods that divide a data series into trend, periodic, and random terms, wavelet decomposition and EMD-like methods can split the data into more sub-sequences with more diverse periods and frequencies, which makes it more convenient to excavate the time delay information accurately. We utilize wavelet decomposition instead of EMD-like methods for multiscale decomposition for the following reasons:
(1) Using wavelet transform to decompose time series ensures that the sequences decomposed at the same level share a consistent period, meaning they have data fragments of the same length, correspond to the same physical frequency, and can naturally cooperate with wavelet coherence analysis to obtain the internal time delay at the corresponding scale. By contrast, EMD and the methods derived from it can guarantee neither the same number of intrinsic mode functions (IMFs) for different variables nor consistent scale and frequency of same-level IMFs across variables, since these change with the local characteristics of the data. Therefore, it is difficult to extract detailed intrinsic latency information between variables when using EMD-like methods.
(2) When different variables are decomposed using EMD, data fluctuations cause the number of IMFs obtained and the information contained in each IMF to change as the sliding window moves, thus generating additional dynamic information. The wavelet transform, in contrast, uses the same wavelet for all the data in one decomposition operation, so all changes in the decomposed information originate from the data itself, and the wavelet transform introduces no additional information.
(3) In the process of adaptive signal decomposition, EMD-like methods need to perform cubic spline interpolation on the extrema to obtain the upper and lower envelopes of the signal. Since the extreme points at both ends of a data sequence cannot be clearly identified and the conditions required for spline interpolation cannot be satisfied there, the fitted envelope swings widely at the endpoints, exceeding the signal itself. This spurious swing gradually spreads toward the center of the signal as the decomposition level increases, seriously distorting the decomposition results; this is the so-called end effect. Wavelet decomposition avoids this problem.

2.2. Delay Extraction: Wavelet Coherence Analysis

Industrial time series are usually nonstationary, which means that their frequency content changes over time. For such time series, it is important to perform correlation or coherence analysis in the time–frequency plane. Wavelet coherence analysis was proposed by Torrence and Compo [24] to reveal the relationships between different time series in time–frequency space. It can detect common temporally localized oscillations in nonstationary signals. Additionally, the relative time delay between two series can be identified from the phase of the wavelet cross spectrum if one series is regarded as having an impact on the other. The cross wavelet transform of time series $x(t)$ and $y(t)$ is defined as
$$w_{j,k}^{xy} = w_{j,k}^{x}\,\big(w_{j,k}^{y}\big)^{*} \tag{13}$$

where $w_{j,k}^{x}$ and $w_{j,k}^{y}$ denote the wavelet transform coefficients of $x(t)$ and $y(t)$, respectively, and $(\cdot)^{*}$ denotes the complex conjugate. The cross wavelet transform reveals the common power and relative phase of two time series in time–frequency space. Further, in order to capture the coherence between two series even at low common power, the wavelet coherence is defined as follows:
$$R_{j,k}^{2} = \frac{\left|S\!\left(s_j^{-1} w_{j,k}^{xy}\right)\right|^{2}}{S\!\left(s_j^{-1}\,|w_{j,k}^{x}|^{2}\right)\cdot S\!\left(s_j^{-1}\,|w_{j,k}^{y}|^{2}\right)} \tag{14}$$

where $s_j = 2^{j}$ is the wavelet scale and $S$ is a smoothing operator in both time and scale. Grinsted et al. [25] rewrite the smoothing operator as
$$S(w) = S_{\mathrm{scale}}\big(S_{\mathrm{time}}(w)\big) \tag{15}$$

where $S_{\mathrm{scale}}$ represents smoothing along the wavelet scale axis and $S_{\mathrm{time}}$ represents smoothing in time. An appropriate smoothing operator for the Morlet wavelet is given by

$$S_{\mathrm{time}}(w_{j,k})\big|_{j} = \left.\left(w_{j,k} * c_1\,e^{-k^{2}/(2 s_j^{2})}\right)\right|_{j} \tag{16}$$

$$S_{\mathrm{scale}}(w_{j,k})\big|_{k} = \left.\left(w_{j,k} * c_2\,\Pi(0.6\,s_j)\right)\right|_{k} \tag{17}$$

where $c_1$ and $c_2$ are normalization constants, $\Pi$ is the rectangle function, and $*$ denotes convolution.
Figure 2 is a wavelet coherence diagram of blast temperature and PI, showing the inner relationship between the two series. The area within the black outline is significant at the 0.05 level, which is a credible indication of causality. The shade of the colors in the diagram represents the magnitude of the wavelet coherence: the closer to yellow, the higher the coherence in that time–frequency region, and the closer to blue, the lower the coherence. The arrows in areas of higher coherence indicate the phase difference between the two sequences. An arrow pointing horizontally to the right means the two sequences are in phase, i.e., there is no time delay; an arrow pointing horizontally to the left means the two sequences are negatively correlated. An upward-angled arrow means the first sequence leads, while a downward-angled arrow means the second sequence leads. The corresponding time delay can be calculated from the phase angle. From Figure 2 we can see that, across the period band from 256 to 512, the blast temperature and PI exhibit high coherence at all times, and the arrows point mostly horizontally to the left, indicating that the two series show anti-phase behavior in that period band.
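As a concrete, simplified illustration of these definitions, the following NumPy sketch computes a Morlet-based wavelet coherence. The smoothing used here (a Gaussian in time with width equal to the scale, and a three-point boxcar across scales) is a crude stand-in for the normalized operators above, so this is an illustrative approximation rather than the implementation used in the paper:

```python
import numpy as np

def morlet_cwt(x, scales, w0=6.0):
    """Continuous Morlet wavelet transform computed in the frequency domain."""
    n = len(x)
    xf = np.fft.fft(x)
    omega = 2 * np.pi * np.fft.fftfreq(n)
    W = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Fourier transform of the analytic Morlet wavelet at scale s.
        psi_f = np.pi ** -0.25 * np.exp(-0.5 * (s * omega - w0) ** 2)
        psi_f = psi_f * (omega > 0) * np.sqrt(2 * np.pi * s)
        W[i] = np.fft.ifft(xf * psi_f)
    return W

def smooth(W, scales):
    """Simplified smoothing: Gaussian in time, 3-point boxcar across scales."""
    out = np.empty_like(W)
    n = W.shape[1]
    t = np.arange(n) - n // 2
    for i, s in enumerate(scales):
        g = np.exp(-0.5 * (t / s) ** 2)
        out[i] = np.convolve(W[i], g / g.sum(), mode="same")
    box = np.ones(3) / 3
    for k in range(n):
        out[:, k] = np.convolve(out[:, k], box, mode="same")
    return out

def wavelet_coherence(x, y, scales):
    """Squared coherence R2[j, k] and cross-spectrum phase phase[j, k]."""
    Wx, Wy = morlet_cwt(x, scales), morlet_cwt(y, scales)
    Wxy = Wx * np.conj(Wy)
    sinv = (1.0 / np.asarray(scales))[:, None]      # the s^{-1} normalization
    num = np.abs(smooth(sinv * Wxy, scales)) ** 2
    den = smooth(sinv * np.abs(Wx) ** 2, scales) * smooth(sinv * np.abs(Wy) ** 2, scales)
    return num / den, np.angle(Wxy)
```

For two identical series the estimated coherence is 1 at every time–scale point and the phase is zero; a lag between the series shows up as a nonzero phase angle at the scales where they co-oscillate.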

2.3. Determination of Delay Knowledge

According to the wavelet coherence analysis in Section 2.2, the coherence values $R_{j,k}^{2}$ and phase differences $\varphi_{j,k}$ between the operating variables and PI at different decomposition scales and times can be obtained. The next step is to convert them into specific time delays at each scale, which can be used for the variable alignment operation before the prediction task. We propose the following method to achieve this goal. First, a threshold $R_{\mathrm{thre}}^{2}$ is set to filter out the positions where the coherence is less than $R_{\mathrm{thre}}^{2}$. Then, the average phase difference at a given scale is obtained by weighting the phase differences of all reserved positions at this scale by their coherence values, that is,
$$\bar{\varphi}_{j} = \frac{\sum_{k=1}^{K_j} R_{j,k}^{2}\,\varphi_{j,k}}{\sum_{k=1}^{K_j} R_{j,k}^{2}} \tag{18}$$

where $K_j$ is the number of reserved positions at scale $j$. Finally, the phase difference is converted into the corresponding time delay:

$$\Delta t_{j} = \frac{2^{j}\,\bar{\varphi}_{j}}{2\pi} \tag{19}$$
In addition, for practical industrial application, a minimum advance time step $\Delta t_{\min}$ must be set, meaning that predictions are made $\Delta t_{\min}$ steps in advance. To meet this requirement, the time delay $\Delta t_j$ is replaced with $\Delta t_{\min}$ if $\Delta t_j < \Delta t_{\min}$ and retained otherwise, yielding the final time delay values.
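The thresholding, coherence-weighted phase averaging, and minimum-advance substitution described above can be sketched compactly in NumPy; the threshold and minimum-advance values below are illustrative placeholders, not the tuned values used in the study:

```python
import numpy as np

def scale_delays(R2, phase, scales, r_thre=0.5, dt_min=5.0):
    """Convert coherence R2[j, k] and phase phase[j, k] into one delay
    per wavelet scale: drop low-coherence positions, take the
    coherence-weighted mean phase, convert it to a time lag via the
    scale's period, and enforce the minimum advance step dt_min."""
    delays = np.full(len(scales), dt_min, dtype=float)
    for j, s in enumerate(scales):
        mask = R2[j] >= r_thre                      # keep coherent positions only
        if mask.any():
            w, phi = R2[j][mask], phase[j][mask]
            phi_bar = np.sum(w * phi) / np.sum(w)   # weighted mean phase
            dt = s * phi_bar / (2 * np.pi)          # phase -> time delay
            delays[j] = max(dt, dt_min)             # enforce the advance requirement
    return delays
```

A scale with no position above the coherence threshold simply falls back to the minimum advance step.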

3. Prediction Methodology

3.1. Spike Separation

An arbitrary spike signal similar to the spike of PI data caused by periodic equipment switching of the heated air subsystem is provided, with a total length of 128 and a non-zero region ranging from 54 to 76. The original signal and the results of six-layer wavelet decomposition are shown in Figure 3.
When applying Equation (11) to perform the wavelet transform at each level, certain values of the displacement factor $k$ make the non-zero region of the wavelet function cover exactly a part of the zero region of the original signal. At these positions, the inner product operation yields a non-zero value, ultimately causing the transformed signal to diffuse outward relative to the original signal. Figure 3 shows the corresponding results.
The original PI signal can be decomposed into the sum of a relatively stationary signal and a spike signal:

$$x(t) = x_{\mathrm{stationary}}(t) + x_{\mathrm{spike}}(t) \tag{20}$$
According to Equation (8), the detail coefficients of the original signal can be represented as the sum of the detail coefficients of the two component signals:

$$\begin{aligned} d_1(t) &= \sum_{k=-\infty}^{\infty} \big\langle x(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) \\ &= \sum_{k=-\infty}^{\infty} \big\langle x_{\mathrm{stationary}}(t) + x_{\mathrm{spike}}(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) \\ &= \sum_{k=-\infty}^{\infty} \big\langle x_{\mathrm{stationary}}(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) + \sum_{k=-\infty}^{\infty} \big\langle x_{\mathrm{spike}}(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) \\ &= d_{1,\mathrm{stationary}}(t) + d_{1,\mathrm{spike}}(t) \end{aligned} \tag{21}$$
Assuming that the amplitude of the spike fluctuation is about $\lambda$ times the amplitude of the fluctuation of $\bar{x}_{\mathrm{stationary}}(t)$, where $\lambda$ is an amplification factor and $\bar{x}_{\mathrm{stationary}}(t)$ denotes $x_{\mathrm{stationary}}(t)$ with its mean removed, we can derive

$$d_{1,\mathrm{spike}}(t) = \sum_{k=-\infty}^{\infty} \big\langle x_{\mathrm{spike}}(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) \approx \lambda \sum_{k=-\infty}^{\infty} \big\langle \bar{x}_{\mathrm{stationary}}(t), \psi_{1,k}(t)\big\rangle\,\psi_{1,k}(t) = \lambda\, d_{1,\mathrm{stationary}}(t) \tag{22}$$
For the PI signal of the actual process, the value of $\lambda$ is generally around 10, so the shape and amplitude of the wavelet decomposition result of the original signal are dominated by the wavelet decomposition result of the spike signal, and the relatively stationary sequence is covered up. Since the information in the relatively stationary sequence deserves more attention than the information at the spikes, we need to separate the spikes. The process of spike separation is shown in Figure 4, and the specific steps are as follows:
Step 1: Convolve the original signal with a Gaussian function $\theta(t) = \exp\!\big(-t^{2}/(2\sigma^{2})\big)/\big(\sqrt{2\pi}\,\sigma\big)$ to obtain the convolved sequence $x_{\mathrm{conv}}$. The value of $\sigma$ should be chosen carefully: if $\sigma$ is too small, the smoothing effect is poor, while a large $\sigma$ leads to waveform distortion and phase shift. After experimental tuning, $\sigma = 2$ is selected.
Step 2: Compute the second-order difference of $x_{\mathrm{conv}}$ and set an appropriate threshold $y_{1}^{\mathrm{thre}}$ to filter out the peak extreme points $p$. Then, compute the first-order difference of $x_{\mathrm{conv}}$ and set an appropriate interval to obtain the nearest extreme point $q$ following each point $p$.
Step 3: Use the first-order difference of $x_{\mathrm{conv}}$ as the search sequence. Set another threshold $y_{2}^{\mathrm{thre}}$ to locate the first point before point $p$ that is less than $y_{2}^{\mathrm{thre}}$ as the starting point of the peak, and the first point after point $q$ that is less than $y_{2}^{\mathrm{thre}}$ as the ending point of the peak.
Step 4: Connect the starting and ending points of each peak with smooth curves that preserve the original trend while greatly reducing the amplitude; the resulting sequence is the stationary part of the original signal, and the remaining part is the separated peaks, thereby achieving the goal of spike separation.
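A simplified NumPy rendering of the four steps might look as follows. The threshold values, the guard width around a detected peak top, and the straight-line bridging are illustrative simplifications (the actual procedure uses tuned thresholds and smooth trend-preserving curves), and only upward spikes are detected here; downward spikes would use the symmetric curvature condition:

```python
import numpy as np

def separate_spikes(x, sigma=2.0, y1_thre=0.5, y2_thre=0.05):
    """Illustrative spike separation: returns (stationary, spikes)
    such that stationary + spikes reconstructs x exactly."""
    x = np.asarray(x, dtype=float)
    # Step 1: smooth with a Gaussian kernel (sigma = 2 as in the text).
    t = np.arange(-4 * int(sigma), 4 * int(sigma) + 1)
    g = np.exp(-t**2 / (2 * sigma**2))
    xc = np.convolve(x, g / g.sum(), mode="same")
    d1 = np.gradient(xc)                  # first difference
    d2 = np.gradient(d1)                  # second difference
    guard = int(2 * sigma)                # skip the near-flat region at the top
    stationary = x.copy()
    # Step 2: tops of upward spikes have strongly negative curvature.
    for p in np.where(d2 < -y1_thre)[0]:
        # Step 3: walk outwards until the slope magnitude drops below y2_thre.
        a = p
        while a > 0 and (p - a < guard or abs(d1[a]) > y2_thre):
            a -= 1
        b = p
        while b < len(x) - 1 and (b - p < guard or abs(d1[b]) > y2_thre):
            b += 1
        # Step 4: bridge the peak region (a straight line stands in for the
        # smooth trend-preserving curve described in the text).
        stationary[a:b + 1] = np.linspace(x[a], x[b], b - a + 1)
    return stationary, x - stationary
```

By construction the two returned parts always sum back to the original signal, so no information is lost by the separation.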
Figure 5 shows the comparison of PI series before and after spike separation. It can be seen that in the original series containing 22 hot blast stove switches, the operation of spike separation can effectively remove all peaks and maintain the stable changing trend of the sequence except for the peaks. The separated signal only contains the peaks caused by the switching of hot blast stove without any additional information.

3.2. Fusion Prediction Model

In this section, two algorithms, OSA and WNN, are applied and integrated through the Gauss–Markov estimation method to perform the prediction tasks. OSA is used to analyze the global time-domain information of data, while WNN can capture the local frequency-domain features more intensively. This fusion model can more comprehensively describe the nonlinearity and nonstationarity characteristics of PI.

3.2.1. Global Time-Domain: Orthonormal Subspace Analysis

Orthonormal Subspace Analysis (OSA) is a recent statistical learning algorithm first used in fault monitoring tasks [26]. It proposes a subspace orthogonalization method that decomposes process data $X$ and quality data $Y$ into three orthonormal subspaces (as shown in Equation (23)), overcoming the defects of canonical correlation analysis (CCA) and partial least squares (PLS) with respect to information leakage, model identification, and component selection. We now re-derive OSA to enable its application in prediction tasks:
$$\begin{aligned} X &= T_{\mathrm{com}}\,\Xi_{X}^{T} + E_{\mathrm{OSA}} = \hat{X}_{\mathrm{OSA}} + E_{\mathrm{OSA}} \\ Y &= T_{\mathrm{com}}\,\Xi_{Y}^{T} + F_{\mathrm{OSA}} = \hat{Y}_{\mathrm{OSA}} + F_{\mathrm{OSA}} \end{aligned} \tag{23}$$

where $T_{\mathrm{com}}$ is the common latent variable matrix shared by $X$ and $Y$, i.e., $T_{\mathrm{com}} = X\,\Xi_{X} = Y\,\Xi_{Y}$, and $\Xi_{X}$ and $\Xi_{Y}$ are the transformation matrices. $E_{\mathrm{OSA}}$ and $F_{\mathrm{OSA}}$ are the residual matrices of $X$ and $Y$, which contain no information of $T_{\mathrm{com}}$. Therefore, $X$ and $Y$ can be divided into the following three subspaces: (1) the common component subspace, i.e., $\hat{X}_{\mathrm{OSA}}$ and $\hat{Y}_{\mathrm{OSA}}$; (2) the residual subspace of $X$, i.e., $E_{\mathrm{OSA}}$; and (3) the residual subspace of $Y$, i.e., $F_{\mathrm{OSA}}$. It can be proven that these three subspaces are mutually orthogonal. Therefore, $\hat{X}_{\mathrm{OSA}}$ and $\hat{Y}_{\mathrm{OSA}}$ can be written as follows:

$$\begin{aligned} \hat{X}_{\mathrm{OSA}} &= T_{\mathrm{com}}\,\Xi_{X}^{T} = Y\,\Xi_{Y}\,\Xi_{X}^{T} = Y\,\Theta_{Y\hat{X}} \\ \hat{Y}_{\mathrm{OSA}} &= T_{\mathrm{com}}\,\Xi_{Y}^{T} = X\,\Xi_{X}\,\Xi_{Y}^{T} = X\,\Theta_{X\hat{Y}} \end{aligned} \tag{24}$$

where $\Theta_{X\hat{Y}}$ represents the transformation matrix from $X$ to $\hat{Y}$, and $\Theta_{Y\hat{X}}$ represents the transformation matrix from $Y$ to $\hat{X}$. Since $E_{\mathrm{OSA}}$ is orthogonal to $\hat{Y}_{\mathrm{OSA}}$ and $F_{\mathrm{OSA}}$, and $F_{\mathrm{OSA}}$ is orthogonal to $\hat{X}_{\mathrm{OSA}}$ and $E_{\mathrm{OSA}}$, we can obtain

$$Y^{T} E_{\mathrm{OSA}} = 0,\quad X^{T} F_{\mathrm{OSA}} = 0 \;\Longrightarrow\; Y^{T}\big(X - Y\,\Theta_{Y\hat{X}}\big) = 0,\quad X^{T}\big(Y - X\,\Theta_{X\hat{Y}}\big) = 0 \tag{25}$$
Then all the required transformation matrices can be derived as follows:
$$\Theta_{Y\hat{X}} = \big(Y^{T}Y\big)^{-1} Y^{T} X,\quad \Theta_{X\hat{Y}} = \big(X^{T}X\big)^{-1} X^{T} Y,\quad \Theta_{X\hat{X}} = \big(X^{T}X\big)^{-1} X^{T} \hat{X}_{\mathrm{OSA}} \tag{26}$$

$\Theta_{X\hat{X}}$ is needed for the situation where new quality data $y$ may not be available. Principal components are then extracted using Equation (27), since the observation variables in $\hat{X}_{\mathrm{OSA}}$ may be highly linearly correlated:

$$\hat{X}_{\mathrm{OSA}} = V_{\mathrm{com}}\,U_{\mathrm{com}}^{T} + E_{\mathrm{res}},\quad V_{\mathrm{com}} = \hat{X}_{\mathrm{OSA}}\,U_{\mathrm{com}} \tag{27}$$

where $V_{\mathrm{com}}$ denotes the score matrix, $U_{\mathrm{com}}$ the loading matrix, and $E_{\mathrm{res}}$ the residual matrix. Accordingly, the transformation matrices from $X$ to $V_{\mathrm{com}}$ and from $Y$ to $V_{\mathrm{com}}$ can be derived as follows:

$$\Theta_{XV} = \Theta_{X\hat{X}}\,U_{\mathrm{com}},\quad \Theta_{YV} = \Theta_{Y\hat{X}}\,U_{\mathrm{com}} \tag{28}$$

Subsequently, the quality data $Y_{\mathrm{new}}$ can be predicted from new process data $X_{\mathrm{new}}$ using Equation (29):

$$Y_{\mathrm{predict}} = X_{\mathrm{new}}\,\Theta_{XV}\,\Theta_{YV}^{-1} \tag{29}$$

3.2.2. Local Frequency-Domain: Wavelet Neural Network

The wavelet neural network (WNN) is a variant of BPNN [27]. It replaces the traditional activation functions used by hidden layer neurons in BPNN with the Morlet wavelet function as shown in the following equation:
$$\psi(t) = \cos(1.75\,t)\,\exp\!\left(-\frac{t^{2}}{2}\right) \tag{30}$$
The narrow bandwidth of the Morlet wavelet function in frequency domain provides it with high-frequency resolution, endowing WNN with excellent capability to process high-frequency local nonstationary signals. Similar to Equation (5), the output value of hidden layer neurons is correspondingly calculated as
$$h(n) = \psi\!\left(\frac{\sum_{m=1}^{M}\omega_{mn}\,x_{m} - b_{n}}{a_{n}}\right) \tag{31}$$

where $h(n)$ is the output value of the $n$th node in the hidden layer, $\omega_{mn}$ denotes the weights connecting the input layer and the hidden layer, $x_m$ is the output value of the $m$th node in the input layer, and $a_n$ and $b_n$ are the scale factor and displacement factor of the $n$th hidden-layer node, respectively.
WNN inherits the advantages of wavelet transform and BPNN. Through introducing scale factor a n and displacement factor b n , which are adjusted by adopting the same error back propagation algorithm as neural network weights, WNN can perform multiscale analysis on signals using time windows of different widths and positions, thereby highlighting the details of the problem to be processed, effectively extracting local information of the signals, and avoiding nonlinear optimization problems such as local optima [28].
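A minimal sketch of the Morlet-activated hidden layer described above (the function names and array shapes are our assumptions):

```python
import numpy as np

def morlet(t):
    # Morlet wavelet activation: psi(t) = cos(1.75 t) exp(-t^2 / 2)
    return np.cos(1.75 * t) * np.exp(-t ** 2 / 2)

def wnn_hidden(x, W, a, b):
    """Hidden-layer outputs h(n) = psi((sum_m w_mn x_m - b_n) / a_n).
    x: (M,) inputs; W: (M, N) input-to-hidden weights;
    a, b: (N,) per-node scale and displacement factors."""
    return morlet((x @ W - b) / a)
```

In a full WNN, `W`, `a`, and `b` would all be trained by back-propagation, exactly as described in the text.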

3.2.3. Model Fusion: Gauss–Markov Estimation

After obtaining the prediction results of the OSA and WNN models at different scales, they must be fused into the final model output. The traditional approach assigns the same weight to every submodel, which can lead to error accumulation. Here, the ultimate prediction accuracy is improved through model fusion by Gauss–Markov estimation, which is essentially a special form of weighted least squares [29]. By weighting submodels according to their error variances, the error variance of the final output is made smaller than that of any single submodel.
Assume there are $C$ prediction submodels. The estimated output of the $i$th submodel is $\hat{y}_i$, with prediction error $v_i$ and error variance $\sigma_i^2$, and the actual value is $y$; hence $\hat{y}_i = y + v_i$. The Gauss–Markov theorem assumes that the errors $v_i$ are mutually independent with zero mean. To minimize the error variance of the estimated final output $\hat{y}$, the weighted variance estimation index is constructed as
$$J = \big( \hat{Y} - G \hat{y} \big)^T R^{-1} \big( \hat{Y} - G \hat{y} \big)$$
where $\hat{Y} = [\hat{y}_1, \ldots, \hat{y}_C]^T$, $G$ is a $C \times 1$ vector with all elements equal to 1, and $R = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_C^2)$. The estimated value $\hat{y}$ can be calculated as
$$\hat{y} = \big( G^T R^{-1} G \big)^{-1} G^T R^{-1} \hat{Y} = \left( \sum_{i=1}^{C} \frac{1}{\sigma_i^2} \right)^{-1} \sum_{i=1}^{C} \frac{1}{\sigma_i^2} \hat{y}_i$$
and the corresponding error variance $\hat{R}$ can be derived as
$$\hat{R} = E\big[ (\hat{y} - y)^2 \big] = \big( G^T R^{-1} G \big)^{-1} G^T R^{-1} E\big[ (\hat{Y} - G y)(\hat{Y} - G y)^T \big] R^{-1} G \big( G^T R^{-1} G \big)^{-1} = \big( G^T R^{-1} G \big)^{-1} G^T R^{-1} R R^{-1} G \big( G^T R^{-1} G \big)^{-1} = \big( G^T R^{-1} G \big)^{-1} = \left( \sum_{i=1}^{C} \frac{1}{\sigma_i^2} \right)^{-1} < \sigma_i^2, \quad i = 1, 2, \ldots, C$$
Equation (34) shows that Gauss–Markov estimation can reduce the error variance of the final output results in the process of model integration and thus improve the prediction performance of the overall model.
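The fusion rule of Equations (33) and (34) reduces to inverse-variance weighting, which can be sketched as:

```python
import numpy as np

def gauss_markov_fuse(y_hats, variances):
    """Inverse-variance weighted fusion of C submodel outputs."""
    w = 1.0 / np.asarray(variances, dtype=float)            # weights 1/sigma_i^2
    y_fused = np.sum(w * np.asarray(y_hats, dtype=float)) / np.sum(w)
    var_fused = 1.0 / np.sum(w)                             # (sum_i 1/sigma_i^2)^{-1}
    return y_fused, var_fused
```

For example, fusing two estimates 1.0 and 3.0 with equal unit variances yields 2.0 with fused variance 0.5, smaller than either submodel's variance, as Equation (34) guarantees.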

3.3. The Proposed Prediction Framework

The basic architecture of our proposed PI prediction model is shown in Figure 6. The modeling process is as follows:
Step 1: Determine the input and output variables of the model based on the operational mechanism of BFP.
Step 2: Perform de-noising and spike separation on all variables to obtain their respective stationary and peak parts.
Step 3: Select an appropriate number of wavelet decomposition levels $n$, then apply wavelet decomposition to the stationary part of each variable to obtain $n$ detail components and one approximation component.
Step 4: At each decomposition level, extract the decomposition sequences of all variables belonging to that level and align them according to the time delay $\Delta t_j$ obtained from wavelet coherence analysis. Then feed the aligned data into OSA and WNN for model training, and fuse the outputs by Gauss–Markov estimation. Perform the same operations on the peak parts separated in Step 2.
Step 5: Use the well-trained model to conduct prediction at the different scales with new input data, and sum all outputs to obtain the final predicted value of PI.
The selection of the wavelet decomposition level is crucial in this model. Since the signal $x(t)$ can be decomposed into the form of Equation (12), and considering the orthonormality of the wavelets $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$, Parseval's theorem gives
$$\| x(t) \|^2 = \| s_J(t) \|^2 + \sum_{j=1}^{J} \| d_j(t) \|^2$$
Therefore, $\| d_j(t) \|^2$ represents the energy contribution of changes on scale $\tau_j$ to $x(t)$. The values of $\| d_j(t) \|^2$ at different decomposition levels are calculated to obtain their respective energy ratios, as shown in Figure 7. After decomposition to the ninth level, the energy contribution of the corresponding detail coefficients decreases significantly, so further decomposition offers no additional insight; the number of wavelet decomposition levels is therefore set to 9. Figure 7 also shows that after spike separation, the highest energy ratio decreases and the detail-coefficient energy is less concentrated in particular levels, indicating that this operation effectively separates the impact of hot blast subsystem switching from the normal operation sequence. In addition, the commonly used Daubechies wavelets are chosen for their good regularity and compact support.
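The level-selection criterion can be illustrated with a toy decomposition. The sketch below uses the orthonormal Haar wavelet for brevity instead of the Daubechies family used in the paper; it computes the per-level detail energy ratios whose decay motivates stopping the decomposition:

```python
import numpy as np

def haar_dwt(x, levels):
    """Multilevel orthonormal Haar DWT (a simple stand-in for the
    Daubechies wavelets used in the paper)."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        a = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
        d = (approx[0::2] - approx[1::2]) / np.sqrt(2.0)
        details.append(d)   # detail coefficients d_j at this level
        approx = a          # approximation passed to the next level
    return approx, details

def detail_energy_ratios(x, levels):
    """Energy share ||d_j||^2 / ||x||^2 per level; levels beyond which
    these ratios become negligible need not be analyzed further."""
    _, details = haar_dwt(x, levels)
    total = float(np.sum(np.asarray(x, dtype=float) ** 2))
    return [float(np.sum(d ** 2)) / total for d in details]
```

Because the transform is orthonormal, the approximation and detail energies sum exactly to the signal energy, mirroring the Parseval identity above.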
The selection of the other important model parameters, including the number of hidden-layer units $m$ in WNN and the wavelet coherence threshold $R_{thre}^2$, is discussed in detail in Section 4.

4. Experimental Studies

This section discusses the results of applying our proposed model to PI prediction tasks on field data from an actual BFP. We gathered operation data of the #2 blast furnace of Liuzhou Steel Group Co., Ltd. in Guangxi Province, China, in March 2021. Based on the operational mechanism analysis of BFP in Section 1, the available observation variables most relevant to PI, spanning different impact time scales, are selected as the model inputs: cold blast flow rate (CBFR), oxygen enrichment rate (OER), blast kinetic energy (BKE), bosh gas index (BGI), total pressure drop (TPD), hot blast pressure (HBP), hot blast temperature (HBT), and top temperature (TT). They belong to different operating systems of the blast furnace and cover impact time scales from seconds to hours. These process data are stored in SQL Server and collected on-site with a sampling interval of 10 s. PI is the output variable of the model. A total of 8192 data points are used for the simulation experiments, of which 80% are used for training the prediction model and the rest serve as test data. All variables are de-noised with the Daubechies 4 wavelet and normalized before being fed into the model. Detailed information on the observation data is given in Table 2.
To evaluate the modeling effectiveness of our method, the root mean square error (RMSE), correlation coefficient ($r$), and coefficient of determination ($R^2$) are used as performance criteria. RMSE measures the average deviation between predicted and actual values; the closer it is to 0, the more accurate the model. $r$ describes the degree of linear correlation between predicted and actual values, while $R^2$ reflects the model's goodness of fit; the closer $r$ and $R^2$ are to 1, the closer the predictions are to the actual values overall. They are calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
$$r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
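These three criteria transcribe directly into NumPy, for instance:

```python
import numpy as np

def evaluate(y, y_hat):
    """RMSE, correlation coefficient r, and coefficient of determination R^2."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    r = np.corrcoef(y, y_hat)[0, 1]
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return rmse, r, r2
```

Note that $r$ is insensitive to a constant offset between prediction and truth, whereas RMSE and $R^2$ both penalize it; reporting all three therefore gives a more complete picture.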

4.1. Time Delay Analysis

Table 3 compares the time delays between the 8 input variables and PI at different wavelet decomposition levels, calculated using our method, the Pearson correlation coefficient (PCC), and the mutual information (MI) method. Different operating variables act on different parts of the blast furnace and involve different physical or chemical processes, each with its characteristic time scale. Since no ground-truth measurements of the time delays exist, we conduct a qualitative analysis based on practical conditions and expert knowledge to elucidate the physical interpretation of the extracted multiscale time delays.
The smaller numbers in the table (below 50) represent shorter delays of a few minutes. These are mainly related to the blast furnace air supply system and the instantaneous dynamics of the gas. Variables such as CBFR, BKE, BGI, and HBP directly affect the distribution and pressure of the ascending gas. A change in these parameters can almost instantaneously alter the gas flow pattern and pressure drop across the cohesive zone and dry zone, impacting PI within minutes. This is analogous to changing the pressure in a pneumatic pipe, where the effect is nearly immediate.
The larger values in the table (above 50) represent longer delays, ranging from tens of minutes to hours. These delays correspond to the thermal and chemical processes governing the softening and melting zone. Operational variables such as OER and HBT modify the adiabatic flame temperature and the size of the combustion zone. This, in turn, alters the shape, thickness, and location of the cohesive zone, the primary resistance to gas flow in the furnace. This thermal inertia and the subsequent change in the zone's structure, however, require a longer time to manifest fully in the overall permeability.
With this physical context, the results in Table 3 can be meaningfully interpreted. As a calculated variable, PI should lag behind all variables collected on site in both trend and phase. In our method, all delays are positive, meaning the input variables lead PI, which aligns with physical causality. In contrast, the mixed positive and negative delays from PCC and MI are physically implausible. For the different variables, noise dominates the first two wavelet decomposition sequences; random fluctuation noise carries no meaningful relative delay, and our method reflects this. PCC and MI, however, report inappropriately large time delays at the first two decomposition levels, which contradicts physical intuition.
Furthermore, our results are consistent with the expected impact scales. Variables related to the gas supply (CBFR, BKE, BGI, TPD, and HBP) show significant delays within ten minutes (e.g., at levels 3–7), while those influencing the thermal state (OER, HBT, and TT) exhibit major delays in the 30 min to 1 h range (e.g., at levels 8–9). This clear separation of influence timescales is absent in the PCC and MI results, which show erratic and counter-intuitive delays (e.g., extremely large values at low decomposition levels dominated by noise). In summary, the multiscale time delays extracted by our proposed method provide a physically meaningful and accurate representation of the diverse dynamic processes inside the blast furnace.
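For intuition, a simplified delay extractor restricted to positive lags can be sketched as follows. It uses plain cross-correlation on a (decomposed) sequence pair rather than the wavelet-coherence phase used in our method, and the function name is ours:

```python
import numpy as np

def estimate_delay(u, y, max_lag):
    """Lag (in samples) at which input u best leads target y.
    Only non-negative lags are scanned, encoding the physical constraint
    that operating variables must precede PI. This is a simplified
    cross-correlation stand-in for the wavelet-coherence phase delay."""
    u = (u - u.mean()) / u.std()
    y = (y - y.mean()) / y.std()
    best_lag, best_corr = 0, -np.inf
    for k in range(max_lag + 1):
        c = np.corrcoef(u[:len(u) - k], y[k:])[0, 1]  # u leads y by k samples
        if c > best_corr:
            best_lag, best_corr = k, c
    return best_lag
```

Applied per decomposition level, such an estimator would return one lag per scale, which is exactly the form of the multiscale delay table discussed above.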

4.2. Prediction Accuracy Analysis

To rigorously assess the performance of our model and validate its superiority over existing approaches, we conduct four sets of experiments for comprehensive comparison. Table 4 shows the parameter settings of each model and the PI prediction results at different minimum advance time steps $\Delta t_{min}$. Group A contains machine learning methods without neural networks, Group B contains commonly used neural network methods, Group C contains published methods for PI prediction, and Group D is an ablation study of our model. All results are averaged over 20 runs.
To ensure a fair comparison across all benchmark methods, optimal model parameters are determined through a combination of literature-guided initialization and data-driven cross-validation. For traditional machine learning models in Group A, the optimal hyperparameters are determined via 5-fold cross-validation on the training set. For PLS and OSA, the number of latent variables is determined by the 90% Cumulative Percent Variance (CPV) criterion, a standard threshold to retain sufficient process information while avoiding overfitting. For KPLS, a polynomial kernel is adopted, with the order optimized to 2 and the constant term to 2 via 5-fold cross-validation. For LS-SVM, the kernel parameter γ for the RBF kernel is set to 10, calibrated to minimize the prediction error during cross-validation. For neural network-based models in Group B, a combination of grid search and hold-out validation is employed. The number of hidden units for GRU and LSTM is optimized to 32, and the learning rate is set to 0.001. The network structure for MLP is set to [8, 4] through cross-validation. The parameters for methods from the literature in Group C are set according to their original publications, with any unspecified parameters optimized using a compatible strategy. The final hyperparameter configurations for all models, which yield the best performance on the validation set, are summarized in Table 4.
In the case of $\Delta t_{min} = 0$, PI equals CBFR divided by TPD at the same acquisition time. This calculation does not exhibit strong nonlinearity, so the linear methods PLS and OSA both deliver relatively good prediction performance. Because OSA forcibly partitions the data into mutually orthogonal output-related and residual subspaces, it overcomes the problem of output-related information leaking into the residual space and outperforms PLS. Compared with the RBF kernel, KPLS with a polynomial kernel performs better, suggesting that the PI data are low-dimensional in this case and that infinite-dimensional Gaussian kernels are more prone to overfitting; the slight improvement of KPLS over PLS also supports this point. LS-SVM likewise overfits (RMSE = 0.042 on the training set versus 0.735 on the test set), as it is better suited to small-sample data than to the distribution of our dataset.
The neural network methods achieve better experimental performance. For ELM, a single-layer feedforward network, introducing kernel functions improves its ability to handle nonlinearity, but it is prone to overfitting in weakly nonlinear scenarios; by contrast, the simple MLP model obtains better results. The deep learning models GRU and LSTM achieve the best prediction performance, since their recurrent cells improve the handling of long time-series prediction tasks.
For Group C, since the original publications did not consider hot blast stove switching, our data exhibit stronger nonstationarity than theirs, and their methods cannot perform at their best on our dataset. W-LS-SVM [17] suffers the same overfitting problem as LS-SVM; even with wavelet decomposition added for multiscale analysis, its performance remains poor. The two improved ELM-based methods have similar effects: compared to w-PCA-ML-ELM [18], adding the ALD criterion and sliding windows for data filtering slightly improves ALD-KOS-ELMF [30]. Since WNN only changes the activation function relative to an ANN, the single-hidden-layer CM-WNN [19] does not match the multi-layer MLP. Owing to the addition of modal decomposition and optimization on top of the MLP model, VMD-PSO-BP [20] performs best in Group C.
For Group D, we compare our full method against variants with any one module removed. Each module contributes to the model's effectiveness. Spike separation reduces the nonstationarity of the data and enhances the effect of wavelet decomposition, thereby improving prediction performance. OSA and WNN complement each other, covering the linear and nonlinear aspects of the prediction. The delay analysis module introduces additional intrinsic delay information, improving the interpretability, robustness, and prediction performance of the model. However, when $\Delta t_{min} = 0$, the information contained in the input variables is already rich and complete for our model and for strong baselines such as GRU, LSTM, and OSA; in this case, wavelet decomposition actually disrupts the original structure and information implied in the data, slightly degrading prediction performance.
The situation changes when $\Delta t_{min} \neq 0$. Table 4 lists the experimental results for $\Delta t_{min} = 5$ and 10. Here, CBFR, TPD, and the PI to be predicted no longer have a clear algebraic relationship. Moreover, because the air supply system acts on a small time scale, any time delay markedly weakens the correlation between variables. Accordingly, the RMSE values of Groups A, B, and C all rise significantly as the delay increases, with GRU performing best among them: when $\Delta t_{min}$ changes from 0 to 10, its RMSE increases by only 0.241, and its $R^2$ and $r$ values remain high. However, for $\Delta t_{min} = 5$ and 10, our method achieves the best RMSE, $R^2$, and $r$, and the smallest RMSE increase (only 0.109), reflecting its applicability to scenarios requiring multi-step advance prediction. The RMSE gap between the models with and without the delay analysis module gradually widens (from 0.032 to 0.038 and then to 0.072), indicating that as the correlation between variables weakens with increasing $\Delta t_{min}$, the long-time-scale delay information extracted at different decomposition scales compensates for this decline and slows the degradation of prediction performance. Figure 8 compares the variation of RMSE with $\Delta t_{min}$ for the different methods. For all methods except GRU and ours, the slope in the second half is visibly larger than in the first half, showing that those models deteriorate faster as the time delay grows. In contrast, our model traces a nearly straight line with the smallest slope, meaning it maintains good performance across time delays.
When Δ t m i n increases, the performance of other models declines significantly because they lack an understanding of the inherent causal time lags between variables. In contrast, our model, by explicitly leveraging multiscale time delay information, can ’anticipate’ changes caused by slow processes (such as cohesive zone variations) earlier. As a result, its prediction performance decays the slowest, which is crucial in industrial scenarios that require long-term early warning.
A more in-depth comparison between our method and GRU, the best-performing of the other methods, is shown in Figure 9. In stages where the data trend is relatively stable (such as acquisition periods 600–800 and 1000–1200), GRU fits well at different time delays. However, when the data are less stable (the purple-shaded regions in the figure), the actual data fluctuate significantly, and for large $\Delta t_{min}$ the relevant multi-resolution information is hidden in the data. Without prior knowledge to guide it, GRU struggles to mine this information and is susceptible to irrelevant information, leading to overfitting and worse predictions. In contrast, thanks to its multi-resolution and multi-time-scale analysis capabilities, our method predicts the trend of PI well at any time delay, even in periods when the process is not sufficiently stationary.
A key concern for industrial models is their performance across varying operating conditions. Although developed on data from one blast furnace, our model is rigorously tested against inherent regime shifts on datasets from different time periods. We provide the prediction performance of the proposed model on another production dataset from August 2021, as shown in Figure 10. Our model’s prediction stability shown in Figure 9 and Figure 10 demonstrates its robustness under different operating modes. This robustness stems from our methodology’s core focus on multiscale delay extraction and time–frequency domain modeling, which are universal challenges in blast furnace operation. Therefore, the model exhibits strong potential for transferability to other furnaces, as the methodology is designed to learn the specific dynamics of any given dataset.

4.3. Model Parameter Selection

We study the prediction performance of our model for different values of the number of hidden-layer units $m$ in WNN and the wavelet coherence threshold $R_{thre}^2$ described in Section 3.3, at $\Delta t_{min} = 0$, as shown in Figure 11. As $m$ increases, the prediction error first decreases and then increases, reaching its minimum at $m = 11$. This is easy to understand: increasing $m$ initially improves the fitting ability of the network, but beyond a certain value the added complexity causes overfitting and a drop in test-set performance. Therefore, $m$ is set to 11. As $R_{thre}^2$ increases, the decrease in RMSE passes through three plateaus (0.5–0.6, 0.65–0.85, and 0.9–0.95), reaching its minimum at $R_{thre}^2 = 0.95$. During this process, extraneous information is progressively filtered out, leaving information more relevant to PI and more accurate intrinsic time delays. In practical applications, we find that for different $\Delta t_{min}$, the results differ little between $R_{thre}^2 = 0.9$ and 0.95, with either occasionally performing better; we mainly use $R_{thre}^2 = 0.95$ in the experiments.

4.4. Computational Efficiency Analysis

To analyze the computational efficiency of our model, we compare both the training and inference times of the models with better prediction performance on an Intel(R) Core(TM) i9-10900K CPU with 32 GB RAM, as shown in Table 5. The training time of our model is 30.70 s, which is competitive with other complex models. Because our model is based on OSA combined with WNN and uses few WNN hidden-layer nodes, its computation time remains well controlled even with nine wavelet decomposition levels. More critically for practical application, the inference time of our model on the test set of 1638 data points is 35.74 milliseconds. This high prediction speed is pivotal for real-time industrial deployment: given the 10 s data acquisition interval in our industrial case, our model completes a prediction in less than 0.36% of the sampling interval, demonstrating a strong capability for real-time monitoring and leaving ample computational headroom for timely operational adjustments.
In contrast, while models like OSA and w-PCA-ML-ELM exhibit lower inference times, their prediction accuracy, especially for multi-step ahead forecasting as demonstrated in Table 4, is significantly inferior to our method. On the other hand, deep learning models such as GRU and LSTM, despite their high accuracy, suffer from excessively high inference times (over 3.8 s), which could become a bottleneck in a continuous real-time monitoring system. Therefore, the proposed framework achieves an excellent balance between prediction accuracy and computational efficiency, making it highly feasible for real-time deployment in industrial environments where both timeliness and precision are important.

5. Conclusions

Advance PI prediction plays an important role in monitoring and adjusting the internal production status of blast furnaces. In this article, a PI prediction model that explores the inherent time delay characteristics between variables is proposed. Firstly, a novel delay extraction approach is developed that, for the first time, focuses on the process dynamics between the observed variables and PI to extract time delay information at different time scales, overcoming the limitation of traditional correlation analysis in noisy environments. Secondly, to handle data nonstationarity and nonlinearity, OSA is transferred to prediction tasks and integrated with WNN, providing time–frequency multi-resolution feature extraction at both global and local scales. The collaborative framework based on Gauss–Markov estimation between the modules reduces the risk of overfitting, ensuring the reliability and accuracy of the model. Finally, the effectiveness of the proposed model in extracting the time delay characteristics between variables is demonstrated through mechanism analysis and practical blast furnace studies. Compared with traditional machine learning and deep learning methods, the proposed model provides the best advance prediction performance for PI. One limitation of this study is the exclusion of raw material properties, which could introduce prediction deviations during quality fluctuations. Future research will incorporate these data to enhance the comprehensiveness of the model, continuously optimize training time and prediction performance for zero advance time, and extend this work to other valuable fields. A key next step is developing an online learning version of the framework to enable adaptive model updating, which is crucial for handling slow process drifts and ensuring long-term practical utility.
Furthermore, the proposed methodology, with its core components of multiscale delay extraction and hybrid modeling, demonstrates strong generality. It holds significant potential for being applied to predict other critical blast furnace indices, such as silicon content in hot metal, thereby facilitating a more comprehensive and intelligent operational system.

Author Contributions

Conceptualization, Y.X. and C.Y.; methodology, Y.X.; software, Y.X.; validation, Y.X. and S.L.; formal analysis, Y.X.; investigation, Y.X. and S.L.; resources, Y.X. and C.Y.; data curation, Y.X. and S.L.; writing—original draft preparation, Y.X.; writing—review and editing, C.Y. and S.L.; visualization, Y.X.; supervision, C.Y. and S.L.; project administration, C.Y.; funding acquisition, C.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Authors gratefully acknowledge the support by Pioneer Research and Development Program of Zhejiang (No. 2025C01021), Zhejiang Province Postdoctoral Research Project Selection Fund (No. ZJ2025061), National Natural Science Foundation of China (No. 62394341) and the Fundamental Research Funds for the Central Universities (No. 226202400182).

Data Availability Statement

The original contributions presented in this study are included in this article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, X.; Kano, M.; Matsuzaki, S. A comparative study of deep and shallow predictive techniques for hot metal temperature prediction in blast furnace ironmaking. Comput. Chem. Eng. 2019, 130, 106575.
  2. Zhou, P.; Li, W.; Wang, H.; Li, M.; Chai, T. Robust online sequential RVFLNs for data modeling of dynamic time-varying systems with application of an ironmaking blast furnace. IEEE Trans. Cybern. 2019, 50, 4783–4795.
  3. Li, J.; Yang, C.; Li, Y.; Xie, S. A context-aware enhanced GRU network with feature-temporal attention for prediction of silicon content in hot metal. IEEE Trans. Ind. Inform. 2021, 18, 6631–6641.
  4. Jiang, D.; Wang, Z.; Li, K.; Zhang, J.; Zhang, S. Machine Learning Models for Predicting and Controlling the Pressure Difference of Blast Furnace. JOM 2023, 75, 4550–4561.
  5. Pavlov, A.; Onorin, O.; Spirin, N.; Polinov, A. MMK blast furnace operation with a high proportion of pellets in a charge. Part 1. Metallurgist 2016, 60, 581–588.
  6. Das, K.; Agrawal, A.; Reddy, A.; Ramna, R. Factsage studies to identify the optimum slag regime for blast furnace operation. Trans. Indian Inst. Met. 2021, 74, 419–428.
  7. Tonkikh, D.; Karikov, S.; Tarakanov, A.; Koval’chik, R.; Kostomarov, A. Improving the charging and blast regimes on blast furnaces at the Azovstal metallurgical combine. Metallurgist 2014, 57, 797–803.
  8. Li, W.; An, J.; Chen, X.; Wang, Q. Multi-Time Scale Analysis of The Influence of Blast Furnace Operations on Permeability Index. In Proceedings of the 2022 41st Chinese Control Conference (CCC), Hefei, China, 25–27 July 2022; pp. 2574–2579.
  9. Huang, X.; Yang, C. Pretrained Language–Knowledge Graph Model Benefits Both Knowledge Graph Completion and Industrial Tasks: Taking the Blast Furnace Ironmaking Process as an Example. Electronics 2024, 13, 845.
  10. Zhang, H.; Shang, J.; Zhang, J.; Yang, C. Nonstationary process monitoring for blast furnaces based on consistent trend feature analysis. IEEE Trans. Control Syst. Technol. 2021, 30, 1257–1267.
  11. Abhale, P.B.; Viswanathan, N.N.; Saxén, H. Numerical modelling of blast furnace–Evolution and recent trends. Miner. Process Extr. Metall. 2020, 129, 166–183.
  12. Zhou, C.; Tang, G.; Wang, J.; Fu, D.; Okosun, T.; Silaen, A.; Wu, B. Comprehensive numerical modeling of the blast furnace ironmaking process. JOM 2016, 68, 1353–1362.
  13. Li, J.; Zhu, R.; Zhou, P.; Song, Y.P.; Zhou, C.Q. Prediction of the cohesive zone in a blast furnace by integrating CFD and SVM modelling. Ironmak. Steelmak. 2021, 48, 284–291.
  14. Roeplal, R.; Pang, Y.; Adema, A.; van der Stel, J.; Schott, D. Modelling of phenomena affecting blast furnace burden permeability using the Discrete Element Method (DEM)—A review. Powder Technol. 2023, 415, 118161.
  15. Santana, E.R.; Pozzetti, G.; Peters, B. Application of a dual-grid multiscale CFD-DEM coupling method to model the raceway dynamics in packed bed reactors. Chem. Eng. Sci. 2019, 205, 46–57.
  16. Luo, Y.; Zhang, X.; Kano, M.; Deng, L.; Yang, C.; Song, Z. Data-driven soft sensors in blast furnace ironmaking: A survey. Front. Inform. Technol. Elect. Eng. 2023, 24, 327–354.
  17. Liang, D.; Bai, C.; Shi, H.; Dong, J. Research on intellectual prediction for permeability index of blast furnace. In Proceedings of the 2009 WRI Global Congress on Intelligent Systems, Xiamen, China, 19–21 May 2009; pp. 299–303.
  18. Su, X.; Zhang, S.; Yin, Y.; Xiao, W. Prediction model of permeability index for blast furnace based on the improved multi-layer extreme learning machine and wavelet transform. J. Frankl. Inst. 2018, 355, 1663–1691.
  19. Tan, K.; Li, Z.; Han, Y.; Qi, X.; Wang, W. Research and Application of Coupled Mechanism and Data-Driven Prediction of Blast Furnace Permeability Index. Appl. Sci. 2023, 13, 9556.
  20. Liu, X.; Zhang, Y.; Li, X.; Zhang, Z.; Li, H.; Liu, R.; Chen, S. Prediction for permeability index of blast furnace based on VMD–PSO–BP model. J. Iron Steel Res. Int. 2024, 31, 573–583.
  21. Diniz, A.P.M.; Côco, K.F.; Gomes, F.S.V.; Salles, J.L.F. Forecasting model of silicon content in molten iron using wavelet decomposition and artificial neural networks. Metals 2021, 11, 1001.
  22. Guo, T.; Zhang, T.; Lim, E.; Lopez-Benitez, M.; Ma, F.; Yu, L. A review of wavelet analysis and its applications: Challenges and opportunities. IEEE Access 2022, 10, 58869–58903.
  23. Zhang, L.; Tan, H.; Wang, Z. Interference response prediction of receiver based on wavelet transform and a temporal convolution network. Electronics 2023, 13, 162.
  24. Torrence, C.; Compo, G.P. A practical guide to wavelet analysis. Bull. Am. Meteorol. Soc. 1998, 79, 61–78.
  25. Grinsted, A.; Moore, J.C.; Jevrejeva, S. Application of the cross wavelet transform and wavelet coherence to geophysical time series. Nonlinear Process Geophys. 2004, 11, 561–566.
  26. Lou, Z.; Wang, Y.; Si, Y.; Lu, S. A novel multivariate statistical process monitoring algorithm: Orthonormal subspace analysis. Automatica 2022, 138, 110148.
  27. Gao, X. A comparative research on wavelet neural networks. In Proceedings of the 9th International Conference on Neural Information Processing, Singapore, 18–22 November 2002; pp. 1699–1703.
  28. Khalifa, A.; Yadav, Y. Wavelet-Based Fusion for Image Steganography Using Deep Convolutional Neural Networks. Electronics 2025, 14, 2758.
  29. Buonocore, A.; Caputo, L.; Nobile, A.G.; Pirozzi, E. Gauss–Markov processes in the presence of a reflecting boundary and applications in neuronal models. Appl. Math. Comput. 2014, 232, 799–809.
  30. Liu, S.-X.; Zhang, S.; Sun, S.-L.; Yin, Y.-X. A novel permeability index prediction method for blast furnace based on the improved extreme learning machine. In Proceedings of the 2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Beijing, China, 19–20 November 2022; pp. 1–6.
Figure 1. Schematic of the blast furnace production (BFP) and its nonstationary data characteristics. (a) The basic operating systems of BFP, including the burden charging, blast supply, thermal, and slag systems. (b) Time-series profiles of normalized process variables, including (1) total pressure drop, (2) hot blast temperature, (3) top temperature, (4) hot blast pressure, (5) oxygen enrichment rate, (6) permeability index, and (7) cold blast flow rate.
Figure 2. Wavelet coherence between blast temperature and permeability index series.
Figure 3. Wavelet analysis of spike signal.
Figure 4. Schematic diagram of spike separation process.
Figure 5. Permeability index (a) before spike separation, (b) after spike separation, and (c) separated peak signal.
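The spike separation illustrated in Figures 4 and 5 splits the raw permeability index into a spike-free component and a spike residual. As a rough illustration of the idea only — not the paper's wavelet-based separation procedure — the sketch below flags points that deviate from a rolling-median baseline by more than a MAD-derived threshold; the window width and the factor k are illustrative choices.

```python
import numpy as np

def separate_spikes(x, window=11, k=3.0):
    """Split a 1-D series into a spike-free part and a spike residual.

    A rolling median estimates the local baseline; points whose deviation
    exceeds k robust standard deviations (estimated via the median
    absolute deviation) are treated as spikes and moved into the
    residual, so that smooth + spikes reconstructs the input exactly.
    """
    x = np.asarray(x, dtype=float)
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    baseline = np.array(
        [np.median(padded[i:i + window]) for i in range(len(x))]
    )
    resid = x - baseline
    mad = np.median(np.abs(resid - np.median(resid)))
    sigma = 1.4826 * mad if mad > 0 else resid.std()
    spikes = np.where(np.abs(resid) > k * sigma, resid, 0.0)
    smooth = x - spikes
    return smooth, spikes
```

At flagged points the smooth series falls back to the local median, so isolated surges in the permeability index end up in the residual channel, mirroring panel (c) of Figure 5.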
Figure 6. Basic architecture of the proposed method.
Figure 7. Energy ratios of the detail coefficients at different decomposition levels for (a) the original permeability index signal and (b) the permeability index signal after spike separation.
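The energy ratios plotted in Figure 7 come from the detail coefficients of a discrete wavelet transform. The sketch below uses a hand-rolled Haar DWT as a simpler stand-in for the Daubechies-4 wavelet used in the paper; `haar_dwt` and `detail_energy_ratios` are illustrative helpers. Because the Haar analysis filters are orthonormal, the per-level detail energies plus the final approximation energy sum to the total signal energy.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                         # pad to even length
        x = np.append(x, x[-1])
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-pass (detail)
    return a, d

def detail_energy_ratios(x, levels):
    """Energy of each detail band relative to total signal energy."""
    x = np.asarray(x, dtype=float)
    total = float(np.sum(x ** 2))
    ratios = []
    a = x
    for _ in range(levels):
        a, d = haar_dwt(a)                 # recurse on the approximation
        ratios.append(float(np.sum(d ** 2)) / total)
    return ratios
```

A slowly varying signal concentrates its energy in the coarse levels, while spikes and noise push energy into the fine-scale details — the contrast between panels (a) and (b) of Figure 7.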
Figure 8. Evolution of prediction accuracy for multi-step-ahead forecasting. The RMSE values for the permeability index of different models are shown as a function of the minimum advance time step Δt_min.
Figure 9. Comparison of prediction details between the proposed method and the GRU model under different Δt_min. (a–c) show the predictions of our method for Δt_min = 0, 5, and 10, respectively. (d–f) show the corresponding predictions of the GRU model.
Figure 10. Prediction details of the proposed method under different Δt_min using data from August 2021. (a–c) show the predictions of our method for Δt_min = 0, 5, and 10, respectively.
Figure 11. Prediction performance of our method at zero time delay as a function of (a) the number of hidden-layer units m in the wavelet neural network (WNN) and (b) the wavelet coherence threshold R²_thre.
Table 1. Comparison of the proposed method with previous hybrid or multiscale models for permeability index prediction.
| Feature / Method | Dong et al. [17] | Su et al. [18] | Tan et al. [19] | Liu et al. [20] | Proposed Method |
| Multiscale decomposition | Wavelet | Wavelet | Wavelet | VMD | Wavelet |
| Time delay excavation | Not considered | Not considered | MIC and Spearman | Not considered | Wavelet coherence analysis |
| Spike separation | No | No | No | No | Yes |
| Prediction model | LS-SVM | ML-ELM | WNN | BPNN | OSA and WNN fusion |
| Multi-step-ahead focus | Limited | Limited | Limited | Limited | Explicitly designed |
Table 2. Detailed information of the collected observation data.
| Type | No. | Parameter | Maximum | Minimum | Average | Unit |
| Input | 1 | Cold blast flow rate (CBFR) | 35.264 | 32.517 | 33.540 | m³/s |
| Input | 2 | Oxygen enrichment rate (OER) | 4.2578 | 3.6576 | 4.0351 | % |
| Input | 3 | Blast kinetic energy (BKE) | 195 | 118 | 148 | kJ/s |
| Input | 4 | Bosh gas index (BGI) | 85.080 | 78.720 | 81.105 | m/min |
| Input | 5 | Total pressure drop (TPD) | 226.70 | 150.50 | 200.75 | kPa |
| Input | 6 | Hot blast pressure (HBP) | 450.70 | 373.50 | 423.60 | kPa |
| Input | 7 | Hot blast temperature (HBT) | 1141.7 | 939.3 | 1071.4 | °C |
| Input | 8 | Top temperature (TT) | 322.7 | 159.8 | 218.2 | °C |
| Output | 1 | Permeability index (PI) | 22.944 | 14.467 | 16.696 | m³/(atm·s) |
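Because the variables in Table 2 live on very different scales (m³/s, kPa, °C, %), they are typically normalized before modeling, as in the normalized profiles of Figure 1b. A minimal min-max scaling sketch, assuming (as an illustrative choice — the paper does not state its exact normalization) that the Table 2 extrema serve as fixed per-variable bounds:

```python
import numpy as np

# Per-variable (min, max) bounds as reported in Table 2 (subset shown).
BOUNDS = {
    "CBFR": (32.517, 35.264),   # cold blast flow rate, m^3/s
    "HBT":  (939.3, 1141.7),    # hot blast temperature, deg C
    "TPD":  (150.50, 226.70),   # total pressure drop, kPa
}

def minmax_scale(values, name):
    """Scale raw readings of one variable into [0, 1] using the bounds above."""
    lo, hi = BOUNDS[name]
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)
```

For example, a hot blast temperature reading halfway between the recorded extremes maps to 0.5, so curves of all variables become directly comparable on one axis.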
Table 3. Time delay between eight input variables and permeability index at different wavelet decomposition levels calculated by three methods.
| Method | Input Variable | Time Delay at Wavelet Decomposition Levels 1–9 |
| Ours | CBFR | 00003817189 |
| Ours | OER | 00028162253161 |
| Ours | BKE | 001012124168 |
| Ours | BGI | 000238161848 |
| Ours | TPD | 002361766114 |
| Ours | HBP | 0013516146862 |
| Ours | HBT | 0000313657151 |
| Ours | TT | 0023112757141 |
| PCC | CBFR | 638000−2−20−45 |
| PCC | OER | 020−6321−1−4126775−492 |
| PCC | BKE | 00000−10244247 |
| PCC | BGI | −906−2203216−2−20−514 |
| PCC | TPD | 00000000−1 |
| PCC | HBP | 1944160000−1−30 |
| PCC | HBT | 0−432−80046523412500 |
| PCC | TT | 6−41−160−35−1128635−515 |
| MIC | CBFR | 0000−20−1−11 |
| MIC | OER | 0−4961−1−2−4−413 |
| MIC | BKE | 00000001−38 |
| MIC | BGI | −16−56−6416−2−1213 |
| MIC | TPD | 000000000 |
| MIC | HBP | −123000000−21 |
| MIC | HBT | 0−72032000310 |
| MIC | TT | 640−96−32002596 |
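The "Ours" rows above are obtained from wavelet coherence analysis. As a simplified stand-in for that phase-based estimate — not the paper's method — the per-scale delay between a band-limited input (e.g., one wavelet detail level) and the permeability index can be approximated by the lag that maximizes their cross-correlation; `band_delay` below is an illustrative helper under that assumption.

```python
import numpy as np

def band_delay(x, y, max_lag=50):
    """Lag (in samples) at which x best anticipates y.

    A crude proxy for wavelet-coherence phase analysis: after both
    series have been band-limited to one scale, the argmax of the
    Pearson correlation over non-negative lags gives a per-scale
    delay estimate of y relative to x.
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    corrs = []
    for lag in range(max_lag + 1):
        xs, ys = x[:len(x) - lag], y[lag:]   # shift y back by `lag`
        corrs.append(np.corrcoef(xs, ys)[0, 1])
    return int(np.argmax(corrs))
```

Unlike raw Pearson or MIC screening at a single scale, applying such a delay estimate separately to each decomposition level captures the multiscale structure that the table compares.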
Table 4. Prediction results of permeability index from different methods.
| Method | Parameters | Δt_min = 0 (RMSE / R² / r) | Δt_min = 5 (RMSE / R² / r) | Δt_min = 10 (RMSE / R² / r) |
| PLS | Latent variables: 4 | 0.190 / 0.959 / 0.982 | 0.379 / 0.826 / 0.915 | 0.649 / 0.488 / 0.714 |
| KPLS | Latent variables: 4; polynomial kernel, α: 2, c: 2 | 0.181 / 0.960 / 0.981 | 0.403 / 0.803 / 0.897 | 0.694 / 0.416 / 0.686 |
| LS-SVM | RBF kernel, γ: 10 | 0.735 / 0.344 / 0.597 | 1.120 / −0.523 / 0.419 | 1.644 / −2.282 / 0.135 |
| OSA | Cumulative percent variance: 90% | 0.169 / 0.965 / 0.985 | 0.329 / 0.869 / 0.936 | 0.539 / 0.647 / 0.807 |
| KELM | RBF kernel, γ: 4.6; regularization coefficient: 100 | 0.200 / 0.951 / 0.977 | 0.292 / 0.896 / 0.952 | 0.490 / 0.708 / 0.846 |
| MLP | Hidden layers: 8, 4; learning rate: 0.001 | 0.163 / 0.967 / 0.980 | 0.348 / 0.855 / 0.925 | 0.574 / 0.676 / 0.822 |
| GRU | Hidden units: 32; max epochs: 250; learning rate: 0.001 | 0.136 / 0.975 / 0.987 | 0.278 / 0.904 / 0.954 | 0.377 / 0.828 / 0.915 |
| LSTM | Hidden units: 32; max epochs: 250; learning rate: 0.001 | 0.128 / 0.981 / 0.991 | 0.300 / 0.891 / 0.945 | 0.544 / 0.642 / 0.838 |
| w-LS-SVM [17] | Wavelet: Daubechies 4; decomposition levels: 7; RBF kernel, γ: 10 | 0.498 / 0.699 / 0.859 | 0.556 / 0.625 / 0.849 | 0.800 / 0.223 / 0.703 |
| w-PCA-ML-ELM [18] | Wavelet: Daubechies 4; decomposition levels: 3; hidden layers: 300, 200, 150; PCA output dimension: 70 | 0.197 / 0.953 / 0.978 | 0.332 / 0.863 / 0.929 | 0.501 / 0.689 / 0.835 |
| ALD-KOS-ELMF [30] | RBF kernel, γ: 4.6; regularization coefficient: 100; sliding window width: 250 | 0.189 / 0.957 / 0.982 | 0.486 / 0.713 / 0.864 | 0.705 / 0.396 / 0.711 |
| CM-WNN [19] | WNN wavelet: Morlet; hidden units: 14; learning rate: 0.001 | 0.249 / 0.925 / 0.965 | 0.511 / 0.682 / 0.844 | 0.809 / 0.203 / 0.523 |
| VMD-PSO-BP [20] | VMD decomposition modes: 4; bandwidth limit: 7000; noise tolerance: 0.3; hidden layers: 9, 19 | 0.145 / 0.971 / 0.983 | 0.389 / 0.774 / 0.883 | 0.637 / 0.441 / 0.662 |
| Ours w/o spike separation | Wavelet: Daubechies 4; decomposition levels: 9; R²_thre: 0.95; WNN wavelet: Morlet; hidden units: 11; learning rate: 0.001 | 0.249 / 0.925 / 0.963 | 0.369 / 0.834 / 0.914 | 0.520 / 0.672 / 0.823 |
| Ours w/o WNN | | 0.228 / 0.937 / 0.968 | 0.257 / 0.920 / 0.959 | 0.350 / 0.851 / 0.923 |
| Ours w/o OSA | | 0.468 / 0.716 / 0.887 | 0.533 / 0.647 / 0.881 | 0.608 / 0.537 / 0.846 |
| Ours w/o delay analysis | | 0.215 / 0.944 / 0.975 | 0.271 / 0.911 / 0.956 | 0.364 / 0.831 / 0.921 |
| Ours | | 0.183 / 0.959 / 0.980 | 0.233 / 0.934 / 0.968 | 0.292 / 0.897 / 0.948 |
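The three scores reported per column in Table 4 are the root-mean-square error, the coefficient of determination R², and the Pearson correlation coefficient r. A minimal sketch using the standard definitions of these metrics (the paper does not spell out its formulas, so these are the conventional ones):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (RMSE, R^2, Pearson r) for a prediction, as in Table 4."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2) # total sum of squares
    r2 = float(1.0 - ss_res / ss_tot)
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    return rmse, r2, r
```

Note that R² can go negative for predictions worse than the mean (as for LS-SVM at Δt_min = 10 in the table), whereas r only measures linear agreement and ignores bias.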
Table 5. Training time and inference time of different models.
| Model | Training Time (s) | Inference Time (ms) |
| PLS | 34.89 | 0.24 |
| KPLS | 10.81 | 9.62 |
| LS-SVM | 2.04 | 100.13 |
| OSA | 0.27 | 2.43 |
| KELM | 70.93 | 45.97 |
| MLP | 1.14 | 80.16 |
| GRU | 56.10 | 3877.19 |
| LSTM | 89.24 | 3825.18 |
| w-LS-SVM [17] | 16.80 | 831.20 |
| w-PCA-ML-ELM [18] | 0.51 | 19.26 |
| ALD-KOS-ELMF [30] | 1048.73 | 28.63 |
| CM-WNN [19] | 6.21 | 10.21 |
| VMD-PSO-BP [20] | 65.39 | 487.14 |
| Ours | 30.70 | 35.74 |
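Timings of the kind reported in Table 5 separate a one-off training cost (seconds) from an averaged per-call inference cost (milliseconds). A minimal sketch of how such averaged wall-clock timings can be taken with the monotonic high-resolution clock `time.perf_counter`; `time_call` is an illustrative helper, not the paper's benchmarking harness:

```python
import time

def time_call(fn, *args, repeats=100):
    """Average wall-clock duration of fn(*args) over `repeats` calls,
    returned in milliseconds. Averaging over repeats smooths out
    scheduler jitter for fast calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats * 1e3
```

For a fair comparison across models, each candidate would be timed on the same data with a single training call and many repeated inference calls.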

Xu, Y.; Yang, C.; Lou, S. Permeability Index Modeling with Multiscale Time Delay Characteristics Excavation in Blast Furnace Ironmaking Process. Electronics 2025, 14, 4670. https://doi.org/10.3390/electronics14234670
