Article

Informer-Based Prediction of Mold Level Anomalies in Continuous Casting via Temporal and Frequency-Domain Features

1
School of Automation, University of Science and Technology Beijing, Institute of Industrial Internet, Beijing 100083, China
2
Shunde Graduate School, University of Science and Technology Beijing, Foshan 528399, China
3
School of Computer and Communication Engineering, Institute of Industrial Internet, University of Science and Technology Beijing, Beijing 100083, China
4
School of Engineering, Royal Melbourne Institute of Technology (RMIT University), 124 La Trobe Street, Melbourne, VIC 3000, Australia
5
School of Computing, Macquarie University, Sydney, NSW 2109, Australia
6
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong
*
Authors to whom correspondence should be addressed.
Metals 2026, 16(5), 474; https://doi.org/10.3390/met16050474
Submission received: 5 March 2026 / Revised: 29 March 2026 / Accepted: 11 April 2026 / Published: 27 April 2026
(This article belongs to the Section Computation and Simulation on Metals)

Abstract

The stability of mold level fluctuations (MLFs) is crucial for product quality and process efficiency in continuous casting. Abnormal mold level fluctuations, which are typically associated with multiple factors including stopper rod opening, casting speed, and mold width, are known to lead to slab quality defects. In this paper, an Informer-based prediction framework is proposed for the early detection of abnormal MLF. A threshold-based labeling method is developed to quantify the future likelihood and severity of anomalies across different time horizons. Considering the importance of frequency-domain features in mold level prediction, power spectral density (PSD) features are incorporated and smoothed using the exponential moving average (EMA) to enhance predictive performance. Through the integration of temporal and processed spectral features, early indicators of abnormality can be captured, and proactive warnings can be issued. The proposed architecture is validated using approximately 32.5 million data points from a real-world continuous casting process. This approach provides a robust and data-driven solution for predicting and diagnosing abnormal MLF events in continuous casting. Experimental results show that the mean ROC-AUC and PR-AUC reach 0.821 and 0.418, respectively.

1. Introduction

Continuous casting is a critical process in modern steel production, where molten steel is continuously transformed into semi-finished products with defined cross-sectional shapes. Within this process, the mold serves as a key region for initial solidification, not only facilitating rapid cooling and shell formation but also affecting the flow behavior and stability of the molten steel surface. Continuous casting is inherently a highly coupled multi-physics system involving the interaction of heat transfer, fluid flow, and solidification phenomena [1,2].
Among the various monitored parameters in continuous casting, mold level fluctuation (MLF) is widely recognized as a challenging issue. These fluctuations may cause slag entrainment and surface defects, thereby posing risks to process stability and product quality [3]. While minor mold level variations within a narrow control range are generally acceptable in industrial practice, abnormal mold level fluctuations (AMLFs) exceeding the tolerance band (typically ±5 mm in industrial settings) are often associated with increased risks of quality defects and operational instability [3,4,5,6,7].
The underlying causes of MLF are complex and can be broadly classified into two categories: low-frequency random fluctuations and clustered high-frequency fluctuations [8]. The former is primarily associated with variations in tundish weight, casting speed, mold width, and the intermittent shedding of clogging material from the submerged entry nozzle (SEN), whereas the latter is typically linked to clogging of the SEN side holes and is characterized by periodic abnormal peaks whose frequency tends to increase over the course of the casting process.
Such abnormal fluctuations have been reported to be related to issues such as slag entrainment, uneven initial shell growth, deterioration of lubrication conditions, surface defects, and, in extreme cases, breakout [3,4,5,7,9]. Prior studies have investigated AMLF mechanisms, highlighting their industrial relevance and suggesting the importance of active monitoring and control to support process stability and product quality [3,5,7,9,10]. In this work, the focus is placed on the prediction of AMLF rather than normal fluctuations, addressing a problem of practical relevance for continuous casting operations.
A comprehensive understanding of abnormal MLF is essential for guaranteeing the quality of cast slabs and maintaining production stability. A prevalent research method entails analyzing process data to evaluate the influence of diverse factors on these fluctuations. Key operational factors such as stopper rod opening, casting speed, and argon pressure are known to influence MLF. In addition, prior studies have explored the relationship between mold level and abnormal fluctuations, while others have applied machine learning techniques for mold level prediction [4,5,11,12].
However, there are still several limitations in the existing studies regarding the analysis of mold level fluctuations. Most of the approaches focus on predicting mold level values rather than identifying anomalies across different future time windows or evaluating their severity. Moreover, the existing models are predominantly based on time-domain information and have not sufficiently utilized the frequency-domain characteristics inherent in mold level signals, which limits their capacity to capture the underlying mechanisms of abnormal fluctuations.
To address these challenges, a comprehensive framework is proposed for predicting abnormal MLF in the continuous casting process. The framework integrates a novel anomaly labeling system with an Informer-based time-series prediction model enhanced by power spectral density (PSD) features. The proposed framework enables the identification of potential abnormal fluctuations in advance, providing additional information for monitoring and analysis in continuous casting processes. The main contributions are summarized as follows:
(1)
A novel abnormal MLF labeling system is proposed to quantify anomalies across diverse future time windows while considering both probability and severity.
(2)
An Informer-based time-series prediction framework with an enhanced data input mechanism is developed, where downsampled time-series data and PSD features are jointly utilized to capture both global temporal trends and frequency-domain characteristics, which is rarely considered in standard Informer-based approaches, leading to improved prediction performance.
(3)
Extensive experiments are conducted to validate the performance of the proposed framework in predicting abnormal MLF and providing interpretable insights into the root causes of these anomalies.
The remainder of this paper is organized as follows: Section 2 reviews related work on MLF prediction and causal analysis; Section 3 presents the proposed framework, including the anomaly labeling method and the Informer-based prediction model with PSD enhancement; Section 4 demonstrates the experimental setup, model evaluation, and results analysis.

2. Related Work

2.1. Causes of Abnormal MLF

Mold level fluctuation in continuous casting is a complex phenomenon influenced by the interaction of metallurgical properties, flow dynamics, process parameters, and equipment conditions, such as stopper rod position, tundish weight, mold width, immersion depth of the submerged entry nozzle (SEN), and argon gas flow rate. Extensive studies have been conducted to clarify the factors and mechanisms underlying mold level instability. Recent numerical simulations based on large eddy simulation combined with volume of fluid (LES–VOF) models have further revealed that mold level fluctuation is essentially a manifestation of transient free-surface instability driven by unsteady jet impingement and surface flow oscillation [13]. In addition, flow pattern transition and multiphase interactions have been identified as key physical origins of abnormal mold level behavior [6,14]. Moreover, steel grade has been shown to strongly influence mold level behavior. It has been indicated that Ti-bearing IF steel and other alloyed grades can exhibit distinct fluctuation patterns due to variations in flow characteristics and metallurgical properties [7].
The stopper rod position, which directly controls the inlet flow rate and jet stability, is shown to have a significant impact on mold level fluctuation. It has been shown that the feedback from the stopper rod exhibits an approximate delay of 0.1412 s relative to the mold level fluctuation, which leads to control-induced oscillations under certain operating conditions [15]. Moreover, it has been revealed that frequent or abrupt stopper rod adjustments enhance flow unsteadiness near the SEN, thereby intensifying both low-frequency global fluctuations and high-frequency local surface disturbances [15]. It has been confirmed that stopper rod movement significantly alters transient multiphase flow, demonstrating a direct coupling between flow unsteadiness and surface fluctuation [16]. It has been indicated that automatic control of the stopper rod is the parameter most closely related to mold level fluctuations [17].
The role of casting speed in mold level fluctuation is crucial, as it governs jet momentum and surface flow intensity. An increase in casting speed results in higher wave heights and greater instantaneous variation in mold level fluctuations. It has been demonstrated through LES–VOF simulations that higher casting speeds significantly strengthen surface velocity and amplify transient deformation of the steel–slag interface [13]. Additionally, it has been found that a linear correlation exists between the frequency of mold level fluctuations and the surface velocity magnitude [9], indicating a direct coupling between flow kinetic energy and free-surface oscillation. Variations in casting speed are also significantly correlated with low-frequency random mold level fluctuations in the crystallizer [15], especially when the flow field approaches a critical flow pattern transition regime [14,18].
Change in tundish weight is one of the primary causes of low-frequency random mold level fluctuations. Fluctuations in tundish weight are known to modify the ferrostatic pressure at the mold inlet, leading to long-period variations in inlet flow rate and jet penetration depth. It has been confirmed through industrial observations that fluctuations in the tundish weight contribute significantly to mold level instability and are considered a key factor affecting the observed low-frequency fluctuations [15].
An increase in mold width is known to intensify low-frequency random mold level fluctuations, which stands as one of the crucial causes of abnormal liquid level behavior [8]. It has been shown through physical and numerical simulations that a wider mold weakens surface recirculation strength and reduces surface flow velocity, resulting in poorer damping of surface waves and increased susceptibility to flow asymmetry [6]. Under these circumstances, the flow field is more likely to enter a transitional regime between stable flow patterns, further amplifying mold level fluctuation [18]. To optimize the flow field and minimize these effects, a coordinated adjustment of the argon flow rate and SEN immersion depth is required, particularly under wide-mold and high-speed casting conditions [9,18].
The critical role of argon gas flow in mold level fluctuation is widely recognized due to its influence on multiphase flow behavior. It has been shown through industrial investigations that the argon flow rate, typically in the range of approximately 3–7 L/min under practical casting conditions, is associated with significant variations in mold level fluctuations. At relatively low flow rates, local flow instability may be intensified by bubble-induced turbulence, resulting in exacerbated mold level fluctuations. In contrast, at higher flow rates within the operational range, a more stable mold level can be achieved, owing to the redistribution of jet momentum and enhanced damping effects [15]. However, a further increase in the argon flow rate results in larger bubble sizes. When the argon flow rate exceeds approximately 4 L/min, the molten steel flow pattern near the SEN is altered, which intensifies local mold level fluctuations and increases the risk of slag entrainment [10].
Signal-based analyses have been widely employed to characterize mold level fluctuation behavior. It has been revealed by time–frequency analysis of industrial mold level signals that instantaneous abnormal mold level fluctuations are associated with specific frequency bands, which suggests distinct physical origins for different fluctuation modes [15]. Wavelet transform and time-smoothed PSD analyses have also been used to extract dominant fluctuation frequencies and characterize oscillation modes, by which quantitative descriptions of mold level dynamics are provided, as well as their correspondence to flow pattern transition, surface wave resonance, and multiphase interactions [19]. The influence of slag properties and mold structural flexibility on fluctuation damping and spectral characteristics has also been highlighted in recent studies [20,21].
In addition to the factors mentioned above, mold level fluctuations are also induced by steel grade and SEN clogging. During the slab casting of peritectic steels, severe level oscillations are prone to occur due to shell shrinkage associated with the peritectic reaction. These oscillations interact with bulging-induced pressure fluctuations below the mold, which further amplifies mold level instability [22,23]. Asymmetric clogging of the SEN produces an uneven jet momentum distribution, which results in asymmetric mold level fluctuations and non-uniform slag distribution [24]. During the clogging process, changes in flow velocity and flow field structure significantly enhance steel–slag interface fluctuation in the narrow-wall region, which increases the risk of slag entrainment and surface defects [25].

2.2. Prediction Technologies for Continuous Casting Mold

Mold level fluctuation (MLF) is a critical factor affecting slab quality and casting stability in continuous casting. A variety of data-driven prediction methods have been developed in recent years, encompassing traditional machine learning, deep learning architectures, hybrid time–frequency approaches, and anomaly detection frameworks [4,5,11,12,26].
Most existing prediction models are developed under specific assumptions, such as relatively stable operating conditions, sufficient historical data, and implicit stationarity of time-series signals. These assumptions may not be valid in real industrial continuous casting, in which strong nonlinearity, multiphase interactions, and frequent process disturbances are involved. Consequently, the applicability of these models to bulk steel production is limited [4,27,28].
Early approaches focused on direct mold level prediction using feedforward neural network- and convolutional neural network (CNN)-based models. For example, GA-CNN has been applied to extract local temporal features from mold level signals, and improved prediction accuracy was demonstrated under controlled conditions [11,27]. Ensemble methods, such as random forest (RF) and GA-RF models, have also been explored to classify or predict periodic mold level fluctuations by using process variables including casting speed, roller diameter, roller spacing, and chemical composition [5,29]. Casting speed and other operational parameters were employed by Meng et al. [4,30] in RF and bidirectional LSTM models to predict abnormal mold level behavior, and it was shown that temporal dependencies can be better captured by recurrent architectures.
Recurrent neural networks, particularly LSTM-based models, have been widely adopted due to their capability to learn long-term dependencies in mold level time series. Improved performance has been demonstrated by bidirectional LSTM and hybrid CNN–LSTM frameworks under complex, non-stationary conditions [4,28]. Informer, a transformer-based model with ProbSparse self-attention, has recently been applied to model long-term dependencies and capture multi-scale temporal features, and superior prediction accuracy and interpretability have been demonstrated [17,31]. It is further confirmed by large-scale industrial datasets that periodic and quasi-periodic mold level oscillations can be tracked by these deep recurrent models [5,28,30].
Signal decomposition and frequency-domain approaches have been introduced to enhance feature representation. Techniques such as EMD–SVR, VMD–SVR, wavelet transform, and time-smoothed PSD have been used to denoise signals, extract dominant fluctuation frequencies, and link oscillatory behavior to physical mechanisms like flow pattern transitions, surface wave resonance, and multiphase interactions [8,19,30,32]. Ti-bearing IF steel data were analyzed by Wang et al. [8] using discrete wavelet transform, and key operating conditions responsible for abnormal fluctuations were identified. Time–frequency characteristics were further explored by Meng et al. [30] to reveal correlations between specific frequency bands and abnormal events, and these features were combined with deep learning models by Zhang et al. [17,19] for improved predictive performance.
Despite these advancements, prediction of future mold level values is the primary focus of most methods, which fail to achieve explicit quantification of abnormal fluctuations in terms of intensity, frequency, or severity [30,33]. Deviations from normal operation can be identified using detection approaches based on reconstruction or prediction errors [33,34,35]; however, only qualitative or binary outputs are typically provided rather than quantitative evaluation. Moreover, the variability, noise, and dynamic disturbances present in industrial-scale casting are often not fully captured by laboratory-scale, simulation-based, or historical dataset validation, by which the practical utility of these models is limited [27,36,37,38,39,40].
The integration of mechanistic knowledge, data-driven modeling, and control optimization has been attempted in recent work. Physics-based simulation has been combined with machine learning by Zong et al. [37] and Norrena et al. [38] to predict defects and abnormal events, while reinforcement learning has been applied for automatic casting control [40]. However, these approaches are often reliant on mold level signals or control rewards without quantitatively modeling the magnitude of abnormal fluctuations, by which interpretability and early-warning capability are restricted.
In summary, although substantial progress has been made in mold level prediction and anomaly detection, several limitations remain. First, most methods forecast mold level trends without explicitly quantifying the magnitude or risk of abnormal fluctuations. Second, integration of time-domain and frequency-domain features is often insufficient. Third, anomaly detection methods usually provide qualitative rather than quantitative results. To address these challenges, an integrated framework is developed in this study for predicting abnormal mold level fluctuations, which combines a novel anomaly labeling strategy with an Informer-based model enhanced by power spectral density features. The framework is compared against representative time-series learning models including InceptionTime [41], MultiRocket-Hydra [42], Time Series Transformer (TST) [43], and Informer [31].

3. Materials and Methods

The overall framework for abnormal MLF (AMLF) prediction is illustrated in Figure 1, which consists of four main components. Initially, the raw data are standardized to a uniform sampling frequency through linear interpolation, by which temporal alignment across all variables is ensured and consistent input for subsequent analysis is provided. Subsequently, feature construction is performed, including time-smoothed PSD calculation and input preparation. Specifically, the PSD of the stopper rod opening and liquid level signals is computed to extract stable frequency-domain features. Then, AMLF labels are generated and masking is applied. The anticipated probability and potential severity of abnormal MLF within a predefined future time window are quantified by the labeling scheme, and supervision labels for model training are formed. Abnormalities caused by tundish changes or liquid level fluctuations due to speed adjustments are removed by the masking mechanism, and, consequently, only machine-performance-related events during normal operating periods are predicted by the model. Finally, the processed time series and downsampled PSD features are taken as input by an Informer-based time series deep learning model to directly predict AMLF labels.

3.1. Data Processing

Continuous casting process data are composed of multiple process variables with heterogeneous and inconsistent sampling frequencies, including stopper rod opening, casting speed, tundish weight, mold width, argon gas pressure, and argon gas flow. In the raw dataset, the highest sampling frequency among all variables is observed to be 1 Hz, while the remaining variables are recorded at lower frequencies, and temporal misalignment across different data streams is, therefore, introduced.
To ensure consistency for subsequent time-series modeling, a unified time axis with a fixed interval of 1 s is constructed based on the highest available sampling frequency in the dataset. Specifically, a global time series is first defined, and all variables are aligned to this unified time axis. For missing observations at target time steps, linear interpolation between adjacent time points is applied to estimate the corresponding values, and temporal alignment across multi-source variables is thereby achieved. The interpolated continuous casting data are represented as $X \in \mathbb{R}^{T \times D}$:
$$X = \begin{pmatrix} X_{1,1} & X_{1,2} & \cdots & X_{1,D} \\ X_{2,1} & X_{2,2} & \cdots & X_{2,D} \\ \vdots & \vdots & \ddots & \vdots \\ X_{T,1} & X_{T,2} & \cdots & X_{T,D} \end{pmatrix}$$
where $T$ is defined as the temporal length of the data, and $D$ is defined as the number of feature dimensions.
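The alignment step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `resample_linear`, the toy values, and the choice to clamp targets that fall outside the observed range are all hypothetical.

```python
from bisect import bisect_left


def resample_linear(times, values, axis):
    """Linearly interpolate irregular samples (times, values) onto `axis`.

    `times` must be strictly increasing. Targets outside the observed range
    are clamped to the nearest observation (an assumption, not from the paper).
    """
    out = []
    for t in axis:
        i = bisect_left(times, t)
        if i == 0:
            out.append(values[0])          # before or at the first sample
        elif i == len(times):
            out.append(values[-1])         # after the last sample
        elif times[i] == t:
            out.append(values[i])          # exact hit, no interpolation needed
        else:
            t0, t1 = times[i - 1], times[i]
            w = (t - t0) / (t1 - t0)
            out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out


# Hypothetical variable recorded at 0 s, 2 s and 5 s, aligned to a 1 s axis.
axis = list(range(0, 6))                   # unified time axis: 0, 1, ..., 5 s
speed = resample_linear([0, 2, 5], [1.0, 1.2, 0.9], axis)
```

Applying the same function to every variable with a shared `axis` yields the columns of the matrix $X$, one row per second.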
The proposed method is designed as a general time-series framework, and its pipeline can be transferred across different steel plants. Heterogeneous industrial data are converted into a unified temporal representation, and the method can be applied to different production environments while reproducibility is maintained.

3.2. Abnormal MLF Labels

A set of abnormal MLF labels is defined to characterize both the current state and the potential future progression of abnormal MLF. These labels are constructed based on future mold level variations within a predefined time window. The resulting time series of abnormal events is defined as
$$AMLF_k(t) = \begin{cases} 1 & \text{if } \exists\, t' \in [t-30,\, t],\ CastingState(t) = 1 \ \wedge\ |Level(t) - Level(t')| > k \\ 0 & \text{else} \end{cases}$$
where $Level(t)$ is defined as the mold level at time $t$. An abnormal MLF is considered to occur when the difference between $Level(t)$ and any previous level $Level(t')$ within a 30 s window exceeds a threshold $k$. $CastingState(t)$ is used as a condition to ensure that only abnormal fluctuations during normal production are considered. This mask is not generated by the proposed algorithm but is directly derived from the equipment's operational status signal. Specifically, an abnormal MLF is counted only when $CastingState(t)$ is 1, indicating that the caster is in normal production mode. When $CastingState(t)$ is 0, corresponding to periods of downtime such as during ladle changes or abnormal casting speeds, the fluctuation is not considered an abnormal MLF.
Additionally, in the subsequent model performance evaluation, moments where $CastingState(t) = 0$ are excluded to avoid any impact on the results. As a result, only periods corresponding to normal production are included in the evaluation. However, during the training phase, data corresponding to moments when $CastingState(t) = 0$ are retained to preserve the completeness of the input data and to enable the model to learn from various operating conditions.
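The threshold rule for the event series can be sketched as follows. The function name `amlf_k`, the toy signal, and the shortened look-back in the usage line are illustrative assumptions; the sketch also assumes 1 Hz data, that the casting state is checked at the current instant $t$, and that the window is truncated at the start of the series.

```python
def amlf_k(level, casting_state, k, lookback=30):
    """Return the binary series AMLF_k(t) for a 1 Hz mold level signal.

    level[t] is the mold level (mm) and casting_state[t] is 1 during normal
    production, 0 otherwise. Index 0 is the first sample.
    """
    labels = []
    for t in range(len(level)):
        flag = 0
        if casting_state[t] == 1:
            start = max(0, t - lookback)   # truncate the window at the series start
            for tp in range(start, t + 1):
                if abs(level[t] - level[tp]) > k:
                    flag = 1
                    break
        labels.append(flag)
    return labels


# Toy example with a 3 s look-back so the effect is visible on a short series.
level = [100, 100, 101, 107, 106, 100, 100]
state = [1, 1, 1, 1, 0, 1, 1]
events = amlf_k(level, state, k=5, lookback=3)
print(events)                              # [0, 0, 0, 1, 0, 1, 1]
```

The sample at index 4 stays 0 because the casting state masks it, even though the level is still far from its earlier values.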
In the study of abnormal MLF, the process characteristics preceding the occurrence of an abnormal event must be considered. To this end, the original time series of abnormal MLF events is extended backward, and a new labeling sequence is constructed. In this sequence, the exact time points at which abnormal MLF events occur are marked, and the time period preceding each event is also labeled as positive, indicating a forthcoming abnormal event. Specifically, if an abnormal MLF event, as defined above, is observed within the time interval $[t, t + \Delta t]$, where $\Delta t$ is defined as a user-defined time window, then the time instant $t$ is labeled as a positive sample, defined as follows:
$$AMLF_{k,\Delta t}(t) = \begin{cases} 1 & \text{if } \exists\, t' \in [t,\, t + \Delta t],\ AMLF_k(t') = 1 \\ 0 & \text{else} \end{cases}$$
where $t$ is defined as the current time instant, $\Delta t$ is defined as the user-defined prediction window, and $AMLF_{k,\Delta t}(t)$ is defined as a binary label indicating whether an abnormal event is forthcoming.
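A possible way to compute the time-extended label is a single backward scan that remembers the nearest upcoming event. This is a sketch under assumptions: the name `extend_forward` and the toy series are hypothetical, and time is measured in samples at 1 Hz.

```python
def extend_forward(amlf, delta_t):
    """AMLF_{k,dt}(t) = 1 if any AMLF_k(t') = 1 for t' in [t, t + dt]."""
    n = len(amlf)
    out = [0] * n
    next_event = None                      # index of the nearest event at or after t
    for t in range(n - 1, -1, -1):         # scan backward so each step is O(1)
        if amlf[t] == 1:
            next_event = t
        if next_event is not None and next_event - t <= delta_t:
            out[t] = 1
    return out


raw_events = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
extended = extend_forward(raw_events, delta_t=2)
print(extended)                            # [0, 0, 1, 1, 1, 0, 1, 1, 1, 0]
```

Each event marks itself and the two preceding instants as positive, which is the early-warning region the model is trained to recognize.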
A higher threshold value $k$ corresponds to a more intense mold level fluctuation anomaly. Similarly, a larger threshold $\Delta t$ corresponds to a broader extended time range. A set of different $k$ and $\Delta t$ values is selected to capture abnormal MLF with varying intensities and time periods. A set of time-extended MLF anomaly labels is obtained through the variation in threshold values.
$$Labels_{AMLF}(t) = \{\, AMLF_{k,\Delta t}(t) \mid k \in Set_k,\ \Delta t \in Set_{\Delta t} \,\}$$
where $Set_k$ and $Set_{\Delta t}$ are defined as user-selected sets of threshold values, which are used for the quantification of fluctuation intensity and the time-extension range, respectively, and $Labels_{AMLF}(t)$ is defined as the set of time-extended MLF anomaly labels corresponding to time instant $t$.
A comprehensive index is defined to characterize the overall trend of abnormal MLF. The abnormal MLF index $Index_{AMLF}(t)$ is obtained by averaging all elements in $Labels_{AMLF}(t)$, which is used for the comprehensive representation of both the intensity and the delay of future MLF anomalies at time $t$.
$$Index_{AMLF}(t) = \frac{1}{|Set_k|\,|Set_{\Delta t}|} \sum_{k \in Set_k} \sum_{\Delta t \in Set_{\Delta t}} AMLF_{k,\Delta t}(t)$$
where $|Set_k|$ and $|Set_{\Delta t}|$ are defined as the number of elements in $Set_k$ and $Set_{\Delta t}$, respectively.
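As a worked illustration of the averaging, assume the hypothetical sets $Set_k = \{3, 5\}$ (mm) and $Set_{\Delta t} = \{30, 60\}$ (s); these values are chosen for the example only and are not taken from the study. Suppose that at some instant $t$ the labels are $AMLF_{3,30}(t) = 1$, $AMLF_{3,60}(t) = 1$, $AMLF_{5,30}(t) = 0$, and $AMLF_{5,60}(t) = 1$. Then

$$Index_{AMLF}(t) = \frac{1}{2 \cdot 2}\,(1 + 1 + 0 + 1) = 0.75$$

The index therefore lies in $[0, 1]$ and increases when stronger fluctuations (larger $k$) are expected sooner (smaller $\Delta t$).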
As shown in Algorithm 1, the abnormal MLF index is calculated through scanning historical mold level data and the aggregation of binary anomaly indicators under multiple thresholds and time extensions.
Algorithm 1 Abnormal MLF Index Calculation
Input: mold level data $L \in \mathbb{R}^T$, threshold set $Set_k$, time window set $Set_{\Delta t}$
Output: Abnormal MLF index $Index_{AMLF}(t)$ for each time point $t$
for $t$ from 1 to $T$ do
    for $k$ in $Set_k$ do
        Initialize $AMLF_k(t)$ as 0
        for $t'$ from $t - 30$ to $t$ do
            if $|L(t) - L(t')| > k$ then
                $AMLF_k(t) \leftarrow 1$
                break the inner loop
for $t$ from 1 to $T$ do
    for $k$ in $Set_k$ do
        for $\Delta t$ in $Set_{\Delta t}$ do
            Initialize $AMLF_{k,\Delta t}(t)$ as 0
            for $t'$ from $t$ to $t + \Delta t$ do
                if $AMLF_k(t') = 1$ then
                    $AMLF_{k,\Delta t}(t) \leftarrow 1$
                    break the inner loop
    $Labels_{AMLF}(t) = \{\, AMLF_{k,\Delta t}(t) \mid k \in Set_k,\ \Delta t \in Set_{\Delta t} \,\}$
    $Index_{AMLF}(t) = \frac{1}{|Set_k|\,|Set_{\Delta t}|} \sum_{k \in Set_k} \sum_{\Delta t \in Set_{\Delta t}} AMLF_{k,\Delta t}(t)$
return $Labels_{AMLF}(t)$ and $Index_{AMLF}(t)$ for all $t$

3.3. Informer-Based Prediction Framework with PSD

To enable early warning of mold level abnormalities, a deep learning model is proposed for the prediction of abnormal MLF labels using multivariate continuous casting data. In this section, the data preprocessing steps, the model architecture, and the integration of PSD features into the model are presented, highlighting the enhancement of the model’s ability to capture fine-grained fluctuations.

3.3.1. Mean Downsampling over Sliding Windows

The raw continuous casting data are sampled at a frequency of 1 Hz. However, using the full sequence as input to a time-series model would lead to excessively long sequences which are challenging for the model to handle. To reduce data volume while retaining trend information, mean downsampling is applied. The raw time series is denoted as
$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_N], \quad Y \in \mathbb{R}^{N \times D}, \quad y_i \in \mathbb{R}^{D}$$
where $D$ is defined as the dimensionality of the multivariate observation at each time step, and $N$ is defined as the total length of the sequence $Y$.
Given a sampling interval of $step$, the original sequence $Y$ is downsampled using a mean pooling operation. The resulting sequence $S(Y)$ is of length $W$, which is determined by the sampling interval $step$. The downsampling process is defined as
$$S(Y) = [\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_W], \quad \bar{y}_i = \frac{1}{step} \sum_{j=(i-1)\,step+1}^{i\,step} y_j, \quad i = 1, 2, \ldots, W, \quad W = \frac{N}{step}$$
where $y_j$ is defined as the original multivariate observation at time step $j$, $step$ is defined as the downsampling step, $S(Y)$ is defined as the resulting downsampled sequence, and $W$ is defined as the length of the downsampled sequence.
At each time step, a sliding window covering the interval from $(t - step \times W)$ to $t$ is extracted and downsampled, resulting in the formation of the model input.
$$Window(t) = X_{t - step \times W \,:\, t}, \qquad Input_{temporal}(t) = S(Window(t))$$
where $Window(t)$ is defined as the sliding window extracted from the original sequence from $(t - step \times W)$ to $t$, and $Input_{temporal}(t)$ is defined as the temporal portion of the model input.
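The mean pooling and window extraction described above can be sketched as below. The function names and the toy data are hypothetical, and the sketch assumes that the window holds exactly $step \times W$ samples ending at $t$ (that is, the left end of the interval is exclusive), which the text does not state explicitly.

```python
def mean_downsample(window, step):
    """Mean-pool a list of D-dimensional rows in non-overlapping blocks of `step`."""
    n_blocks = len(window) // step         # W = N / step, remainder rows dropped
    pooled = []
    for i in range(n_blocks):
        block = window[i * step:(i + 1) * step]
        dims = len(block[0])
        pooled.append([sum(row[d] for row in block) / step for d in range(dims)])
    return pooled


def temporal_input(X, t, step, W):
    """Build Input_temporal(t) from the step*W most recent rows ending at t."""
    start = t - step * W + 1               # assumption: window is (t - step*W, t]
    if start < 0:
        raise ValueError("not enough history before t")
    return mean_downsample(X[start:t + 1], step)


# Toy series with D = 2 features: row i is [i, 10 * i].
X = [[float(i), 10.0 * i] for i in range(12)]
inp = temporal_input(X, t=11, step=3, W=2)
print(inp)                                 # [[7.0, 70.0], [10.0, 100.0]]
```

With `step=3` and `W=2`, six seconds of history are compressed into two rows, so the sequence length seen by the model is independent of how far back the window reaches.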

3.3.2. Incorporating PSD Features

While macro trends are preserved during downsampling, high-frequency micro fluctuations, which are critical for early anomaly detection, are suppressed. Among the multivariate parameters used in this study, pronounced fluctuations are observed in the mold level and the stopper rod opening. Therefore, PSD-based frequency-domain features are extracted from both the mold level and the stopper rod opening to complement the time-domain representation.
Given a one-dimensional signal $x \in \mathbb{R}^{T}$, the PSD is computed using the periodogram method, in which the distribution of signal power over frequency is estimated. For a signal segment of length $W_{psd}$, the periodogram is defined as the squared magnitude of the discrete Fourier transform (DFT):
$$P(t, \omega_k) = \frac{1}{W_{psd}} \left| \sum_{n=1}^{W_{psd}} x_{t - W_{psd} + n} \, e^{-j 2 \pi \omega_k n} \right|^2, \quad \omega_k = \frac{k}{W_{psd}}, \quad k = 0, 1, \ldots, \frac{W_{psd}}{2}$$
where $\omega_k$ is defined as the discrete frequency components (normalized frequency), and $P(t, \omega_k)$ is defined as the estimated power at frequency $\omega_k$, which is obtained from the sliding window over the interval $[t - W_{psd} + 1, t]$.
The PSD vector at time $t$ is obtained by applying the periodogram to a sliding window ending at $t$:
$$psd(t) = [P(t, \omega_0), P(t, \omega_1), \ldots, P(t, \omega_{W_{psd}/2})]$$
where $psd(t)$ denotes the vector formed at time step $t$ by evaluating $P(t, \omega_k)$ for all values of $\omega_k$.
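A direct, unoptimized evaluation of the periodogram over a sliding window might look like the following sketch. The function names and the test signal are illustrative assumptions; a production pipeline would more likely use an FFT routine, which is not shown here.

```python
import cmath
import math


def periodogram(segment):
    """One-sided periodogram of a real segment of length W_psd.

    Returns W_psd // 2 + 1 values, P[k] at normalized frequency k / W_psd.
    """
    w = len(segment)
    power = []
    for k in range(w // 2 + 1):
        acc = 0j
        for n, x in enumerate(segment, start=1):
            acc += x * cmath.exp(-2j * math.pi * k * n / w)
        power.append(abs(acc) ** 2 / w)
    return power


def psd_at(signal, t, w_psd):
    """PSD vector of the window [t - w_psd + 1, t] (0-based, inclusive)."""
    start = t - w_psd + 1
    if start < 0:
        raise ValueError("not enough history before t")
    return periodogram(signal[start:t + 1])


# A pure oscillation with a period of 4 samples concentrates power at k = W_psd / 4.
sig = [math.sin(2 * math.pi * i / 4) for i in range(16)]
p = psd_at(sig, t=15, w_psd=8)
peak = max(range(len(p)), key=p.__getitem__)
print(peak)                                # 2
```

The peak at bin 2 of an 8-sample window corresponds to a normalized frequency of 0.25 cycles per sample, matching the 4-sample period of the test signal.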
Because of the finite window effect, the PSD values computed at adjacent time points can differ significantly, which induces high-frequency fluctuations in the PSD curve. This instability introduces noise into the model input and is detrimental to anomaly prediction. To reduce these fluctuations and improve the reliability of the PSD features, an exponential moving average (EMA) is applied to smooth the frequency-domain signal so that it reflects the underlying trend more accurately.
$$\overline{psd}(t) = k_{ema} \cdot \overline{psd}(t - 1) + (1 - k_{ema}) \cdot psd(t)$$
where $k_{ema}$ is the smoothing factor, which determines the degree of smoothing and can be adjusted according to the desired smoothing effect, and $\overline{psd}$ denotes the PSD values after EMA smoothing.
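The EMA recursion can be sketched as below, applied element-wise to a sequence of PSD vectors. Initialising the recursion with the first PSD vector is an assumption, since the initial condition is not specified in the text; `ema_smooth` is a hypothetical name.

```python
def ema_smooth(psd_sequence, k_ema):
    """Apply the EMA recursion element-wise to a sequence of PSD vectors.

    Assumption: the recursion is initialised with the first PSD vector.
    """
    if not 0.0 <= k_ema < 1.0:
        raise ValueError("k_ema must lie in [0, 1)")
    smoothed = [list(psd_sequence[0])]
    for current in psd_sequence[1:]:
        previous = smoothed[-1]
        smoothed.append([k_ema * p + (1.0 - k_ema) * c
                         for p, c in zip(previous, current)])
    return smoothed

raw = [[1.0, 4.0], [3.0, 0.0], [1.0, 4.0]]
print(ema_smooth(raw, k_ema=0.5))         # [[1.0, 4.0], [2.0, 2.0], [1.5, 3.0]]
print(ema_smooth(raw, k_ema=0.0) == raw)  # True: k_ema = 0 leaves the input unchanged
```

A value of the smoothing factor close to 1 gives heavy smoothing, while a value of 0 reproduces the unsmoothed PSD.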
To align with the temporal input, PSD features are sampled at the same time points as the downsampled window:
$$\mathit{timepoints} = \{\, t - W \times \mathit{step} + i \cdot \mathit{step} \mid i \in [1, W] \,\}, \qquad \mathit{Input}_{psd}(t) = \{\, \overline{psd}(t') \mid t' \in \mathit{timepoints} \,\}$$
where $\mathit{timepoints}$ is the set of sampling time points, and $W$ and $\mathit{step}$ are as defined in Section 3.3.1.
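The alignment rule amounts to picking every step-th smoothed PSD vector, ending at the current time. A short sketch, assuming the smoothed PSD is stored as a list indexed by time step (a hypothetical layout):

```python
def psd_timepoints(t, w, step):
    """Sampling instants t - w*step + i*step for i = 1..w (the last one is t)."""
    return [t - w * step + i * step for i in range(1, w + 1)]

def psd_input(smoothed_psd, t, w, step):
    """Pick the smoothed PSD vectors at the sampling instants.

    smoothed_psd is assumed to be a list of vectors indexed by time step.
    """
    return [smoothed_psd[tp] for tp in psd_timepoints(t, w, step)]

print(psd_timepoints(t=3600, w=4, step=30))  # [3510, 3540, 3570, 3600]
```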

3.3.3. Model Inputs, Outputs, and Evaluation Metrics

The model takes two types of input: the mean-downsampled sliding-window data and the smoothed PSD features. The first input, denoted as $\mathit{Input}_{temporal}(t) \in \mathbb{R}^{W \times D}$, consists of the downsampled process data, where $W$ is the window size and $D$ is the number of features at each time step. The second input, $\mathit{Input}_{psd}(t) \in \mathbb{R}^{W \times \frac{W_{psd}}{2}}$, contains the smoothed PSD features, where $W_{psd}$ is defined as the number of frequency bins in the PSD.
These two inputs are concatenated along the feature dimension to form a combined input $\mathit{Input}(t) \in \mathbb{R}^{W \times \left(D + \frac{W_{psd}}{2}\right)}$, which is then fed into the model for prediction.
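Concatenation along the feature dimension simply joins the two tables row by row. The sizes below are toy values chosen for illustration, not the configuration used in the paper, and `concat_features` is a hypothetical helper.

```python
def concat_features(temporal_rows, psd_rows):
    """Join two W-row feature tables column-wise (feature dimension)."""
    if len(temporal_rows) != len(psd_rows):
        raise ValueError("both inputs must have the same number of rows W")
    return [list(a) + list(b) for a, b in zip(temporal_rows, psd_rows)]

w, d, n_bins = 4, 6, 5  # toy sizes, not the paper's configuration
temporal = [[0.0] * d for _ in range(w)]
psd = [[1.0] * n_bins for _ in range(w)]
combined = concat_features(temporal, psd)
print(len(combined), len(combined[0]))  # rows W, columns D + n_bins
```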
The predictive target of the model is the set of time-extended MLF anomaly labels $\mathit{Labels}_{AMLF}(t)$ defined in Section 3.2.
$$\mathit{Target}(t) = \mathit{Labels}_{AMLF}(t)$$
Binary cross-entropy (BCE) is used as the loss function:
$$\begin{aligned} \mathit{Output}(t) &= \mathit{Model}(\mathit{Input}(t)) \\ \mathit{loss} &= \mathrm{BCE}(\mathit{Target}(t), \mathit{Output}(t)) \end{aligned}$$
$$\mathit{Target}(t) \in \mathbb{R}^{|Set_k|\,|Set_{\Delta t}|}, \qquad \mathit{Output}(t) \in \mathbb{R}^{|Set_k|\,|Set_{\Delta t}|}$$
where $|Set_k|$ and $|Set_{\Delta t}|$ are the numbers of elements in $Set_k$ and $Set_{\Delta t}$, respectively.
During training, the combined features $\mathit{Input}(t)$ are fed to the model, the BCE loss between the predictions and the labels is computed, and the model parameters are updated iteratively to reduce this loss. The result is a model that can predict abnormal MLF events in future production cycles.
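For reference, the BCE loss over a flattened label vector can be written in a few lines. This sketch assumes that the model outputs are probabilities in (0, 1), for example after a sigmoid; the clamping constant `eps` is an illustrative choice.

```python
import math

def bce_loss(targets, outputs, eps=1e-7):
    """Mean binary cross-entropy over flattened label/probability lists."""
    total = 0.0
    for y, p in zip(targets, outputs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(targets)

# Four label positions with fairly confident, mostly correct predictions.
loss = bce_loss([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
print(round(loss, 4))
```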
Since the expected outputs of the model are binary classification labels (0 or 1), the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision–recall curve (PR-AUC) are adopted as the primary evaluation metrics in this paper. ROC-AUC measures the overall classification ability of the model, while PR-AUC is more sensitive to class imbalance and better reflects predictive performance on the positive class, which makes it more practical for imbalanced classification tasks [44]. Since each sample involves multiple output dimensions, the final mean ROC-AUC and mean PR-AUC are obtained by averaging the ROC-AUC and PR-AUC scores across all dimensions.
$$\text{mean ROC-AUC} = \frac{1}{|Set_k|\,|Set_{\Delta t}|} \sum_{k, \Delta t} \mathrm{AUC}_{ROC}\left( \mathit{Target}_{1:T, k, \Delta t},\ \mathit{Output}_{1:T, k, \Delta t} \right)$$
$$\text{mean PR-AUC} = \frac{1}{|Set_k|\,|Set_{\Delta t}|} \sum_{k, \Delta t} \mathrm{AUC}_{PR}\left( \mathit{Target}_{1:T, k, \Delta t},\ \mathit{Output}_{1:T, k, \Delta t} \right)$$
$$\mathit{Target} \in \mathbb{R}^{T\,|Set_k|\,|Set_{\Delta t}|}, \qquad \mathit{Output} \in \mathbb{R}^{T\,|Set_k|\,|Set_{\Delta t}|}$$
where $T$ is the sequence length, $Set_k$ and $Set_{\Delta t}$ are the sets of thresholds and time windows, $\mathit{Target}_{1:T, k, \Delta t}$ and $\mathit{Output}_{1:T, k, \Delta t}$ are the corresponding ground-truth and predicted sequences, and the subscript $1:T$ indicates the time steps from 1 to $T$. Both mean ROC-AUC and mean PR-AUC range from 0 to 1, and higher values indicate better model performance. In the subsequent experiments, different models and parameter configurations are compared using these two metrics. A configuration is considered better when it achieves higher values of both mean ROC-AUC and mean PR-AUC than the other configurations, which indicates superior overall discriminative ability and improved positive-class detection. In addition, for a more comprehensive evaluation, precision, recall, accuracy, and F1-score are also used in the following sections, with results reported for each dimension individually.
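The averaging over output dimensions can be sketched as follows. Two assumptions are made: ROC-AUC is computed from pairwise ranking (ties count one half), and PR-AUC is summarised by average precision, which is one common estimator of the area under the precision–recall curve. All function names are hypothetical, and each dimension is assumed to contain both classes.

```python
def roc_auc(y_true, y_score):
    """Probability that a positive sample outranks a negative one."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pr_auc(y_true, y_score):
    """Average precision: precision accumulated at the rank of each positive."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def mean_metric(metric, targets, outputs):
    """Average a per-dimension metric over all (k, delta_t) output columns."""
    n_dims = len(targets[0])
    scores = [metric([row[d] for row in targets], [row[d] for row in outputs])
              for d in range(n_dims)]
    return sum(scores) / n_dims

# Toy example: T = 4 time steps, 2 output dimensions.
targets = [[1, 0], [0, 0], [1, 1], [0, 1]]
outputs = [[0.9, 0.2], [0.3, 0.4], [0.6, 0.7], [0.5, 0.1]]
print(mean_metric(roc_auc, targets, outputs))  # 0.75
print(mean_metric(pr_auc, targets, outputs))   # 0.875
```

The pairwise ROC-AUC is quadratic in the number of samples, so a rank-based implementation would be preferable for long sequences.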

4. Experiment

The dataset used in this experiment was collected from a continuous twin-stream caster over a period of 188 days. The sampling frequency is standardized to one data point per second, yielding approximately 32.5 million data points. The dataset is divided into training, validation, and test sets at an 8:1:1 ratio for model development and evaluation. Given the large volume of data, a subset corresponding to approximately one five-hundredth of the total dataset is used in each epoch. This subset is constructed using a sliding window scheme with fixed window parameters, in which the window is advanced along the time axis with a stride of 500 plus an offset that is randomized for each epoch. This design ensures that different segments of the time series are covered in different epochs while temporal continuity within each sample is preserved. Additionally, to address the class imbalance issue, the outputs are resampled using inverse frequency weighting, where the weights are the square roots of the inverse frequencies of the different output classes. The loss function converges within the first 10 training epochs, at which point the early stopping criterion is satisfied; the number of training epochs is therefore set to 10.
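The epoch-wise subsampling and the class weighting can be sketched as below. The details are assumptions made for illustration: a single random offset in [0, stride) is drawn per epoch, and each class is weighted by the square root of its inverse relative frequency. The function names and the toy sizes are hypothetical.

```python
import math
import random
from collections import Counter

def epoch_window_ends(n_points, window_len, stride=500, rng=random):
    """End indices of the windows used in one epoch.

    Assumption: one random offset in [0, stride) is drawn per epoch and
    added to a regular grid with the given stride.
    """
    offset = rng.randrange(stride)
    return list(range(window_len + offset, n_points, stride))

def sqrt_inverse_frequency_weights(labels):
    """Weight each class by sqrt(1 / relative frequency)."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: math.sqrt(n / c) for cls, c in counts.items()}

rng = random.Random(0)
ends = epoch_window_ends(n_points=100_000, window_len=3600, rng=rng)
print(len(ends), ends[1] - ends[0])  # about n_points / 500 windows, spaced 500 apart

weights = sqrt_inverse_frequency_weights([0] * 90 + [1] * 10)
print(weights)  # the rare class receives the larger weight
```

Taking the square root tempers the correction: with a 9:1 class ratio the rare class is weighted 3 times, rather than 9 times, as heavily as the common class.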

4.1. Abnormal MLF Index

The procedure for constructing the abnormal fluctuation index is illustrated in Figure 2. Starting from the original liquid level signal shown in Figure 2a, abnormal fluctuations are extracted according to Equation (2), as presented in Figure 2b. The extracted fluctuations are then extended over time using Equation (3), which yields the temporal representation shown in Figure 2c. Based on this representation, $\mathit{Index}_{AMLF}$ is calculated using Equations (4) and (5), as shown in Figure 2d.
As indicated by Figure 2d, $\mathit{Index}_{AMLF}$ quantitatively characterizes the temporal behavior of abnormal level fluctuations. Different time periods exhibit distinct index responses, reflecting variations in both fluctuation intensity and occurrence frequency. Higher index values correspond to periods with strong or frequent abnormal fluctuations, whereas lower values result from weaker or sporadic fluctuations.

4.2. Model and Parameter Optimization Experiment

In the previous section, the abnormal MLF index was calculated using a limited number of thresholds. This simplification keeps the illustration in Figure 2 clear, as an excessive number of thresholds would result in a cluttered and less interpretable visualization. In the following experiments, the thresholds $k$ and time windows $\Delta t$ are configured with finer granularity: $Set_k = \{6, 7, 8, 9, 10, 11, 12\}$ (mm) and $Set_{\Delta t} = \{60, 120, 180, 300, 600, 900\}$ (seconds). With the increased number of thresholds, more subtle and diverse level fluctuations can be captured, providing a more comprehensive representation of the MLF.
The data features used in this experiment include the stopper rod opening, casting speed, casting width, tundish weight, argon flow rate, argon pressure, and the smoothed PSD of the stopper rod opening. These features are collectively referred to as $F = \{\mathit{stopper}, \mathit{speed}, \mathit{width}, \mathit{weight}, \mathit{flow}, \mathit{pressure}, \mathit{PSD}_{stopper}\}$. The liquid level value is not used as a feature in this section so that the model can specifically learn the impact of factors other than liquid level fluctuations. Additionally, the dataset does not include temperature-related measurements; therefore, temperature is not considered in this study.
As described in Section 3.3.3, ROC-AUC and PR-AUC are adopted as the primary evaluation metrics. Several models are evaluated, including Informer, HydraMultiRocket, InceptionTime, and TST. To identify the best model and parameter configuration, various sliding window sizes $W \in \{60, 90, 120, 180\}$ and mean downsampling intervals $\mathit{step} \in \{10, 20, 30\}$ are tested.
Figure 3 compares the Inception, HydraMultiRocket, TST, and Informer models. Figure 3a shows the mean ROC-AUC across the parameter configurations; the horizontal axis lists the configurations of $W$ and $\mathit{step}$, where the part before the underscore is $W$ and the part after the underscore is $\mathit{step}$. Figure 3b shows the mean PR-AUC for the same parameter configurations. The highest performance is achieved by the Informer model, with a mean ROC-AUC of 0.783 and a mean PR-AUC of 0.365, when the sliding window size $W$ is 120 and the downsampling interval $\mathit{step}$ is 30. Across all models, the best performance is observed for configurations with $W \in \{120, 180\}$ and $\mathit{step} \in \{20, 30\}$. Given its consistently superior performance across multiple configurations, the Informer is selected for further experiments.
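The configuration grid behind Figure 3 can be enumerated with a few lines of Python. This is only an illustrative sketch with hypothetical variable names; the raw time span covered by each input follows from the window definition in Section 3.3.1 together with the sampling rate of one data point per second.

```python
from itertools import product

window_sizes = [60, 90, 120, 180]  # W
steps = [10, 20, 30]               # step

# Axis labels in the "W_step" format, and the raw span W * step in seconds.
configs = [f"{w}_{s}" for w, s in product(window_sizes, steps)]
raw_span_seconds = {f"{w}_{s}": w * s for w, s in product(window_sizes, steps)}

print(len(configs), configs[:3])   # 12 ['60_10', '60_20', '60_30']
print(raw_span_seconds["120_30"])  # 3600
```

Under this reading, the best-performing configuration (W = 120, step = 30) summarises one hour of raw data per input sample.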
Table 1 lists the best evaluation metrics obtained by each model across the parameter configurations and shows that the highest performance is achieved by the Informer-based model. Specifically, the Informer-based model obtains a mean ROC-AUC of 0.783 and a mean PR-AUC of 0.365, outperforming Inception (mean ROC-AUC of 0.755 and mean PR-AUC of 0.328), HydraMultiRocket (mean ROC-AUC of 0.712 and mean PR-AUC of 0.282), and TST (mean ROC-AUC of 0.762 and mean PR-AUC of 0.344).
Next, the performance of the model is tested under different PSD configurations by varying the smoothing factor $k_{ema}$ and the PSD sliding window length $W_{psd}$. The configuration with $k_{ema} = 0.995$ and $W_{psd} = 300$ produces the best results. As a baseline, $k_{ema}$ is set to 0, which corresponds to the scenario where no smoothing is applied. The results show that smoothing ($k_{ema} > 0$) improves the performance of the model compared with this baseline, which suggests that smoothing reduces noise in the PSD features and allows the model to better capture important patterns. These improvements are shown in Figure 4, where the model performance is compared across different values of $k_{ema}$ and $W_{psd}$; Figure 4a,b show the mean ROC-AUC and mean PR-AUC, respectively. Additionally, diminishing returns are observed for longer window lengths $W_{psd}$: performance improvements become limited while the computational cost increases. Therefore, $W_{psd} = 300$ is selected as a balance between performance and computational efficiency.
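To make the role of the smoothing factor concrete, the EMA recursion of Section 3.3.2 can be unrolled as shown below. This is the standard expansion of an exponential moving average, written under the assumption of a long history so that the initial value is negligible. The effective memory of roughly $1/(1 - k_{ema})$ updates is a common rule of thumb, not a figure reported in the experiments.

```latex
\begin{aligned}
\overline{psd}(t) &= (1 - k_{ema}) \sum_{i=0}^{\infty} k_{ema}^{\,i}\, psd(t - i), \\
k_{ema} = 0 &\;\Rightarrow\; \overline{psd}(t) = psd(t) \quad \text{(no smoothing)}, \\
k_{ema} = 0.995 &\;\Rightarrow\; \frac{1}{1 - k_{ema}} = 200 \ \text{updates of effective memory}.
\end{aligned}
```

Seen this way, a smoothing factor of 0.995 averages the PSD over a horizon of the same order as the PSD window length of 300 samples, which may help explain why heavy smoothing is beneficial here.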

4.3. Feature Ablation Experiment

Based on the previous experiments, a feature ablation study is conducted to assess the impact of different combinations of input features on the performance of the Informer model. All experiments use the optimal parameter configuration identified previously: a sliding window size $W = 120$, a downsampling interval $\mathit{step} = 30$, $k_{ema} = 0.995$, and a PSD window size of 300.
Figure 5 shows the mean ROC-AUC and mean PR-AUC of the model under different feature combinations. $F$ represents the feature combination used in the previous experiment and serves as the baseline. Starting from $F$, one feature is removed at a time to evaluate its impact on model performance. The minus sign on the x-axis represents the removal of a feature, while the plus sign represents the addition of a feature. Figure 5a,b show the mean ROC-AUC and mean PR-AUC of the different experiments, respectively.
The most significant drop in performance is observed when the PSD of the stopper is removed, which demonstrates its critical role in the model. Incorporating the PSD of the stopper greatly enhances the ability of the model to detect abnormal liquid level fluctuations, and it is therefore identified as a key feature for accurate predictions.
Removing the temporal features of pressure, speed, flow, width, and stopper causes only a slight decrease in model performance. These features therefore also contribute to the predictive ability of the model, although their impact is less pronounced than that of the PSD of the stopper.
Additionally, two further experiments are conducted. The first uses only the liquid level value and its PSD to predict abnormal MLF labels. This model performs better than the baseline, indicating that the combination of the liquid level value and its PSD provides valuable predictive power even without the additional features from $F$.
The second experiment uses the liquid level value, its PSD, and the full feature combination $F$ as inputs to the model. This configuration yields the best performance, with a mean ROC-AUC of 0.821 and a mean PR-AUC of 0.418. The combined feature set thus leads to better performance than the liquid level value and its PSD alone. The model with this feature combination can be used as an early-warning system for liquid level anomalies.

4.4. Performance Analysis of the Early-Warning Model

In the previous analysis, the mean ROC-AUC and mean PR-AUC were used as evaluation metrics. The analysis now focuses on the best-performing model, which achieves a mean ROC-AUC of 0.821 and a mean PR-AUC of 0.418. The output of this model is a 6 × 7 matrix with one entry for each combination of the six time windows and seven thresholds, where $Set_k = \{6, 7, 8, 9, 10, 11, 12\}$ (mm) and $Set_{\Delta t} = \{60, 120, 180, 300, 600, 900\}$ (seconds). Each position in the matrix represents an AMLF label with a value of either 0 or 1. The ROC and PR curves for each dimension are displayed in Figure 6, with 42 curves in each plot representing performance at the different thresholds. The ROC curves are tightly clustered and lie consistently above the diagonal that corresponds to random guessing (ROC-AUC of 0.5), while the PR curves show significant variation, primarily because the proportions of 0 and 1 labels change with the thresholds. This model is selected as the early-warning system.
As shown in Figure 7, the evaluation metrics for each matrix position include ROC-AUC, PR-AUC, the proportion of labels equal to 1, accuracy, precision, recall, and F1 score. As the amplitude threshold decreases and the time range increases, the proportion of label 1 increases, and the numbers of label 0 and label 1 become more balanced. Accordingly, ROC-AUC, PR-AUC, precision, recall, and F1 score are higher. For example, when $k$ is 6 and $\Delta t$ is 900, a ROC-AUC of 0.817, a PR-AUC of 0.937, a precision of 0.827, a recall of 0.961, and an F1 score of 0.889 are obtained. In contrast, at the top-right corner of the matrix (e.g., $k$ is 12 and $\Delta t$ is 60), where the amplitude is large and the time range is small, the proportion of label 1 is very low, and lower metric values are observed. Overall, these results indicate that the model effectively distinguishes between the two labels, with better performance in configurations where anomalies and normal cases are more balanced. Performance is naturally lower in configurations with fewer anomalies because of the imbalanced distribution of labels.
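The reported F1 score can be cross-checked from the reported precision and recall, since F1 is their harmonic mean. A minimal check (the function name is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported for k = 6 mm and delta_t = 900 s.
print(round(f1_score(0.827, 0.961), 3))  # 0.889
```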
Figure 8 presents a comparison between the target values and the output predictions. As shown in the plot, the two curves are closely aligned, which indicates that the model is able to identify anomalies and is therefore suitable for use as an early-warning system.

5. Conclusions

This study proposes an abnormal mold level fluctuation (MLF) prediction framework for the continuous casting process by integrating an Informer-based time-series model with power spectral density (PSD) features and a novel anomaly labeling strategy. The main conclusions are summarized as follows:
  1. The proposed Informer-based time-series framework exhibits strong predictive capability for abnormal MLF events in complex industrial environments. Even without using mold level measurements as input, the model achieves a mean ROC-AUC of 0.783 and a mean PR-AUC of 0.365 and outperforms the baseline methods. This indicates that the proposed feature design and modeling strategy effectively capture the underlying dynamics of the continuous casting process.
  2. The parameter analysis indicates that model performance is sensitive to the configuration of temporal and spectral inputs. The best predictive performance is obtained when the sliding window size is 120, the downsampling interval is 30, and the PSD window size is 300. This confirms the importance of jointly modeling multi-scale temporal and frequency-domain information.
  3. The feature contribution analysis reveals that the stopper rod opening and its corresponding PSD features are the most influential factors associated with abnormal MLF. In contrast, variables such as casting speed, mold width, and tundish weight exhibit relatively weaker correlations. This finding provides interpretable insights into the physical mechanisms driving mold level instability.
  4. Incorporating mold level measurements as an additional feature further improves prediction performance, increasing the mean ROC-AUC to 0.821 and the mean PR-AUC to 0.418; the best-performing dimension achieves a ROC-AUC of 0.817 and a PR-AUC of 0.937 when $k$ is 6 and $\Delta t$ is 900. This improvement enables more accurate predictions of future abnormal MLF events, making the model suitable for early-warning applications.
Future work will focus on incorporating additional metallurgical variables, such as steel composition and thermal parameters, which is expected to further improve model robustness and enable more precise intelligent control of continuous casting.

Author Contributions

Conceptualization: X.X., M.F., W.L., and H.W.; methodology: X.X., W.L., Q.W., and J.W.; software: X.X., M.F., and W.L.; validation: X.X., M.F., and H.W.; formal analysis: X.X., Y.L., and Z.W.; investigation: M.F., H.W., and Y.L.; resources: W.L., H.W., and Y.L.; data curation: Z.W., Y.L., Z.W., and Q.W.; writing—original draft preparation: X.X.; writing—review and editing: M.F., W.L., Q.W., Y.B.B., C.Y., and T.G.; visualization: X.X., M.F., W.L., Y.B.B., T.G., and C.Y.; supervision: M.F., W.L., H.W., and J.W.; project administration: M.F., W.L., H.W., and J.W.; funding acquisition: M.F., W.L., Q.W., and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported in part by the National Science and Technology Major Project (2025ZD1602303), the National Natural Science Foundation of China (U25A20433, 92567203, 42401521), the Joint Research Fund for Beijing Natural Science Foundation and Haidian Original Innovation (L232001), the Henan Key Research and Development Program (241111320700), the GuangDong Basic and Applied Basic Research Foundation (2024A1515011866, 2024A1515011480, 2025A1515011300), the Central Guidance on Local Science and Technology Development Fund of ShanXi Province (YDZJSX20231D005, YDZJSX2024B017), the Science and Technology Innovation Program of Xiongan New Area (2025XAGG0028), and the National Key Research and Development Program of China (2023YFF0905903).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Thomas, B.G. Review on modeling and simulation of continuous casting. Steel Res. Int. 2018, 89, 1700312.
  2. Patel, A.J.; Chaudhari, S.N. Recent advancements and critical insights in casting processes. Int. J. Sci. Res. Mech. Mater. Eng. 2023, 7, 5–18.
  3. Hibbeler, L.C.; Thomas, B.G. Mold slag entrainment mechanisms in continuous casting molds. Iron Steel Technol. 2013, 10, 121–136.
  4. Meng, X.; Luo, S.; Zhou, Y.; Wang, W.; Zhu, M. Control of instantaneous abnormal mold level fluctuation in slab continuous casting mold based on bidirectional long short-term memory model. Steel Res. Int. 2025, 96, 2400656.
  5. Meng, X.; Luo, S.; Xi, X.; Zhou, Y.; Wang, W.; Zhu, M. Characterizing and controlling abnormal periodic mold level fluctuations in a commercial slab continuous caster using big data. Met. Mater. Trans. B 2024, 55, 4150–4162.
  6. Wang, Z.; Wang, R.; Liu, J.; Yu, W.; Li, G.; Cui, H. Exploration of the causes of abnormal mold level fluctuation in thin slab continuous casting mold. J. Mater. Res. Technol. 2024, 33, 1460–1469.
  7. Lei, H.; Liu, J.; Tang, G.; Zhang, H.; Jiang, Z.; Lv, P. Deep insight into mold level fluctuation during casting different steel grades. JOM 2023, 75, 914–919.
  8. Wang, Z.; Shan, Q.; Cui, H.; Pan, H.; Lu, B.; Shi, X.; Wen, J. Characteristic analysis of mold level fluctuation during continuous casting of Ti-bearing IF steel. J. Mater. Res. Technol. 2024, 31, 1367–1378.
  9. Cho, S.-M.; Thomas, B.G.; Kim, S.-H. Effect of nozzle port angle on transient flow and surface slag behavior during continuous steel-slab casting. Met. Mater. Trans. B 2019, 50, 52–76.
  10. Zheng, F.; Chen, W.; Zhang, L. Effect of mold oscillation on multiphase flow and slag entrainment in a slab continuous casting mold. Met. Mater. Trans. B 2024, 55, 3784–3797.
  11. He, Y.; Zhou, H.; Zhang, B.; Guo, H.; Li, B.; Zhang, T.; Yang, K.; Li, Y. Prediction model of liquid level fluctuation in continuous casting mold based on GA-CNN. Met. Mater. Trans. B 2024, 55, 1414–1427.
  12. Su, W.; Lei, Z.; Yang, L.; Hu, Q. Mold-level prediction for continuous casting using VMD–SVR. Metals 2019, 9, 458.
  13. Tian, Y.; Zhou, H.; Wang, G.; Xu, L.; Qiu, S.; Zhu, R. Numerical modeling of transient flow characteristics on the top surface of a steel slab continuous casting strand using a large eddy simulation combined with volume of fluid model. Materials 2023, 16, 5665.
  14. Rong, W.; Qiu, L.; Liu, Z.; Li, B.; Liu, C.; Meng, X.; Luo, S.; Ren, B.; Lv, G.; Zhou, Y.; et al. Numerical study on characteristic positions and transition of flow patterns in a slab continuous casting mold. Steel Res. Int. 2024, 95, 2300714.
  15. Meng, X.; Luo, S.; Ren, B.; Lv, G.; Zhou, Y.; Wang, W.; Zhu, M. Effect of argon blowing on mold level in a commercial slab continuous caster. Met. Mater. Trans. B 2025, 56, 2411–2424.
  16. Liu, R.; Thomas, B.G.; Sengupta, J.; Chung, S.D.; Trinh, M. Measurements of molten steel surface velocity and effect of stopper rod movement on transient multiphase fluid flow in continuous casting. ISIJ Int. 2014, 54, 2314–2323.
  17. Cai, M.; Fu, M.; Li, W.; Wang, Q.; Chen, N.; Ma, Z.; Sun, L.; Zhang, R.; Wang, H.; Wang, J. The time–frequency analysis and prediction of mold level fluctuations in the continuous casting process. Metals 2025, 15, 1253.
  18. Huang, C.; Zhou, H.; Zhang, L.; Yang, W.; Zhang, J.; Ren, Y.; Chen, W. Effect of casting parameters on the flow pattern in a steel continuous casting slab mold. Steel Res. Int. 2022, 93, 2100350.
  19. Zhang, M.; Wang, Z.; Cui, H. Characteristic analysis of mold level fluctuation in continuous casting based on wavelet transform. Steel Res. Int. 2025, 96, 202500943.
  20. Li, Q.; Kong, Q.; Wen, G.; Tang, P.; Hou, Z. Effects of mold flux on abnormal mold-level fluctuation during slab continuous casting. Met. Mater. Trans. B 2025, 57, 1125–1141.
  21. Shi, J.-P.; Shang, X.-X.; Wang, Y.; Zhang, C.-J.; Zhu, L.-G. Fluctuation of steel–slag interface in flexible thin slab casting mold. J. Iron Steel Res. Int. 2025, 32, 1882–1900.
  22. Wang, Z.; Yu, W.; Lu, Y.; Cui, H. Investigation on the effect of bulging on mold level fluctuation in continuous casting of peritectic steel. Met. Mater. Trans. B 2025, 56, 3634–3649.
  23. Jiang, Z.-K.; Su, Z.-J.; Xu, C.-Q.; Chen, J.; He, J.-C. Abnormal mold level fluctuation during slab casting of peritectic steels. J. Iron Steel Res. Int. 2020, 27, 160–168.
  24. Wang, Z.; Liu, J.; Cui, H.; Sun, H.; Wang, Y. Effect of SEN asymmetric clogging on mold level fluctuation and mold slag distribution during continuous casting. Met. Mater. Trans. B 2024, 55, 2932–2947.
  25. Li, Y.; He, W.; Zhao, C.; Liu, J.; Yang, Z.; Zhao, Y.; Yang, J. Mathematical modeling of transient submerged entry nozzle clogging and its effect on flow field, bubble distribution and interface fluctuation in slab continuous casting mold. Metals 2024, 14, 742.
  26. Sun, Y.; Liu, Z.; Xiong, Y.; Yang, J.; Xu, G.; Li, B. Machine learning applications in continuous casting: A review. Metals 2025, 15, 1383.
  27. Wang, R.; Li, H.; Guerra, F.; Cathcart, C.; Chattopadhyay, K. Predicting quantitative indices for SEN clogging in continuous casting using long short-term memory time-series model. ISIJ Int. 2022, 62, 2311–2318.
  28. Yin, Z.; Fu, M.; Cai, M.; Xin, X.; Li, W.; Wang, J. The prediction of crystallizer liquid level fluctuations in continuous casting based on HECNN-LSTM model. In Information Processing and Network Provisioning; Kadoch, M., Cheriet, M., Qiu, X., Eds.; Springer Nature: Singapore, 2026; pp. 57–68.
  29. He, Y.; Zhang, B.; Zhou, H.; Zhang, T.; Wang, L.; Li, Y. Prediction method for liquid level fluctuation in continuous casting mold based on GA-RF. Contin. Cast. 2025, 44, 30–35.
  30. Meng, X.; Luo, S.; Zhou, Y.; Wang, W.; Zhu, M. Time–frequency characteristics and predictions of instantaneous abnormal level fluctuation in slab continuous casting mold. Met. Mater. Trans. B 2023, 54, 2426–2438.
  31. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; pp. 11106–11115.
  32. Su, Y.; Lei, Y. Hybrid EMD–SVR–GA algorithm for mold level prediction in continuous casting. Processes 2019, 7, 177.
  33. Wu, X.; Kang, H.; Yuan, S.; Jiang, W.; Gao, Q.; Mi, J. Anomaly detection of liquid level in mold during continuous casting by using forecasting and error generation. Appl. Sci. 2023, 13, 7457.
  34. Wang, R.; Li, H.; Guerra, F.; Cathcart, C.; Chattopadhyay, K. Development of quantitative indices and machine learning-based predictive models for SEN clogging. In Proceedings of the AISTech 2021, Iron & Steel Technology Conference, Nashville, TN, USA, 29 June–1 July 2021; p. 1892.
  35. Diniz, A.P.M.; Ciarelli, P.M.; Salles, E.O.T.; Coco, K.F. Long short-term memory neural networks for clogging detection in the submerged entry nozzle. Decis. Making Appl. Manag. Eng. 2022, 5, 154–168.
  36. Zhang, K.; Liu, J.; Cui, H.; Qian, X.; Deng, S. Effect of submerged entry nozzle (SEN) on fluid flow and surface fluctuation: Water model experimental study on SEN. J. Univ. Sci. Technol. Beijing (Eng. Sci.) 2018, 40, 638–645.
  37. Zong, N.; Jing, T.; Gebelin, J.-C. Machine learning techniques for the comprehensive analysis of the continuous casting processes: Slab defects. Ironmak. Steelmak. 2025, 52, 1–15.
  38. Norrena, J.; Louhenkilpi, S.; Visuri, V.; Alatarvas, T.; Bogdanoff, A.; Fabritius, T. Phenomenological prediction of defect formation in continuous casting of steel utilizing fundamental modeling and machine learning. Steel Res. Int. 2025, 97, 2074–2088.
  39. Gasparini, L.; Marko, L.; Landauer, J.; Kugi, A.; Fuchshumer, S.; Steinboeck, A. Optimal sensor placement for mold level in continuous casting. IFAC-PapersOnLine 2024, 58, 107–112.
  40. Wu, X.; Jiang, W.; Yuan, S.; Kang, H.; Gao, Q.; Mi, J. Automatic casting control method of continuous casting based on improved soft actor–critic algorithm. Metals 2023, 13, 820.
  41. Fawaz, H.I.; Lucas, B.; Forestier, G.; Pelletier, C.; Schmidt, D.F.; Weber, J.; Webb, G.I.; Idoumghar, L.; Muller, P.-A.; Petitjean, F. InceptionTime: Finding AlexNet for time series classification. Data Min. Knowl. Discov. 2020, 34, 1936–1962.
  42. Dempster, A.; Petitjean, F.; Webb, G.I. ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Min. Knowl. Discov. 2020, 34, 1454–1495.
  43. Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A transformer-based framework for multivariate time series representation learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual, 14–18 August 2021; pp. 2114–2124.
  44. Saito, T.; Rehmsmeier, M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 2015, 10, e0118432.
Figure 1. The overall framework for abnormal MLF prediction.
Figure 2. Procedure for constructing the abnormal fluctuation index from liquid level signals. (a) The mold level is normalized before being processed to extract (b) abnormal fluctuations, which are then (c) extended over time and summed to obtain (d) the abnormal fluctuation index. The normalized mold level is a dimensionless signal. Different colors in panels (b,c) correspond to different threshold values of k.
Figure 3. Performance comparison of Inception, HydraMultiRocket, TST, and Informer under different parameter configurations in terms of (a) mean ROC-AUC and (b) mean PR-AUC. Lines are used to illustrate performance trends across ordered parameter configurations.
Figure 4. Performance comparison of the model under different PSD configurations in terms of (a) mean ROC-AUC and (b) mean PR-AUC.
Figure 5. Feature ablation experiment for the Informer model in terms of (a) mean ROC-AUC and (b) mean PR-AUC. The x-axis shows different feature combinations, and lines are used to visualize performance trends without implying continuity.
Figure 6. (a) ROC curves and (b) PR curves of the best-performing model across different thresholds.
Figure 7. Evaluation metrics for each position in the model output matrix in terms of (a) ROC-AUC, (b) PR-AUC, (c) proportion of labels equal to 1, (d) accuracy, (e) precision, (f) recall, and (g) F1 score.
Figure 8. Target and predicted sequences.
Table 1. The best evaluation metrics for various models.
| Model | Mean ROC-AUC | Mean PR-AUC |
| --- | --- | --- |
| Inception | 0.755 | 0.328 |
| HydraMultiRocket | 0.712 | 0.282 |
| TST | 0.762 | 0.344 |
| Informer-based (ours) | 0.783 | 0.365 |
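For readers unfamiliar with the two metrics in Table 1, the following sketch shows one way a mean ROC-AUC and a mean PR-AUC could be computed over a multi-position output. It is an illustration under stated assumptions, not the evaluation code used in the paper: PR-AUC is approximated here by average precision, tied scores are not treated specially in that approximation, the mean is assumed to be an unweighted average over output positions, and the helper names `roc_auc`, `average_precision`, and `mean_over_positions` are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    # Count pairwise wins; a tie between a positive and a negative counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, used here as an approximation of PR-AUC."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


def mean_over_positions(metric, y_true, y_score):
    """Average a metric over output positions (columns); rows are samples."""
    columns = list(zip(zip(*y_true), zip(*y_score)))
    return sum(metric(list(t), list(s)) for t, s in columns) / len(columns)


if __name__ == "__main__":
    # Four samples, two output positions (toy values, not data from the paper).
    y_true = [[0, 1], [0, 0], [1, 0], [1, 1]]
    y_score = [[0.1, 0.9], [0.4, 0.2], [0.35, 0.3], [0.8, 0.7]]
    print(mean_over_positions(roc_auc, y_true, y_score))            # -> 0.875
    print(mean_over_positions(average_precision, y_true, y_score))
```

Because abnormal events are rare, PR-AUC values are expected to sit well below ROC-AUC values for the same model, which is consistent with the gap between the two columns of Table 1.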

Share and Cite

MDPI and ACS Style

Xin, X.; Fu, M.; Li, W.; Wang, H.; Wang, Q.; Lu, Y.; Wang, Z.; Bai, Y.B.; Gu, T.; Yu, C.; et al. Informer-Based Prediction of Mold Level Anomalies in Continuous Casting via Temporal and Frequency-Domain Features. Metals 2026, 16, 474. https://doi.org/10.3390/met16050474

AMA Style

Xin X, Fu M, Li W, Wang H, Wang Q, Lu Y, Wang Z, Bai YB, Gu T, Yu C, et al. Informer-Based Prediction of Mold Level Anomalies in Continuous Casting via Temporal and Frequency-Domain Features. Metals. 2026; 16(5):474. https://doi.org/10.3390/met16050474

Chicago/Turabian Style

Xin, Xin, Meixia Fu, Wei Li, Hongbing Wang, Qu Wang, Yifan Lu, Zhenqian Wang, Yuntian Brian Bai, Tao Gu, Changyuan Yu, et al. 2026. "Informer-Based Prediction of Mold Level Anomalies in Continuous Casting via Temporal and Frequency-Domain Features" Metals 16, no. 5: 474. https://doi.org/10.3390/met16050474

APA Style

Xin, X., Fu, M., Li, W., Wang, H., Wang, Q., Lu, Y., Wang, Z., Bai, Y. B., Gu, T., Yu, C., & Wang, J. (2026). Informer-Based Prediction of Mold Level Anomalies in Continuous Casting via Temporal and Frequency-Domain Features. Metals, 16(5), 474. https://doi.org/10.3390/met16050474