An Interpretable Time Series Forecasting Model for Predicting NOx Emission Concentration in Ferroalloy Electric Arc Furnace Plants

Abstract: Considering the pivotal role of ferroalloys in the steel industry and the escalating global emphasis on sustainability (e.g., zero emissions and carbon neutrality), the demand for ferroalloys is anticipated to increase. However, the electric arc furnace (EAF) of ferroalloy plants generates substantial amounts of nitrogen oxides (NOx) because of the high-temperature combustion processes. Despite the substantial contributions of many studies on NOx prediction from various industrial facilities, there is a lack of studies considering the environmental condition of the EAF in ferroalloy plants. Therefore, this study presents a deep learning model for predicting NOx emissions from ferroalloy plants, which can further provide guidelines for predicting NOx in industrial sites equipped with electric furnaces. In this study, we collected various historical data from the manufacturing execution system of electric furnaces and exhaust gas systems to develop a prediction model. Additionally, an interpretable artificial intelligence method was employed to track the effects of each variable on the NOx emissions. The proposed prediction model can provide decision support to reduce NOx emissions. Furthermore, the interpretation of the model contributes to a better understanding of the factors influencing NOx emissions and the development of effective strategies for emission reduction in ferroalloy EAF plants.


Introduction
The escalating global emphasis on sustainability, such as zero emissions, has led to changes in various industries, including the steel industry [1]. A wide range of policies, strategies, protocols, and interventions related to emission reductions for specific air pollutants have been implemented globally [2]. Particularly, in the Republic of Korea, owing to increasingly stringent environmental regulations, government agencies have installed sensors in stacks for telemonitoring and regulating factories that emit environmental pollutants [3]. Additionally, the demand for ferroalloy, which is an essential raw material for the steel industry, is anticipated to increase not only because of its importance in manufacturing steel but also because of evolving production technologies aimed at reducing emissions [1]. Ferroalloys are iron alloys with a high proportion of one or more elements, such as manganese (Mn), aluminum (Al), and silicon (Si), which enhance the characteristics of steel and cast iron or serve essential functions in the manufacturing process [4]. Although ferroalloy production using electric arc furnaces (EAFs) results in lower emissions than steel production using blast furnaces, the ferroalloy production process still generates a considerable amount of NOx emissions, a substantial portion of which originates from EAFs themselves [5]. As NOx is a major contributor to air pollution and a factor that affects human health [6], there has been an increasing focus on research to predict and reduce NOx emissions from various facilities [7–9]. However, despite the high demand for ferroalloys and the significant NOx emissions associated with their production, there is a lack of research on predicting NOx emissions from EAFs, which are the primary sources of NOx at production sites. In this study, we developed a deep learning-based time series prediction model to predict NOx emissions from electric furnaces. For this purpose, data were collected from electric furnace and exhaust gas equipment at a ferroalloy production site. Furthermore, to interpret the deep learning-based prediction model, we used interpretable artificial intelligence techniques to identify the variables that have a significant impact on NOx prediction in the electric furnace environment.
Predicting NOx emissions in EAFs can facilitate the reduction in emissions through both pre-management, by adjusting key operating variables in production facilities, and post-management, by enhancing denitrification facilities for efficient NOx removal from exhaust gases. However, predicting NOx emissions in EAFs is challenging, owing to the severe internal environment and complex combustion reactions [10,11]. EAFs produce various exhaust gases and particulate matter during high-temperature combustion. Additionally, the inside of the furnace chimney is exposed to a hot, humid environment containing a mixture of various gases. Owing to these conditions, it is difficult to accurately predict the NOx concentration within EAFs. There are two main approaches to predicting NOx emissions [9,12]. The first one is the mechanism-based calculation approach, which involves various parameters and empirical formulas for heat transfer, combustion, and turbulence [12–15]. However, this approach to NOx prediction requires various assumptions and time to simulate the combustion process and predict NOx emissions, making it challenging to model a combination of various factors inside EAFs [9,12,16].
The second approach is a data-driven method that establishes the relationships between NOx emissions and operational variables based on data [9,10]. Compared with the first approach, this data-driven approach does not need to solve complex equations [12]. In this regard, many studies have employed data-driven methods to predict NOx emissions by applying a deep belief network (DBN) [9], artificial neural network (ANN) [17], extreme learning machine (ELM) [8], and long short-term memory (LSTM) [7] to various facilities such as coal-fired boilers, cement precalcining kilns, and industrial waste incinerators [7,17,18]. Although these data-driven approaches have proven to be effective in predicting NOx emissions in diverse applications, applying them to the EAFs of ferroalloy production facilities has certain limitations. First, these approaches were designed without considering the specific environment of the EAFs in ferroalloy production facilities. EAFs have unique environmental conditions compared to other high-temperature industrial settings, including higher temperatures from electric arcs (commonly reaching 2000 °C) and reliance on electrical energy instead of fossil fuels [19,20]. Unlike the constant conditions found in other combustion processes, EAFs involve a dynamic production process that includes adding raw materials, removing slag, and adjusting alloy compositions even during operation. In addition, the absence of a data collection system to gather the necessary information from ferroalloy production systems presents another challenge. It remains unclear which data are essential for the accurate prediction of NOx emissions from EAFs. Therefore, further research is required to develop NOx emission prediction models that account for the unique conditions within EAFs. Second, many previous studies have employed machine learning-based models for NOx prediction, focusing only on predictive performance analysis. For successful collaboration between experts and machine learning technology, the key factor is interpretability [21]. Interpretability ensures that the behaviors and predictions of the model are understandable to humans, facilitating further application to EAFs. Accordingly, for NOx prediction and reduction, it is necessary to analyze the critical factors for NOx prediction and to understand the behaviors of the predictive models.
This study conducts a data-driven NOx emission prediction that is suitable for the exhaust gas generation mechanism in the EAFs of ferroalloy production facilities. Additionally, the study employs interpretable artificial intelligence (AI) algorithms to identify variables that contribute significantly to NOx emissions, thereby proposing an interpretable model that considers the characteristics of EAFs to predict NOx emissions. Therefore, this study provides guidance for constructing NOx prediction systems and data collection systems for EAFs and can also be utilized to support the installation and operation of denitrification equipment for NOx reduction by furnishing data on NOx emissions from EAFs. It also offers insights into the factors influencing these emissions, facilitating the environmental management and efficient control of NOx emissions during ferroalloy production.

Operation and Gas Exhaustion Process of the Electric Arc Furnace
EAFs are primarily used to produce ferroalloys. The ferroalloy production process consists of raw material transportation, raw material pretreatment, electric furnace melting, refining, and casting. In EAFs, raw materials such as iron scrap are melted and refined; subsequently, oxidizing slag is produced to remove impurities from the molten pool [20]. An EAF is a sealed structure in which raw materials are charged and electrodes are inserted after sealing with a cover. As illustrated in Figure 1, the melting of the raw materials begins with an arc discharge. Electrical energy is supplied through the graphite electrodes, creating a powerful electric arc between the electrodes and raw materials. This intense electric arc, with its strong voltage, serves as the primary heat source for melting the material. The internal temperature of the furnace is controlled by adjusting the position of the electrodes. During the melting process, EAFs emit thermal NOx and other gases because of the high temperatures generated, and NOx emission is predominantly concentrated in this melting process.
In this study, we collected data from EAFs and exhaust gas emission facilities at a ferroalloy production site in South Korea. Figure 2 illustrates the exhaust gas emissions generated in the EAFs considered in this study. By melting the raw materials, EAFs generate a significant amount of thermal NOx. Subsequently, the dust duct captures and collects the exhaust gases and fine particulates generated in the preceding processes. The exhaust gases are then directed to a semi-dry reactor (SDR), where water is injected to control the temperature of the exhaust gas. In the SDR, further treatment, such as desulfurization, occurs to remove additional pollutants from the exhaust before release.
The role of SDRs extends beyond temperature control to actively reducing the concentrations of various harmful substances in exhaust gas. Bag filters remove particulate pollutants (e.g., dust) but not gaseous pollutants (e.g., NOx and SOx). This phase primarily focuses on eliminating fine particulates from the gas stream. An induced draft fan (IDF) is operated to expel the gases to the outside through the stack using pressure. The IDF creates a suction effect that ensures the efficient and effective discharge of gases, thereby minimizing the emission of untreated or partially treated exhaust gases into the environment.
Mathematics 2024, 12, x FOR PEER REVIEW

Data-Driven NOx Emissions Prediction Research
Owing to the complex mechanism of NOx emissions from facilities involving high-temperature processes, such as coal-fired power plants, research has been conducted on predicting NOx emissions from exhaust gases (Table 1). One line of research predicts NOx emissions by utilizing computational fluid dynamics (CFD) simulations to generate data on flow, temperature, and chemical reactions within the furnace. Faravelli et al. [22] proposed a method for predicting NOx emissions from gas/oil boilers by utilizing CFD to obtain data on flow, temperature, and stoichiometry within the furnace. They simplified the conditions using a network of interconnected ideal reactors (perfectly stirred or plug flow reactors) to predict NOx emissions with a detailed kinetic scheme. Likewise, Lv et al. [23] utilized CFD simulations to generate 3D NOx spatial distribution data and applied extreme learning machine modeling for accurate predictions of NOx distribution in the furnace. Their study partitioned data based on NOx generation mechanisms to enhance model accuracy and provided a detailed approach for NOx prediction in furnace environments. However, a potential limitation of these studies is that using CFD effectively requires fluid dynamics experts who can handle diverse parameters and empirical formulas for heat transfer, combustion, and turbulence specific to each facility's environmental conditions. This complexity can pose challenges for implementation in areas with limited research, such as EAFs, due to the variability in environmental conditions across different facilities.
Compared to the challenges associated with CFD studies, research utilizing data-driven methods has been conducted to establish the relationship between operational variables and NOx generation, thereby enabling the prediction of NOx emissions from various facilities with less difficulty. Wang, Ma, Wang, Li, and Zhang [9] proposed a method for data acquisition and NOx emission prediction in coal-fired power plants using DBN-based models trained on historical operating data. Tang, Wang, Chai, Cao, Ouyang, and Li [8] proposed an autoencoder ELM model to predict NOx emission concentrations from a coal-fired boiler. In their study, an autoencoder was utilized to extract hidden features from the variables of operational data, and an ELM model was then applied to predict NOx emissions from the hidden features. Zhang, Wang, Shao, Duan, and Hou [17] utilized an ANN to predict NOx in cement precalcining kilns and a genetic algorithm to search for optimal operation parameters to achieve the lowest concentration of nitrogen oxide emissions. However, previous studies utilizing ANN-based models have certain limitations because they do not utilize the temporal dynamics of the operating variables in facilities, which can contribute to NOx emissions. As combustion in an EAF is a dynamic process, the working conditions of EAFs are correlated with historical NOx emissions. Given that manufacturing execution systems (MESs) and telemonitoring systems (TMSs) store dynamic time series data, previous time series data can be leveraged to develop prediction models. Safdarnejad et al. [24] developed a dynamic data-driven model for a coal-fired utility boiler to estimate NOx and CO emissions simultaneously, utilizing recurrent neural networks to capture time series characteristics of the data. Yang, Wang, and Li [7] focused on using LSTM networks to model the relationship between the operational parameters and NOx emissions in a 660 MW boiler. To enhance NOx emissions prediction in diesel engine transient environments, Shen et al. [25] proposed a prediction model based on a hybrid neural network architecture that combines the feature extraction capabilities of a convolutional neural network (CNN) with the time series prediction proficiency of LSTM networks. In addition to models considering the temporal dynamics of the operating variables in facilities, research has been conducted to tailor models to the characteristics and purposes of prediction in facilities or to enhance the performance of existing models. To improve the efficiency of the denitrification process in power plants, Wang, Peng, Cao, Zhou, Fan, Li, and Huang [12] proposed a modeling method using a random forest algorithm for the dimensionality reduction of input data and a lightweight CNN, which obtained satisfactory NOx predictive performance. A lightweight CNN is preferred over a high-performance CNN, which requires numerous parameters and floating-point operations. Lightweight CNNs offer efficient computation and reduced complexity, making them more suitable for real-time NOx emission prediction tasks in coal-fired boilers. Li et al. [25] presented a CNN-based model for the accurate prediction of NOx emissions from a coal-fired power plant boiler. An attention mechanism was integrated into the CNN-based model, with the attention module focusing on the interdependencies between channels in the input feature maps to capture important information in latent space.
Although previous studies have proposed data-driven NOx prediction methods for facilities with combustion systems, research on predicting NOx emissions from EAFs is still lacking. Consequently, data-acquisition systems tailored for NOx prediction in EAF environments are lacking. An EAF generates extremely high temperatures to melt raw materials, and owing to the characteristics of the molten pool during ferroalloy production, noise is generated when measuring the exhaust gases emitted during ferroalloy production. As the gas trapped beneath the slag layer in the molten pool and the collapse of charged raw materials can lead to sudden explosions and a rapid increase in NOx emissions, it is necessary to smooth the NOx emission values before utilizing them as training data for the prediction model. In addition, exhaust gases in ferroalloy production facilities move at high speeds in hot and humid environments. In such an environment, data collected by IoT sensors in a pipe may contain noise, owing to various factors. Thus, data preprocessing techniques are required to construct training data for the prediction model by smoothing the noise. To smooth out noise or outliers, a Kalman filter is used to estimate the current state from past measurements and correct outlier data based on the distribution of the given data.

Interpretable Prediction Models
Despite the contributions of previous studies to the prediction of NOx emissions in various combustion processes, an interesting yet unexplored angle still exists. In the case of deep-learning-based prediction models, numerous studies have focused on performance analysis, making it difficult to track the impact of input variables on NOx emissions. As deep-learning-based predictions rely solely on black-box models with undisclosed internal mechanisms, decision makers have experienced challenges in utilizing these predictive models [27].
Interpretable artificial intelligence methods are processes that provide interpretability in a form understandable by humans, based on the explainability of how a model works [28]. They can be classified based on the complexity of the model into post hoc and intrinsic approaches [29]. The intrinsic approach involves models that are naturally interpretable due to their simple structure (e.g., decision trees, linear SVMs). On the other hand, the post hoc approach is applied after the model has been trained, focusing on the analysis and interpretation of the model's behavior. LIME (local interpretable model-agnostic explanations) and SHAP (SHapley Additive exPlanations) are well-known post hoc methods, offering insights into how the model makes its predictions. Both methods are model-agnostic and can be utilized across various models. LIME focuses on local explanations, offering insights into the interpretation process for specific data points, but it has limitations in providing global interpretations and consistency in the contribution of input variables [30]. SHAP similarly explains the contributions of individual inputs to a prediction, but, applied across the entire dataset, it can also offer a broader analysis of model predictions, such as feature importance [31]. Consequently, it has been utilized across various domains for its comprehensive insights into model behavior [32,33].
Considering its ability to provide model-agnostic interpretations and both global and local explanations [21,31], this study utilizes SHAP to uncover the inner workings of a machine learning model for time series data that predicts NOx emissions from EAFs. This study constructs a model that reflects the relationships between input variables over time and employs preprocessing techniques specific to the features of EAFs to build the training dataset. Additionally, interpretable AI is utilized to analyze the impact of the input variables on NOx emission predictions.

Kalman Filter-Based Smoothing Algorithm
Owing to the extreme environment in the chimney, the NOx data collected by the sensors often contain noise. To address this issue, a Kalman-filter-based smoothing algorithm is introduced to mitigate sensor noise, remove outliers, and enhance the quality of the data collected to train the prediction model [34]. Kalman filtering is a method for estimating the state of a dynamic system [35,36]. It predicts the next state based on the current state and subsequently updates the predicted state using new measurements. The mathematical model uses the following notation: $X_k$ is the state vector representing the system's state at time $k$; $Y_k$ is the measurement at time $k$; $Q$ is the process noise variance; $R$ is the measurement noise variance; $P_k$ is the error covariance matrix at time $k$; and $K_k$ is the Kalman gain at time $k$.
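For reference, a common scalar form of the Kalman recursion that is consistent with this notation can be written as below. It assumes a random-walk state model with unit state-transition and observation coefficients, which is an assumption made here for illustration rather than a detail stated in the text; the superscript minus marks the prior (predicted) quantities, and the hat marks estimates.

```latex
\[
\begin{aligned}
\text{Prediction:}\quad & \hat{X}_k^- = \hat{X}_{k-1}, \qquad P_k^- = P_{k-1} + Q \\
\text{Gain:}\quad & K_k = \frac{P_k^-}{P_k^- + R} \\
\text{Update:}\quad & \hat{X}_k = \hat{X}_k^- + K_k\,\bigl(Y_k - \hat{X}_k^-\bigr), \qquad P_k = (1 - K_k)\,P_k^-
\end{aligned}
\]
```

In this form, $K_k$ approaches 1 when $R$ is small relative to $P_k^-$ (the measurement is trusted) and approaches 0 when $R$ is large (the prediction is trusted).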
The Kalman gain adjusts the confidence between the current prediction and the observed data, thereby determining the optimal state correction. A higher Kalman gain places more trust in the observed data and less in the prediction; by balancing the two, the Kalman filter estimates the system state more accurately. The state variables of the system are estimated using measured data. The measurement data sequence is used as the input to estimate the state of the system, and the Kalman filter-based smoothing algorithm (Algorithm 1) is then applied to it.
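The smoothing step can be illustrated with a minimal Python sketch. It is not the algorithm used in this study: it assumes a scalar random-walk state model, and the function name `kalman_smooth` and the default values of `q`, `r`, and `p0` are hypothetical choices made only for the example.

```python
import numpy as np


def kalman_smooth(y, q=1e-3, r=1.0, x0=None, p0=1.0):
    """Smooth a 1-D measurement sequence with a scalar Kalman filter.

    Assumption (illustrative only): the state follows a random walk,
    so the predicted state equals the previous estimate.
    q: process noise variance (Q), r: measurement noise variance (R).
    """
    y = np.asarray(y, dtype=float)
    x = y[0] if x0 is None else float(x0)  # initial state estimate
    p = float(p0)                          # initial error covariance
    out = np.empty_like(y)
    for k, yk in enumerate(y):
        # Prediction step: carry the state forward, inflate the covariance.
        x_prior = x
        p_prior = p + q
        # Kalman gain: weight given to the new measurement.
        gain = p_prior / (p_prior + r)
        # Update step: correct the prediction with the measurement residual.
        x = x_prior + gain * (yk - x_prior)
        p = (1.0 - gain) * p_prior
        out[k] = x
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = 100.0 + rng.normal(0.0, 5.0, size=200)  # synthetic noisy signal
    smoothed = kalman_smooth(raw, q=1e-3, r=25.0)
    print(raw[:3], smoothed[:3])
```

A larger `r` relative to `q` lowers the gain and yields a smoother output; a sudden spike in the input is only partially passed through, which is the behavior needed to damp outliers.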

Long Short-Term Memory Network
NOx emissions in EAFs represent a time series problem because of the relationship between the past operating conditions and the current state. Since NOx emission during combustion in an EAF is a non-linear and complex process [15], both the operating variables and temporal factors should be considered. Given the time series nature of NOx emissions in EAFs, an LSTM neural network-based model is adopted (Figure 3). Owing to the ability of the LSTM network to remember long-term dependencies, it can capture patterns in emission data [7], and it has been increasingly utilized in various time series prediction domains [7,34,37]. LSTM networks employ a unique architecture that uses structures known as gates to regulate a value called the cell state ($C$). The cell state acts as the memory for the network, which is crucial for retaining and carrying relevant information throughout the data sequence. This gating mechanism allows the network to selectively retain or discard information, thereby enhancing its efficiency in analyzing time series data. The forget gate uses a sigmoid function to assess the previous output ($h_{t-1}$) and the current input ($x_t$), determining which past information to retain or discard from the cell state. The input gate updates the cell state $C_t$. It employs a sigmoid function to identify which elements of the current input are significant and a hyperbolic tangent function to generate a vector of new candidate values, $\tilde{C}_t$. These elements are integrated to update $C_t$ with essential new information. The output gate determines the final output $h_t$ by passing the cell state $C_t$ through a hyperbolic tangent function and then multiplying the result by the output of a sigmoid function. This selectively updates $h_t$ with the relevant information from $C_t$. In the standard LSTM formulation, the gates operate according to the following formulas:

The forget gate:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$

The input gate:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t \otimes C_{t-1} + i_t \otimes \tilde{C}_t$$

The output gate:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \otimes \tanh(C_t)$$

$C_t$ denotes the state of the LSTM cell at time $t$, and $h_t$ denotes the output of the unit at time $t$. $W$ denotes the weight parameter matrices, and $b$ denotes the corresponding bias vectors. $f_t$, $i_t$, and $o_t$ denote the forget, input, and output gate vectors at time $t$. $\sigma$ is the sigmoid function, and $\otimes$ represents element-wise multiplication. When applied to EAFs, the LSTM network's strength in learning sequences of features [7,25,34,37] can improve the accuracy of predicting the NOx emissions associated with the operation of these furnaces.
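To make the gate mechanics concrete, the following NumPy sketch runs one standard LSTM cell over a short sequence. It is an illustrative forward pass only, not the trained model of this study; the helper names (`lstm_step`, `init_params`, `run_sequence`), the hidden size of 8, and the random initialization are assumptions introduced for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; each W[g] has shape (hidden, hidden + n_features)."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell values
    c_t = f_t * c_prev + i_t * c_tilde       # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                 # new hidden output
    return h_t, c_t


def init_params(n_features, hidden, seed=0):
    """Small random weights and zero biases (hypothetical initialization)."""
    rng = np.random.default_rng(seed)
    W = {g: rng.normal(0.0, 0.1, size=(hidden, hidden + n_features)) for g in "fico"}
    b = {g: np.zeros(hidden) for g in "fico"}
    return W, b


def run_sequence(X, W, b):
    """Feed a (seq_len, n_features) array through the cell; return the last h_t."""
    hidden = b["f"].shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in X:
        h, c = lstm_step(x_t, h, c, W, b)
    return h


if __name__ == "__main__":
    W, b = init_params(n_features=19, hidden=8)
    X = np.random.default_rng(1).normal(size=(12, 19))  # 12 steps, 19 variables
    print(run_sequence(X, W, b))
```

In practice a deep learning framework would be used for training; the sketch only shows how the forget, input, and output gates combine to update $C_t$ and $h_t$ at each step.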

Delay Time Determination
At Korean ferroalloy production sites, TMSs are commonly used to measure the NOx concentrations in EAFs [3].Throughout the processes of NOx generation, detection, and control, numerous parameters are monitored using the MES, as depicted in Figure 2.However, these parameters are not measured simultaneously by the sensors, which leads to inherent delays in data acquisition.In addition, the combustion processes in EAFs involve complex reactions that occur over time and can influence NOx emissions.Given that changes in the variables within the process do not immediately affect the NOx emissions, it is necessary to select an appropriate delay time between the variables.This helps determine the suitable length of the sequence to be input into the LSTM model, which is suitable for processing and predicting events with intervals and delays in a time series [38].The delay time selection method based on mutual information (MI) focuses on identifying the most effective sequence length from the operation variables to predict NOx emissions.Tang, Wang, Chai, Cao, Ouyang, and Li [8] determined the delay time between each feature and NOx emission concentration using the MI method.This is achieved by maximizing the combined MI between the input features and target variable.MI is an information theory measure that quantifies the amount of information obtained from one random variable by observing another [39].MI is frequently used to evaluate the dependence or correlation between variables, capturing insights that traditional regression analyses may not reveal.Here, MI serves as a metric for measuring the extent to which one variable informs another, thereby indicating their level of interdependence.X = [x 1 , x 2 , . .., x n ], and n is the number of samples in dataset X. 
H(X) represents the information entropy of the random variable X, and the probability of $x_i$ is $p(x_i)$. H(X, Y) is the joint entropy of X and Y. The probability distributions of x and y are p(x) and p(y), and p(x, y) is their joint distribution. The degree of correlation between the two random variables can be expressed by the MI as follows [40]:

$$I(X; Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$$

To determine the delay time, the delay is varied starting from one step, and the time step that yields the highest MI is selected. By analyzing the MI between the variables and the NOx emission concentration, it is possible to determine the maximum feasible delay time for all input variables.
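The delay selection procedure can be sketched as follows. This is an illustrative implementation under stated assumptions, not the estimator used in the study. MI is estimated with a simple equal-width histogram (8 bins is an arbitrary choice), the helper names `mutual_information` and `best_delay` are hypothetical, and the data are synthetic with a known delay of three steps.

```python
import math
import random
from collections import Counter

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X; Y) in nats (equal-width bins, an assumption)."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0          # avoid zero width for constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]
    xb, yb = discretize(x), discretize(y)
    n = len(xb)
    pxy, px, py = Counter(zip(xb, yb)), Counter(xb), Counter(yb)
    # sum over observed cells of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def best_delay(feature, target, max_delay=6):
    """Return the delay (in steps) with the highest MI, plus all scores."""
    scores = {}
    for d in range(1, max_delay + 1):
        # pair feature[t - d] with target[t]
        scores[d] = mutual_information(feature[:-d], target[d:])
    return max(scores, key=scores.get), scores

# Synthetic check: the target follows the feature with a 3-step lag plus noise.
random.seed(0)
n = 2000
feature = [random.gauss(0.0, 1.0) for _ in range(n)]
target = [(feature[t - 3] if t >= 3 else 0.0) + random.gauss(0.0, 0.1)
          for t in range(n)]
delay, scores = best_delay(feature, target)
```

With 5 min steps, a `max_delay` of 6 corresponds to the 30 min upper bound discussed in the results.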

NOx Emission Prediction Model Development
This study develops a model to predict future NOx emissions using a sequence of data comprising 19 variables, including NOx emissions. To capture the trend of previously emitted NOx levels, past NOx emissions are also used as a predictor variable. The performance of the NOx prediction model is assessed using quantitative performance evaluation metrics. The mean absolute percentage error (MAPE) measures the average percentage error between the predicted and actual values. The R-squared ($R^2$) score, or the coefficient of determination, indicates how well the predicted values fit the actual data, with a score of 1 representing a perfect fit. The mean squared error (MSE) is the average of the squared errors and therefore penalizes large errors more heavily. The mean absolute error (MAE) measures the average magnitude of the errors between the predicted and actual values without considering their direction.
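The four metrics can be computed as in the sketch below, which uses their standard definitions in plain Python. The function name `regression_metrics` and the sample values are hypothetical and serve only as a worked example.

```python
def regression_metrics(actual, predicted):
    """Return MAPE (%), R2, MSE, and MAE using the standard definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE is undefined when an actual value is zero
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"MAPE": mape, "R2": r2, "MSE": mse, "MAE": mae}

# Hypothetical NOx values, for illustration only.
actual = [100.0, 120.0, 80.0, 110.0]
predicted = [90.0, 126.0, 84.0, 110.0]
metrics = regression_metrics(actual, predicted)
```

For these values, the errors are 10, -6, -4, and 0, which gives MAE = 5, MSE = 38, and MAPE = 5%.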

Interpretation of the NOx Emissions Prediction
Although machine-learning-based models have been adopted in various domains, the black-box nature that enables their powerful predictions is also a key impediment: AI-based systems often lack interpretability and therefore require interpretable machine learning [27]. To address the lack of interpretability of complex and nonlinear machine-learning-based models, post hoc interpretation employs a model-agnostic method to explain how certain features contribute to the predictions and to the model's behavior [21]. Among the various interpretation methods, SHAP is a widely used framework for interpreting the predictions of machine learning models based on the Shapley value of the conditional expectation of a model [41,42].
SHAP evaluates the feature importance using additive feature attribution methods, as illustrated in Equation (22).

$$g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i \quad (22)$$

Let f be the original predictive model to be explained and g be the explanation model. Here, $z' \in \{0, 1\}^M$ is a coalition vector that indicates whether the ith feature is present (=1) or absent (=0), M is the number of features, $\phi_i \in \mathbb{R}$ is the importance value of the ith feature, and $\phi_0$ is the baseline outcome without any feature. Specifically, SHAP identifies the importance of each feature as the change in the expected model prediction when conditioning on that feature and explains how to move from the base value E[f(z)] to the current output f(x). SHAP averages the $\phi_i$ values across all possible orderings. Hence, when defining $f_x(S) = E[f(x) \mid x_S]$ for a subset of features S, the SHAP value $\phi_i$ is expressed as in Equation (23).

$$\phi_i = \sum_{S \subseteq N \setminus \{x_i\}} \frac{|S|!\,(M - |S| - 1)!}{M!} \left[ f_x(S \cup \{x_i\}) - f_x(S) \right] \quad (23)$$

where N is the set of all M features, and $f_x(S \cup \{x_i\})$ and $f_x(S)$ are the model predictions with and without the ith feature. SHAP is an additive feature attribution method in which $\phi_0$ equals $f_x(\emptyset)$, the baseline prediction with no features. The original model's prediction for each sample is equal to the sum of the baseline and all the feature SHAP values. Thus, the SHAP values indicate the contribution of each feature to the predictions of the model. Calculating the exact SHAP values is challenging because every possible feature subset must be evaluated, resulting in exponential computational complexity [21]. Therefore, we utilized deep SHAP, a method that aggregates SHAP values calculated for individual network components to derive SHAP values for the entire network [42,43]. Using deep SHAP, we obtained the SHAP values for each feature. The absolute SHAP value of the ith feature at the jth time step is expressed as $|\phi_{i,j}|$, and the SHAP value of the ith feature, $\phi_i$, is the average of $\phi_{i,j}$ over the time steps.
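The exact Shapley computation and its exponential cost can be illustrated with a brute-force sketch on a toy model. This is not deep SHAP and not the study's code. The toy function `f`, the input `x`, and the all-zero `baseline` are hypothetical. Replacing absent features with a single baseline value is a simplifying assumption that stands in for the conditional expectation $f_x(S)$.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn takes a frozenset of present feature indices."""
    phi = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # weight |S|! (M - |S| - 1)! / M! for subsets of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi.append(total)
    return phi

# Toy model with an interaction term (hypothetical).
def f(v):
    return 2.0 * v[0] + v[1] * v[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

def value_fn(present):
    # features outside `present` are replaced by their baseline value
    return f([x[j] if j in present else baseline[j] for j in range(len(x))])

phi = shapley_values(value_fn, len(x))
phi_0 = value_fn(frozenset())
```

Here the interaction term is split equally between the second and third features, and `phi_0 + sum(phi)` recovers `f(x)`, which is the additivity property described above. The nested loops visit every subset, so the cost doubles with each added feature.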

Data Preparation
Data were collected from the MES and TMS of EAFs. In total, 18,834 data points were collected from 1 May 2023 to 7 July 2023, of which 17,422 were used for training and 1412 for validation. A further 1412 data points collected from 12 July to 17 July were used as the test dataset. Because deep learning models demand numerous variables and substantial data, long-term observation and data collection are essential. However, sensor data can contain gaps, and data may not be collected at certain times owing to operational schedules. Therefore, for research purposes, it is crucial to collect long-term data without gaps across many variables.
To build the training dataset for model learning, the NOx emission measurement data were smoothed using a Kalman filter-based smoothing algorithm. The measurement variance (R) expresses how noisy the observed data are assumed to be. Increasing R makes the Kalman filter rely less on the observations and more on the predicted values, so the resulting curve is smoother and follows the volatility of the observed data less closely. In this study, R was set to $10^2$. Increasing the process noise variance (Q) represents greater uncertainty in the system. In this case, the Kalman filter considers the predictions to be more uncertain; consequently, the estimates track the observations more closely, and the curve retains higher volatility. In this study, Q was set to $5^2$. The initial error covariance (P) affects the initial state prediction of the Kalman filter. Increasing P increases the uncertainty of the initial prediction, resulting in a larger initial prediction error. In this study, the hyperparameters R, P, and Q were selected by trial and error. The sum of R and Q was set so as not to exceed the actual variance of NOx, which is $17.75^2$. To smooth the fluctuations in the graph, R was kept larger than Q. P was set to 0.7, based on the difference between the initial measurements, which was approximately 1.3. The smoothed NOx data are shown in Figure 4. Only key examples are illustrated, as visualizing all data points would obscure this effect.
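The roles of R, Q, and P can be seen in a minimal scalar Kalman filter. The sketch below assumes a random-walk state model and runs only the forward filtering pass. The study's smoothing algorithm is not specified in this detail, so both choices are assumptions, as are the function name and the synthetic measurements.

```python
import random

def kalman_filter_1d(measurements, R=10.0 ** 2, Q=5.0 ** 2, P0=0.7):
    """Scalar Kalman filter with a random-walk state model (an assumption)."""
    x = measurements[0]      # initial state estimate
    P = P0                   # initial error covariance
    estimates = []
    for z in measurements:
        P = P + Q                    # predict: state unchanged, uncertainty grows by Q
        K = P / (P + R)              # Kalman gain: small when R is large
        x = x + K * (z - x)          # update toward the measurement
        P = (1.0 - K) * P
        estimates.append(x)
    return estimates

def roughness(series):
    """Sum of absolute step-to-step changes; lower means smoother."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))

# Synthetic noisy signal around 100 (hypothetical values, standard deviation 17.75).
random.seed(1)
noisy = [100.0 + random.gauss(0.0, 17.75) for _ in range(500)]
smooth = kalman_filter_1d(noisy)                 # R = 10^2, Q = 5^2, P = 0.7
less_smooth = kalman_filter_1d(noisy, R=1.0)     # small R: follows the data closely
```

Comparing `roughness(smooth)` with `roughness(less_smooth)` shows that a larger R yields a smoother curve, which is why R is kept larger than Q when the goal is to suppress fluctuations.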

NOx Emission Prediction
A comparison of the MI between NOx emissions and variables was conducted to determine the appropriate sequence length for the prediction model input. As shown in Table 2, each variable had a range of delay times from 1 to 6 steps, and each step was 5 min. This design, resulting in a maximum delay time of 30 min, was influenced by regulatory standards mandating emissions monitoring over 30 min intervals in Korea.
To capture the changes in each variable over time, we selected six steps as inputs for the prediction model. The number of units was chosen from the set [64, 128, 256, 512], and the numbers of LSTM and dense layers were varied to identify the configuration that yielded the highest performance. The output of an LSTM layer is a high-dimensional feature vector that cannot be directly used to predict a single NOx emission value. Therefore, a dense layer was employed, wherein each input node was connected to every output node. This setup transformed the LSTM layer's output into a single predicted NOx emission value. After analyzing the performance evaluation metrics in the pilot experiments, two LSTM layers and one dense layer were used (Figure 5), and the optimal units for each layer were determined as follows: LSTM1 (128), LSTM2 (64), and dense layer (64).
As illustrated in Figure 6, the red line representing the predicted values from the model closely followed the dotted line representing the actual NOx emissions. The alignment of these two lines suggests that the model can effectively predict NOx emissions. Figure 7 shows a scatter plot of the model's predictions on the test data, where each dot represents an individual prediction against the actual value. The linear fit line indicates the trajectory of the predicted values, and the perfect prediction line in dashed red represents the ideal points at which the predicted values would match the actual values. The 95% prediction band indicates the area in which 95% of the predicted values lie, thus demonstrating the consistency of the model. A narrow band signifies concentrated, consistent predictions, whereas a wide band indicates greater uncertainty and more dispersed predictions.
The performance of the prediction model is shown in Table 3, and a comparison analysis was conducted to observe the effects of including previous NOx emissions and temporal factors. This analysis revealed that incorporating the previous NOx emissions and temporal factors yielded better results, as reflected by the improved performance metrics. The 'Model without NOx', which did not utilize the previous NOx emission values, performed worse, indicating that incorporating past NOx emissions data is indeed valuable. The 'Model with only NOx' showed satisfactory performance, which suggests that including NOx as a feature is crucial. 'Linear Regression', 'Deep Neural Network (DNN)', 'Gradient Boosting Regression', and 'Random Forest Regression' employed the same variables as the proposed model. However, owing to the nature of these models, they did not incorporate the temporal aspect. Table 3 thus demonstrates the effectiveness of using a model capable of reflecting temporal elements and leveraging previous NOx emissions data.
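The six-step input sequences described above can be assembled with a sliding window, as in the sketch below. The one-step-ahead target, the function name `make_windows`, and the tiny two-variable example are assumptions for illustration; in the study, each row would hold the 19 variables sampled every 5 min.

```python
def make_windows(rows, target_index, steps=6):
    """Build (sequence, next-value) pairs from time-ordered rows.

    rows: list of per-time-step variable lists.
    target_index: column of the target (NOx) within each row.
    The one-step-ahead horizon is an assumption.
    """
    X, y = [], []
    for t in range(steps, len(rows)):
        X.append(rows[t - steps:t])        # the previous `steps` rows, all variables
        y.append(rows[t][target_index])    # target value at the next step
    return X, y

# Tiny hypothetical series: column 0 is a process variable, column 1 is NOx.
rows = [[float(t), 10.0 * t] for t in range(10)]
X, y = make_windows(rows, target_index=1)
```

Each element of `X` has shape (6, number of variables), which matches the three-dimensional input that recurrent layers typically expect. Because the target column is also part of each window, the previous NOx values enter the model as a feature.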

Interpretation of the NOx Emission Prediction
The SHAP algorithm was applied to the constructed model to calculate the importance of each variable over time. Specifically, SHAP assigns an importance value to each feature for each prediction, based on additive feature attribution methods that satisfy a set of desirable properties. From the test dataset, 1000 data points were randomly selected to derive the SHAP values. The results of the SHAP analysis provide information on how the variables influence the model's predictions but do not directly indicate causality. Therefore, it is important to be aware of this limitation when interpreting the results obtained from SHAP analyses. The average absolute SHAP values for each variable were calculated and plotted to visually represent the impact of these variables on the NOx prediction at different time points (Figure 8). The colors serve only to distinguish between variables and are unrelated to whether a value is better or worse. Based on the SHAP analysis, the dust duct temperature, the SDR inlet temperature (measured before the gas passes through the SDR device), and the NOx emissions at the previous time steps contributed to the predictions. In the semi-dry reactor (SDR), liquid is sprayed into the exhaust gas to lower its temperature. Indeed, a noticeable difference in area between the SDR inlet temperature and the SDR outlet temperature can be observed in Figure 8. This suggests that the contributions of the SDR inlet temperature and the dust duct temperature, both of which relate to the temperature of the exhaust gas before it passes through the SDR, may be linked to the actual NOx emissions.
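The aggregation behind a plot such as Figure 8 can be sketched as follows, assuming the SHAP values are available as a nested list indexed by sample, time step, and feature. That layout, the function name `mean_abs_shap`, and the small numbers are assumptions for illustration and not output from the study's model.

```python
def mean_abs_shap(shap_values):
    """shap_values[n][j][i]: SHAP value of feature i at time step j for sample n.

    Returns the mean |SHAP| per (time step, feature) and the per-feature
    average over time steps.
    """
    n_samples = len(shap_values)
    n_steps = len(shap_values[0])
    n_features = len(shap_values[0][0])
    per_step = [[sum(abs(s[j][i]) for s in shap_values) / n_samples
                 for i in range(n_features)]
                for j in range(n_steps)]
    per_feature = [sum(per_step[j][i] for j in range(n_steps)) / n_steps
                   for i in range(n_features)]
    return per_step, per_feature

# Two samples, two time steps, two features (hypothetical numbers).
shap_values = [
    [[1.0, -2.0], [0.5, 0.0]],
    [[-3.0, 2.0], [1.5, -1.0]],
]
per_step, per_feature = mean_abs_shap(shap_values)
```

Taking absolute values before averaging prevents positive and negative contributions from canceling, so the result reflects the magnitude of each variable's influence rather than its direction.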
Summary and bar plots were employed to illustrate how the input features contributed to the predicted output values (Figures 9 and 10). Summary plots allow us to understand the global trend of the SHAP values of a feature. Specifically, the summary plots show the distribution of SHAP values for each feature. Each point represents the SHAP value of the feature for an individual prediction. Points to the right indicate a positive impact on the model output, whereas points to the left indicate a negative impact. Red points represent "high" NOx emissions, whereas blue points represent "low" NOx emissions. The bar plots represent the importance of the features, which decreases from top to bottom; the bar lengths show the average SHAP value of each feature, with longer bars indicating more important features. Figure 8 depicts how different variables affect NOx predictions across time. Figure 9 examines the variables' impact on lower NOx emission data points, whereas Figure 10 focuses on higher emission points. Thus, while Figure 8 offers a global view of variable impacts over time, Figures 9 and 10 provide more local insights into their effects at particular emission levels. To derive the SHAP values for both low and high NOx emission levels, we selected 200 data points for each category from the test dataset. The first 200 data points were designated to represent the low NOx emission level, while data points from the 1000th to the 1200th position were chosen to represent the high NOx emission level. Figure 9a shows the summary plots, and Figure 9b shows the bar plots when the NOx emissions are low. Features such as the induced draft fan power, induced draft fan inlet pressure, and bag filter differential pressure were identified as important when NOx emissions were low.
Figure 9b illustrates the variables with high contributions at points of low NOx emissions. The exhaust gas facilities maintain pressure to discharge the exhaust gas through the chimney. Because variables such as the induced draft fan power, induced draft fan inlet pressure, and bag filter differential pressure have high contributions, it appears that at points of low NOx emissions, the internal pressure of the exhaust gas facilities has a greater influence than the temperature or operational variables. Figure 10a shows the summary plots, and Figure 10b shows the bar plots when the NOx emissions are high. Features such as exhaust NOx, bag filter differential pressure, and bag filter inlet pressure were identified as important when the NOx emissions were high. As can be seen in Figure 10b, the NOx emissions from previous points have a very high impact. Therefore, it can be inferred that emissions exhibit some inertia once they reach a particular level. Figure 11 presents the actual temperature of the dust duct at data points where NOx levels are low and high. A comparison of the two segments in Figure 11 shows about a twofold difference in the temperature of the dust duct, which collects the gases emitted from the EAF. Taken together, Figures 8 and 11 suggest a possible correlation between NOx emissions and the temperature in the exhaust gas system. However, a direct comparison of the SHAP values between low and high emission levels may not be an entirely balanced analysis because the dataset contains more samples at high NOx emission levels. Despite this imbalance, the figures reveal the relationships between the features and the target, offering insights into the variables' impacts on low or high NOx emissions.

Discussion and Conclusions
This study proposes a model for predicting NOx emissions suitable for the EAFs of ferroalloy production sites. A Kalman-filter-based smoothing algorithm was used to denoise the NOx emission data from the EAFs and construct the training data. The study presented an interpretable model using variables collectable from EAFs at ferroalloy production sites and identified the key variables influencing the predictions through explainable AI. The NOx emission prediction model employs real-time data collected from the EAFs of a ferroalloy production workplace, thereby offering insights for practitioners aiming to establish a real-time prediction system with data collection and NOx prediction capability. With environmental regulations becoming more stringent, practitioners in related industries need to prepare for these changes, and this study can serve as a basis for proactive adaptation in ferroalloy production.
This study developed an interpretable model for predicting NOx emissions in EAFs by adopting LSTM and used explainable AI methods to identify, from the collected data, the variables with a significant impact on NOx emission predictions. This research can therefore provide guidance for building a NOx prediction system in EAFs, and it hints at ways to reduce NOx emissions at ferroalloy production sites through NOx prediction. For practical applications, NOx prediction can be implemented in real-world settings, with potential expansion to both chimney and internal exhaust gas emissions. However, the key to effective NOx emission prediction lies in the ability to collect data. Real-time data transmission from manufacturing and exhaust gas facilities to systems capable of immediate data management and collection is essential. From the perspective of building a NOx emission prediction system, this study can be helpful in establishing a prediction system for EAFs at ferroalloy production sites where a data collection system has not yet been implemented. This study outlines the collected data, key variables, and data collection locations, offering guidance for workplaces looking to initiate data collection and management for NOx prediction. Many EAFs in ferroalloy production face challenges in establishing a system for collecting real-time and historical process data and observational data from chimneys. Moreover, these facilities need to identify, among the collectable data, the specific data required for accurate real-time NOx emission prediction. Because prior research on predicting NOx emissions from EAFs is limited, it is necessary to identify the data that can be collected and that are essential for prediction in EAFs at ferroalloy production sites.
Regarding potential impacts, this research can assist ferroalloy plant operators planning to reduce NOx emissions. NOx prediction can significantly contribute to NOx reduction efforts in both pre- and post-management. For pre-management, the key operating variables identified during the NOx prediction process can be applied to the operating systems of production facilities to adjust those variables and reduce NOx emissions. In this study, SHAP analysis was used to identify the influential variables when the NOx emission levels were high and low. However, the variables with high importance values were measurements, whereas the actual operational variables, such as the depth of the electrode bars and power usage, showed low importance. If future research develops a high-performance predictive model based on operational variables, it will be possible to identify combinations of operational variables that reduce NOx emissions using an interpretable method. In post-management, NOx prediction can contribute to exhaust systems using selective catalytic reduction (SCR) facilities. Denitrification facilities (e.g., SCR) remove NOx from exhaust gases through chemical reactions, and the rate of NOx removal varies depending on the amount of ammonia used as a reducing agent. Excessive ammonia injection can cause ammonia slip, leading to potential equipment failure and reduced dust collection efficiency, whereas too little ammonia lowers the NOx removal rate. Therefore, a system that can adjust the amount of ammonia injection by predicting NOx emissions in real time is required. Despite these contributions, further studies are required. First, the study could be applied to various EAF environments as the types, variables, and specifications of EAFs

Figure 1. Schematic of an electric arc furnace.

Figure 2. Schematic of the NOx emission process from the electric arc furnace to the chimney.

Figure 4. The denoised results of the NOx data using the Kalman filter-based smoothing algorithm.

Figure 5. Schematic of the proposed NOx emissions prediction model.

Figure 6. Comparison of predicted and actual NOx emission.

Figure 7. Scatter plots of the prediction model on the test set.

Figure 8. The absolute SHAP value of each variable for time-wise steps.

Figure 9. Summary and bar plots with low NOx emissions.

Figure 10. Summary and bar plots with high NOx emissions.

Figure 11. Comparison of dust duct temperature at different NOx emission levels.

Table 1. Prior NOx prediction studies in facilities with combustion process.

Table 2. Highest MI according to the delay time of each feature.

Table 3. Impacts of incorporating temporal factors or previous time-step NOx emissions on the performance (bold: indicates the best model).