Predicting the Electrical Energy Consumption of Electric Arc Furnaces Using Statistical Modeling

Abstract: Statistical modeling, also known as machine learning, has gained increased attention in part due to the Industry 4.0 development. However, a review of the statistical models within the scope of steel processes has not previously been conducted. This paper reviews available statistical models in the literature predicting the Electrical Energy (EE) consumption of the Electric Arc Furnace (EAF). The aim was to structure published data and to bring clarity to the subject in light of challenges and considerations that are imposed by statistical models. These include data complexity and data treatment, model validation and error reporting, choice of input variables, and model transparency with respect to process metallurgy. A majority of the models are never tested on future heats, which essentially renders the models useless in a practical industrial setting. In addition, nonlinear models outperform linear models but lack transparency with regard to which input variables are influencing the EE consumption prediction. Some input variables that heavily influence the EE consumption are rarely used in the models. The scrap composition and additive materials are two such examples. These observed shortcomings have to be correctly addressed in future research applying statistical modeling on steel processes. Lastly, the paper provides three key recommendations for such research.


Introduction
The Electric Arc Furnace (EAF) is the second most common process in steelmaking and accounted for 28% of the total world production of steel, on average, between 2008 and 2017 [1]. The cost of raw materials and Electrical Energy (EE) can account for 80%, or more, of the total cost per metric ton of produced steel. It is therefore important to improve current operational strategies, and possibly to invent novel ones, to reduce the energy and raw material consumption.
One approach to improve or invent new operational strategies in the EAF is through the use of mathematical modeling. The idea is to create an accurate representation of the process, which is then used to simulate the effects of new operational strategies and practices. The advantage of representing the system as a model is that it reduces the amount of resources needed, i.e., the effects can be studied without interrupting regular production or investing in new equipment. A model can also aid the process operators by presenting key values, which are then acted upon as a guide towards a more optimized process. One commonly used framework for modeling the EAF is Computational Fluid Dynamics (CFD). Here, the system under study is represented by a meshing grid with boundary conditions. Physical equations are applied and the effects can then be studied after the simulation is calculated. A thorough review of CFD models for the EAF process was presented by Odenthal et al. [2]. Another commonly used modeling framework applies linear or nonlinear programming with physicochemical and process-based constraints to the mass and energy balance equations of the EAF [3-21]. The approach solves the mass and energy balance equations for every time step from the beginning to the end of the process. Moving-time horizon models represent another type of model that has been used to model the EAF [22-25].
To the authors' knowledge, a review of available statistical models in the context of steel process modeling has not previously been conducted. Statistical modeling differs from other types of mathematical modeling approaches in that it does not act directly upon measurements or physical equations. Instead, a statistical model presents the outcome with the highest probability, which is a weighted decision based on previous examples known to the model. These differences introduce some practical challenges and considerations that have to be addressed if one aims to use a statistical model in a practical setting within the steel plant, let alone compare statistical models predicting the same outcome type. These challenges and considerations are addressed in this paper by reviewing already available statistical models that predict the EE consumption of the EAF. Hence, the aim of this paper is to structure and to bring clarity to the subject of statistical modeling in the context of predicting the EE consumption in the EAF in light of these challenges and considerations. It is shown that a majority of the models are never tested on future heats, which essentially renders the models useless in a practical steel plant setting. Nonlinear models are shown to outperform linear models but lack transparency with regard to which input variables are influencing the EE consumption prediction. Some input variables that heavily influence the EE consumption are rarely used in the models.
Even though this paper only reviews available statistical models predicting the EAF EE consumption, the insights gained herein can be used as guidance for similar studies attempting to utilize statistical models in the context of steel production. This is concretized in three concluding recommendations to future research at the end of this paper.

Energy Dynamics
The energy balance of the EAF process makes it possible to express the EE consumption, $E_{\mathrm{El}}$, as a sum of ingoing and outgoing energy factors (see Equations (1)-(3)).
$$E_{\mathrm{tot,in}} - E_{\mathrm{tot,out}} = 0$$
Each of the energy factors is related to physical and chemical entities (see Table 1). However, some of the important physical entities in the EAF process are difficult to measure accurately in practice. For example, the steel temperature measurement depends on the location of the measurement probe. A probe close to the arcs will overestimate the steel temperature, while a probe close to a solid piece of scrap will underestimate it. The energy dynamics within the furnace are dependent on factors such as raw material charged, steel types, furnace design, production delays, and production strategies, among others. Sankey diagrams are commonly used to present estimates for each of the ingoing and outgoing energy terms. Numerous such diagrams are presented in the literature, where the EAF energy balance is studied based on measurements from the process [11,26-29]. Using these reported data, Table 2 has been computed. The computed percentages illustrate how much each energy term can be expected to contribute to the energy dynamics of the EAF process. This can in turn be used to motivate the selection of input variables to a statistical model predicting the EE consumption in the EAF. The computed table shows that the most important ingoing and outgoing energy factors are the electrical energy and the liquid steel, respectively.
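To illustrate how percentages such as those in Table 2 can be obtained from the energy terms of a single Sankey diagram, the following minimal Python sketch normalizes each term by the total. The function name, the term names, and the numerical values are hypothetical placeholders chosen for illustration; they are not data from the cited studies.

```python
def energy_shares(terms):
    """Return each energy term as a percentage of the sum of all terms."""
    total = sum(terms.values())
    if total <= 0:
        raise ValueError("total energy must be positive")
    return {name: 100.0 * value / total for name, value in terms.items()}


# Hypothetical ingoing energies in kWh/t (illustrative only, not from Table 2).
ingoing = {"electric": 400.0, "oxidation": 250.0, "burner": 50.0}
shares_in = energy_shares(ingoing)
for name, pct in shares_in.items():
    print(f"{name}: {pct:.1f}%")
```

The same function can be applied to the outgoing terms, since the balance requires that the ingoing and outgoing totals are equal.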

Nonlinearity
It is observed in Table 1 that some of the factors governing the energy balance are not linearly dependent on all of their physical components. Specifically, $E_{\mathrm{Gas}}$ and $E_{\mathrm{Rad}}$ are not linearly dependent on the temperature. Furthermore, the different EAF heat cycle sub-stages, such as charging, melting, and refining, affect the intensity of power loss through cooling water, radiation, and convection. For example, when the electrodes are covered in scrap, the energy loss by radiation and cooling water is lower than during the later stages of melting. These factors further contribute to the nonlinearity of the energy loss.
Dust lost per unit of time, $\dot{m}_{\mathrm{Dust}}$, is not linear because the amount of generated dust is dependent on heat cycle sub-stages [30], as well as on types of charged scrap. For example, scrap with a high zinc content, such as galvanized steel, generates more dust due to the low solubility of zinc in molten steel and its higher vapor pressure compared to iron [31]. The volume of gas flow per unit time, $\dot{V}_{\mathrm{Gas}}$, is not linear, due to the varying volume of gas flow from oxygen lances, burners, and the off-gas system exhaust during the various stages of the EAF process.
Nonlinearity also comes from irregularities due to unexpected events in the EAF as well as events taking place in upstream and downstream processes. Experience shows that these delays are not consistent across heats. The resulting distribution of the total delay could be estimated using a sum of Gamma distributions, as seen in Equation (4).
where $n$ is the total number of unique delay types. The point here is not to specify the distribution, per se, but rather to further illustrate factors influencing the nonlinearity of the EAF process. Furthermore, the delays have different impacts on the process depending on whether they occur during charging, melting, refining, at the end of the process, or during the preparation stage. A longer delay in the preparation stage will not have as severe an impact on the energy losses as the same delay length when parts of the scrap have been melted. Nevertheless, any delay will always increase the tap-to-tap time, $t_{\mathrm{TTT}}$, which makes all the energy factors that are proportional to $t_{\mathrm{TTT}}$ essentially nonlinear (see Table 1).
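The idea of a total delay built up from several Gamma-distributed delay types can be sketched numerically. The Python example below draws the total delay of a heat as a sum of independent Gamma variates, one per delay type, using only the standard library. The shape and scale parameters, the number of delay types, and the assumption of independence are hypothetical and serve only to illustrate the concept.

```python
import random
import statistics


def sample_total_delay(delay_types, rng):
    """Draw one total delay (minutes) as a sum of independent Gamma variates.

    delay_types: list of (shape, scale) pairs, one per unique delay type.
    """
    return sum(rng.gammavariate(shape, scale) for shape, scale in delay_types)


# Hypothetical (shape, scale) parameters for n = 3 delay types, in minutes.
delay_types = [(2.0, 1.5), (1.0, 4.0), (3.0, 0.5)]
rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_total_delay(delay_types, rng) for _ in range(10_000)]

# The mean of a Gamma(shape, scale) variate is shape * scale, so the
# expected total delay is the sum of these products.
expected_mean = sum(shape * scale for shape, scale in delay_types)
print(f"simulated mean: {statistics.mean(samples):.2f} min")
print(f"expected mean:  {expected_mean:.2f} min")
```

A histogram of `samples` would show a right-skewed distribution, which is one way to see why delays do not enter the energy balance as a constant offset.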

Multivariate Linear Regression (MLR)-mean furnace values:
The reported research started somewhere back in the 1980s, but it is the research done by Köhle and associates in the 1990s that constitutes the most significant start to the field. In the first European Coal and Steel Community (ECSC) report of the research, the goal was to gain a better understanding of the reasons behind a better economic operating procedure of the electrical power system in the EAF [32]. The focus was mainly on the EE consumption and on the electrode consumption. A formula for EE consumption was created using MLR on average production values from 14 Alternating Current (AC) furnaces with tap weights between 64 and 147 tons. The resulting formula was published in a conference paper in 1992 [33]. The formula used values for total charged scrap, alloys, fluxes, end-point steel temperature, power-on time, and consumed burner natural gas and oxygen injection, all of which were chosen at the suggestion of the BFI research institute in Germany. The resulting formula agreed well with the actual average EE consumption values, ranging 380-600 kWh/t, with a standard deviation of error of only 5 kWh/t. This was despite not including the effects of important parameters such as slag foaming, bottom blowing, or injected and charged carbon. In the second paper by Köhle [34], released the same year, a fictitious model was created using the average production values as well as some specific values from the 14 furnaces used previously [33]. This was done to investigate achievable consumption figures using the MLR model, which resulted in a fictitious decrease in electricity consumption by 150 kWh/t.
In the work by Bowman [35], average values from 11 AC furnaces and 11 Direct Current (DC) furnaces were used to investigate the accuracy of Köhle's model [33]. It was found that a better fit could be achieved for the 22 furnaces by adjusting the constant term from 300 to 313 kWh/t. The mean error values from DC and AC furnaces were 0 and 2 kWh/t, respectively. The standard deviation of error increased from 5 kWh/t to 13.3 kWh/t and 11.2 kWh/t, respectively, as reported by Bowman [35]. The increase in standard deviation is not surprising since the original model was fit on data from 14 different AC furnaces.
An additional term for oxygen used for post-combustion was incorporated into Köhle's model by studying seven AC furnaces with and without post-combustion technology [36]. Some production data were not available and were taken from literature, by questionnaires, or using assumptions. However, the assumed values had very little effect on the determination of the coefficient for the post-combustion oxygen term. Using known theoretical values of the combustion of carbon monoxide and hydrogen gas, the heat transfer efficiencies, in percent, were calculated for both chemical reactions.
Köhle's formula, with the added post-combustion oxygen term [36], was extended by a variable for continuous and discontinuous operation using data from 35 furnaces [37]. However, it was not specified if this variable represented continuously charged materials or if it corresponded to intermittent charging. The furnaces were located in Europe, Japan, America, Africa, and Australia. The furnaces studied were of varying types such as single shell, twin shell, double shaft, and single shaft furnaces, thus extending the validation of the model. However, the standard deviation of error increased to 40 kWh/t, a substantial increase compared to the results from previous investigations (5.1-17.4 kWh/t) [33,35,36]. This is not surprising since these furnace designs have different operating strategies, which consequently makes the process data different. Developments of production Key Performance Indicators (KPI), the statistical model, scrap preheating, furnace productivity, and refractory and graphite consumption were discussed in [37].
Up until now, Köhle's models [33,36,37], and further investigations [34,35], have only considered average values from each furnace. Average values make the predictions appear more accurate since they collapse the predictions and errors of all heats into one single data point, masking the errors from individual heats. Hence, the $R^2$ and error values from these models are overwhelmingly better than those of most other types of models.
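The masking effect of furnace averages can be made concrete with a small numerical sketch. In the Python example below, the per-heat prediction errors are hypothetical values chosen for illustration. Because positive and negative errors largely cancel, the error on the furnace average is small even though the typical error on a single heat is large.

```python
import statistics

# Hypothetical per-heat prediction errors (kWh/t) for one furnace.
heat_errors = [-40.0, 35.0, -20.0, 30.0, -10.0, 11.0]

# Error of the furnace-average prediction: the signed errors largely cancel.
mean_error = statistics.mean(heat_errors)

# Typical error on a single heat: the mean of the absolute errors.
mean_abs_error = statistics.mean(abs(e) for e in heat_errors)

print(f"error on furnace average: {mean_error:.1f} kWh/t")
print(f"mean absolute error per heat: {mean_abs_error:.1f} kWh/t")
```

For this reason, error metrics reported on mean furnace values should not be compared directly with metrics reported on single heats.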

MLR-regular:
The developed Köhle model, with the variable for continuous and discontinuous operation [37], was extended by using 5453 heats from five furnaces of varying designs. Energy loss measurements and the fraction of shredded scrap to total tap weight were included in the new model with the aim to predict single heats accurately, while at the same time keeping the accuracy of previous predictions on mean furnace values at similar levels. The variable for continuous and discontinuous operation was removed and some of the coefficients in the previous model were adjusted [38]. The model still performed well with respect to the average of 54 furnaces in the previous studies [33,36,37]. This is not surprising since the majority of the terms of the developed formula are very similar to the earlier formulas derived only by using mean values of those furnaces. However, varying performances were found for the single-heat predictions of the five new furnaces. Further details on the model extension can be found in an ECSC report [39]. Kleimt et al. used the Köhle model, with energy terms [38], together with permanent off-gas analysis to create a dynamic mass and energy balance model adapted to the specific furnace under study [40]. The coefficients from the EE demand formula were converted and put to use in the energy and mass balance model, which was the result of a comprehensive study of the energy and mass balance of the specific furnace. However, the performance of the combined model was reported as a fraction of total energy and not as EE demand. At worst, the standard deviation of error was 31 kWh/t and the mean error was −41 kWh/t when only using Köhle's MLR model.
It is clear from these studies that one model cannot accurately predict heats for multiple furnaces of varying designs and productivity requirements. Therefore, some studies from 2002 and onwards aimed at developing their own models by referencing the developments in Köhle's research [32-34,36-39].
In another study, the Köhle model, with energy loss terms [38], was reviewed and data from several new EAFs were applied to the model [41]. The weak performances motivated the need to either adjust the coefficients or to create a new MLR model adapted to the data from the new EAFs. The adjustment of coefficients did not yield satisfactory results, which led to the development of a new MLR model. The model included, for instance, important variables such as the average electrical power in the electrode system and power-on time as well as some of the variables used in the latest Köhle model [38]. However, some questions must be raised regarding the practical applicability and predictive value of such a model. This is because the input variables include both the power-on time and the average electrical power of the electrode system. The average power multiplied by the power-on time equals the EE consumption. Hence, it is not surprising that the new model achieved an $R^2$ value of 0.96, where a value of 1.0 represents a perfect model.
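The identity behind this concern can be written out explicitly. The minimal Python sketch below computes the EE of a heat directly from the average electrical power and the power-on time; the function name and the example values are hypothetical. A model that receives both quantities as inputs can therefore reconstruct the output without learning anything about the process.

```python
def ee_from_power_and_time(avg_power_mw, power_on_min):
    """EE in MWh from average electrical power (MW) and power-on time (min)."""
    return avg_power_mw * power_on_min / 60.0


# Hypothetical heats: (average power in MW, power-on time in minutes).
heats = [(60.0, 45.0), (55.0, 50.0), (62.0, 40.0)]
for power, minutes in heats:
    print(f"{power} MW x {minutes} min -> "
          f"{ee_from_power_and_time(power, minutes):.2f} MWh")
```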
An MLR model was created based on data from an EAF producing steel using only Direct Reduced Iron (DRI) as raw material [42]. In this model, KPIs related to DRI such as metallization, carbon content, and gangue content were included. A time variable was also included to account for delays during the process. Furthermore, the power-on time, power-off time, tapping temperature, and lance oxygen were also accounted for. However, 48% of the original data were removed and were not part of the development of the models. This raises some concerns about the practical applicability of the model. The authors also studied the signs of the coefficients for each variable to investigate the effect on the electricity consumption [42]. The signs were discussed with emphasis on previous plant experiences and common metallurgical theory.
Czapla et al. [43] used genetic algorithms, along with reported linear models for EE demand by Köhle [33,37,38], to find optimal values from each equation for the furnace of study. However, the results were merely theoretical in the sense of presenting the minimum and maximum energy consumption from each equation. No further validation was carried out.

Partial Least Squares (PLS) regression:
A study focused on the effects of scrap created one EE demand model for each of two Swedish furnaces in two steel mills producing bearing steels and long products, respectively [26,44-46]. The uniqueness of the approach was the use of Partial Least Squares (PLS), a linear statistical model, for predicting the EE demand. This study was also the only one that used two statistical models per furnace to predict the EE demand per ton of produced steel. One model was used to predict the EE while the other model was used to predict the yield of scrap to liquid steel. Using the predicted EE and the predicted yield from each model, respectively, the EE demand per ton of produced steel was calculated. The results showed that the prediction models were not reliable, which was, according to the authors, probably related to variations in the hot heel weight as well as the absence of continuous process data for gas flow, temperatures, and the electrical power system. The overall significance of scrap types on the EE demand was found to be low, but with high variation between different scrap grades [46].
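How the outputs of two such models can be combined is sketched below in Python. This is an illustration of the general idea and not the implementation used in the cited study. The function name is hypothetical, and the yield is assumed to be expressed as the fraction of the charged scrap weight that ends up as liquid steel.

```python
def ee_per_ton(predicted_ee_kwh, charged_scrap_t, predicted_yield):
    """Specific EE demand (kWh/t) from a predicted EE and a predicted yield.

    predicted_yield is assumed to be the fraction of charged scrap that
    ends up as liquid steel (greater than 0 and at most 1).
    """
    if not 0.0 < predicted_yield <= 1.0:
        raise ValueError("yield must be in (0, 1]")
    produced_steel_t = charged_scrap_t * predicted_yield
    return predicted_ee_kwh / produced_steel_t


# Hypothetical heat: 40 MWh predicted EE, 100 t charged scrap, 90% yield.
specific_ee = ee_per_ton(40_000.0, 100.0, 0.9)
print(f"{specific_ee:.1f} kWh/t")
```

Because the result is a quotient of two predictions, the errors of both models propagate into the specific EE demand.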

Artificial Neural Network (ANN):
Beginning with an extended linear model, based on Köhle's research, and an ANN model using energy variables, Baumert et al. [47] advanced the modeling by creating an interconnected ANN model. Each ANN was specialized in predicting the EE consumption after a specific time interval. The model used continuously logged energy variables to predict the EE consumption for 10 consecutive time intervals, where the first four networks were predicting from the charging of the first basket and the last six networks were predicting after the second basket was charged. To account for charging delays of the second basket, the fifth ANN also used a time-delay variable. The standard deviations of error for the interconnected ANN model were very large, but converged for the last two ANNs. However, the upper and lower prediction errors for the end-point ANN were observed to be ±6.3 MWh/heat. To counteract this error, a damping factor was developed by studying response graphs between the predicted EE and the prediction error. While not being developed or based on metallurgical or process-specific considerations, the damping factor reduced the upper and lower error bounds to ±2.5 MWh/heat. Further details of this approach were presented in an ECSC report where supplementary data such as scrap type and weights were included as input variables in some improved models [48]. Equations showing the relations between the online measurements of the process and the energy variables are presented in another ECSC report [49]. Some of the models created in these studies were the only ones reported in the literature that were tested on single future heats from the same furnace that the models were trained on. Hence, the resulting errors are more in line with what could be expected from the model in a practical steel plant setting.
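Testing on future heats amounts to splitting the data chronologically instead of randomly. The following Python sketch shows one way to do this; it is not the procedure used in the cited reports. The function name, the 20% test fraction, and the example records are assumptions made for illustration.

```python
def chronological_split(heats, test_fraction=0.2):
    """Split heats into (train, test) so that every test heat is later
    than every training heat. `heats` is a list of (timestamp, record)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    ordered = sorted(heats, key=lambda heat: heat[0])
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


# Hypothetical heats: (heat sequence number, logged EE in MWh).
heats = [(3, 41.2), (1, 39.8), (5, 43.0), (2, 40.5), (4, 42.1)]
train, test = chronological_split(heats, test_fraction=0.2)
print("train:", train)
print("test: ", test)
```

A random split, by contrast, lets the model see heats produced after some of the test heats, which tends to give optimistic error estimates when the process drifts over time.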
The research by Gajic et al. [50] led to the creation of an ANN model predicting the EE consumption based on only the %C, %Cr, %Ni, %Si, and %Fe contents of the input scrap. The model was one of the few found in the literature tested on external data. However, as the data for training, test, and validation of the ANN model were gathered from experimental melts, there remain some questions about the practical utility of the model during regular production.

Random Forest (RF):
MLR, RF, and ANN models were created using scrap types, preheating energy, oxygen, and natural gas oxy-fuel burners as input variables [51]. The $R^2$ of the models varied within 0.31-0.62 with ANN as the best model and MLR as the worst model. However, an $R^2$ value of 0.62 without using time variables, such as tap-to-tap time, can be considered a high value. Due to the absence of interpretable machine learning algorithms in the study, the effects of scrap on the EE consumption were analyzed only with the MLR model. The EE consumption per ton of liquid steel varied between 386 kWh and 559 kWh depending on the scrap type.

Deep Neural Network (DNN), Decision Tree (DT), and Support Vector Machine (SVM):
A DNN model was trained using variables for scrap types, burner oxygen, injected oxygen, carbon, dolomite, and lime, as well as tap-to-tap and power-on times [52]. The error values from the model were compared with the error values from three other models created using MLR, SVM, and DT. The DNN model was found to outperform the other three statistical modeling approaches with respect to the $R^2$, mean absolute error (MAE), and maximum absolute error performance metrics. However, the maximum absolute error of 17.4 MWh/heat raises some questions about the practical applicability of the DNN model.
All studies mentioned in this section are categorized in Table 3.

Input Variables
A summary of the input variables used in the models found in the literature is shown in Table 4. A total of 27 unique models are present in the literature. Common variables include the total time when the arcs are powered on, the total time when the arcs are powered off, and the tap-to-tap time. Usually, tap-to-tap time is regarded as the sum of the power-on and power-off time. Less common is the specific used service time, which was added in [41], likely to account for cooling of the furnace during maintenance. The impact of time variables for each of the sub-processes, charging, melting, refining, and tapping, has not been explored by the studies.
The EE consumption and power-on time are generally regarded as being strongly correlated. This is not surprising, since an increase in power-on time naturally increases the EE consumption. As many steel plants aim to maximize the EE input during power-on, the correlation becomes even more pronounced. Due to the strong causal relation between the power-on time and EE, and because of the ad hoc assumption of the power-on value, one could argue against the benefit of having a model that predicts EE using power-on time as one of the input variables. If one chooses to use the power-on time as an input variable in a real steel plant setting, the value has to be assumed at first, and then incrementally updated each time the true power-on time exceeds the assumed power-on time. This approach has to be applied to all input variables that are not known at the start of the process.
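The incremental update described above can be sketched in a few lines of Python. The function name and the numerical values are hypothetical; the sketch only shows the rule that the assumed value is used until the true elapsed power-on time exceeds it.

```python
def power_on_input(assumed_min, elapsed_min):
    """Power-on time (min) to feed the model at the current point in the heat.

    Uses the assumed value until the true elapsed power-on time exceeds it.
    """
    return max(assumed_min, elapsed_min)


assumed = 45.0  # hypothetical planned power-on time in minutes
for elapsed in (10.0, 44.0, 47.5):
    print(elapsed, "->", power_on_input(assumed, elapsed))
```

Under this rule, the prediction for most of the heat depends on the assumed value rather than on any measurement, which is the basis of the argument against using the power-on time as an input.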
Using delay variables in the model could more accurately account for the nonlinearity of energy losses, as presented in Section 2. Delays are rarely considered by statistical models found in the literature. The first occurrence of delay as a variable in the models accounted for the delay when charging basket No. 2 [47-49]. The second occurrence used the total delay time in the MLR model, which resulted in a model coefficient of 0.001. The other coefficients ranged from −18 to 1303 [42]. The relatively small coefficient could be because the EAF process is regular with few delays, or because the delays are not consistent and thus not recognized by the MLR model. The irregularity of a variable is more easily captured by nonlinear models, such as ANN, than by linear models.

Chemical
The net contribution from oxidation accounts for 20-50% of the total ingoing energy (see Table 2). The heat contribution depends on, among other things, the amounts of ingoing oxygen gas; charged additives such as silicon, carbon, and lime; and the scrap composition. These elements set the limit of how much generated chemical energy can be expected. Adding oxygen gas beyond a certain level will oxidize iron, which can increase the EE demand per ton of produced steel due to a decrease in metal yield. From industrial practice it is known that the reduction in EE consumption is inversely proportional to the amount of added oxygen, i.e., the net reduction in EE decreases with increasing specific oxygen input.
Burners are used in the process to reduce temperature gradients in the EAF and to facilitate an even melting.They are also used to decrease reliance on EE and can contribute up to 11% of the total ingoing energy (see Table 2).
All but three studies use variables for burners and oxygen lancing. Additives are used by models in four studies, even though they can have considerable effects on the heat generated by chemical reactions [26,41,44-46,51,52]. On the other hand, scrap composition is only used in one study [50]. However, added scrap types in the process could be used as a proxy measure for the scrap composition. Of course, this is only sensible if the scrap composition for each type is expected to stay the same. Scrap compositions typically change with time due to supply and demand dynamics in the scrap market. It is also dependent on whether the steel plant classifies scrap types by shape, composition, or a combination thereof.

Temperatures
The target temperature is set prior to the start of the process and acts as guidance for the EAF operators. The target temperature of the steel linearly increases the energy requirement for the steel melt. One should note that the target temperature may not always be the only guidance for determining the tapping time of the furnace. This is especially true in the case when the melt is projected to remain in the ladle for an additional time due to delays in downstream processes. More energy is then added to the heat to achieve a higher temperature and to reduce the risk of solidifying the melt, thus giving the downstream processes a better starting temperature.
Temperature measurements are taken at the end of the heat to provide the operators with information on remaining energy requirements before tapping. The accuracy depends on the measurement strategy, but the measurements are prone to errors due to temperature gradients from the electrodes to the furnace wall or if solid scrap is present in the liquid bath. Furthermore, the time of measurement is dependent on other factors such as previous delays in the heat. It is therefore doubtful whether the temperature measurement can assist a statistical model in accurately predicting the EE demand.
In the Ladle Furnace (LF), or ladle treatment station, the arrival temperature is used as an estimation of the required final steel temperature in the EAF. However, the practical usage of this variable for statistical modeling of the end-point EE demand is limited because the ladle furnace temperature is taken after the steel has been tapped from the EAF.

Materials
The energy required to heat and melt the metallic scrap accounts for 45-60% of the total outgoing energy, which makes the amount of charged scrap an important input variable. Alternatively, the charge mix, if expressed in weight units per scrap type, can be used as a proxy variable for the ingoing metal weight. An increased slag weight means that more energy is required for the same amount of steel, because more heat is tied up in the slag. Slag, together with dust, can be expected to account for 4-10% of the total outgoing energy.
The charge mix affects the melting dynamics in the furnace as well as the metal yield. The impact of scrap on the EAF process yield and EE demand has been investigated previously [46,48,51]. However, only qualitative effects from the scrap were presented in the first article [46]. In the second article, response graphs were plotted for each scrap type and basket as a function of the EE demand [48]. In the third article, quantitative effects of the scrap on the EE demand per ton of input scrap, as indicated by the MLR model coefficients, were compared to experience-based values. A model for yield was not created in this article [51].
Models for EAFs utilizing hot metal have to account for a weight estimate of this ingoing material. Not only does hot metal facilitate faster melting times, it also changes the energy dynamics in the heat. For example, one ton of hot metal containing 95 wt% Fe, 4 wt% C, and 1 wt% Si adds 509 kWh of energy to the heat, assuming that all silicon gets oxidized and 0.2 wt% C remains at the end of the process. Enthalpy data were taken from Knacke et al. [53] and chemical reaction data from Gaskell [54]. However, the hot metal is only accounted for in Köhle's models [37-39], and in some further developments by Czapla et al. [43]. It is discussed in the thesis by Sandberg but not used in the models [26].
The steel tap weight is required to transform MWh/heat to kWh/t as an output variable. Most models in the literature reporting in kWh/t use the kWh/t as output from the model, while the models by Sandberg calculate the yield in a separate model [26]. It is important to note that the output in MWh/heat only requires the EE consumption, while an output given as kWh/t also requires the tap weight. The tap weight is only known after the heat is produced, which means that assumptions about the tap weight have to be made if the statistical model is used in production. This naturally introduces uncertainties, since the steel tap weight is dependent on factors such as the total amount of dust generated and the amount of slag generated by oxidation. These parameters are, in turn, affected by the ingoing charge mix, scrap composition, additives, oxygen injection, and process times.
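The unit conversion, and the uncertainty introduced by an assumed tap weight, can be illustrated with a short Python sketch. The function name and the example values are hypothetical.

```python
def specific_ee_kwh_per_t(ee_mwh_per_heat, tap_weight_t):
    """Convert EE in MWh/heat to kWh per metric ton of tapped steel."""
    if tap_weight_t <= 0:
        raise ValueError("tap weight must be positive")
    return ee_mwh_per_heat * 1000.0 / tap_weight_t


# Hypothetical heat: 42 MWh, with an assumed versus an actual tap weight.
ee = 42.0
assumed_kwh_t = specific_ee_kwh_per_t(ee, 100.0)
actual_kwh_t = specific_ee_kwh_per_t(ee, 96.0)
print(f"assumed tap weight: {assumed_kwh_t:.1f} kWh/t")
print(f"actual tap weight:  {actual_kwh_t:.1f} kWh/t")
```

In this example, a 4 t error in the assumed tap weight shifts the specific EE by 17.5 kWh/t even though the EE in MWh/heat is unchanged.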

Other
Heats since cold start is used to account for increased energy requirements during a cold start. However, this energy loss per heat is expected to decrease significantly after the first heat, given that the time between tapping and charging is not unusually large. The variable can also be used as a rough approximation of the furnace lining wear. A thinner furnace wall increases the energy loss through conductive heat transfer. This variable is only used in the models by Sandberg et al. [26,44-46].
Energy values are often prone to errors because they cannot be measured directly. They are often based on both a measured physical property, such as temperature, and a physical model that relates the temperature to an energy value. The logged EE is considered to be truthful because it is well defined in the EAF transformer system. A less reliable energy variable is the chemical energy caused by oxidation, because the reactions governing the energy occur at different rates and are impacted by the chemical compositions of both the steel melt and the slag. Furthermore, chemical energy is also lost in the off-gas system, which partly contributes to between 11% and 35% of the outgoing energy (see Table 2).
There are a number of different ways to preheat scrap. However, the most common way is to use latent energy in the off-gas from the EAF process, usually drawing on only a part of the total off-gas volume. The pre-heater does not affect the EAF process other than increasing the enthalpy of the ingoing material. Furthermore, all other energy variables in the literature are derived during the main EAF program. Hence, we separate this energy variable from all other energy variables. The preheating energy requirement is based on the ingoing and outgoing temperatures of the scrap as well as the energy absorbed by moisture and the energy released by oil and grease. The challenge with obtaining accurate preheating energies is that the assumed contents of oil, grease, and moisture are not always correct. The contents are based on sampled measurements for the specific type of scrap, and re-calibrations are done infrequently. This means that the preheating energy can be either over- or underestimated. Therefore, the preheating energy mainly contributes to the white noise of the underlying data. Only three studies in the literature utilize the preheating energy as an input variable in their models [47–49,51]. Some of the furnaces studied by Köhle used scrap preheating, but the variable was not taken into account [37–39].
Furnace-specific variables are used to account for effects that are governed by the furnace design. One model by Köhle used a variable named CON, which takes either of the values −1 or 1, to account for the effect of continuous operations [37]. Köhle's latest model, which accounts for the difference between the energy loss of a given heat and the average energy loss of all heats, utilizes a furnace-specific factor, NV, taking values between 0.2 and 0.4 [38,39]. The NV value is multiplied by the difference between the mean energy losses of all heats and the energy loss of the current heat. However, the NV value was not derived from process or metallurgical knowledge, but from an analysis of the model errors. The authors of the present article believe that furnace-specific variables should be based on proven process experience and physicochemical relations.
Continuous variables are the basis on which dynamic models are built. They usually consist of furnace pressures of oxygen, carbon monoxide, carbon dioxide, and nitrogen, as well as cooling water temperatures and incremental inputs of electricity, oxygen, and burner fuel. Secondary variables from a proprietary energy and mass balance model can also be present in the database. To the best of the authors' knowledge, only the research undertakings by Baumert et al. have created dynamic statistical models predicting the EE demand [47–49].

Modeling Procedure
The modeling procedures governing the statistical models are presented in Table 5 and divided into data complexity, model validation, data treatment, and model transparency, each of which is described further below. The number of heats used when creating and evaluating models represents a rough measure of how much experience is encoded into the model. For most steel mills, the number of heats produced per year is often less than 10,000, which sets the boundaries for model selection. A general rule of thumb is that a model with few coefficients requires fewer data points to optimally converge than a model consisting of a larger number of coefficients. An explicit example is MLR, with the number of coefficients equal to $p + 1$, where $p$ is the number of input variables. At the other end, the number of coefficients in an ANN scales exponentially with the number of hidden layers, as $K^L$. Here, $K$ is the number of nodes in the hidden layers and $L$ is the number of hidden layers. Due to the limited amount of data available from any particular steel mill, one should prefer a shallow ANN over a DNN when committing to a commonly used nonlinear model. Chen et al. [52] used a DNN with four hidden layers of 500 nodes each and obtained very satisfying results with regard to $R^2$ values. However, the model was never tested on future heats and its maximum error was surprisingly large (17.4 MWh), both of which decrease its practical applicability.
The practice of variable selection should always be to select input variables that have, by experience and by established science, the most impact on the output variable. This should be executed with the aim of keeping the number of input variables low relative to the number of data points. Increasing the number of variables also increases the likelihood of using variables that are strongly collinear. Using collinear variables also adds redundancy to the model and violates the principle of parsimony, which states that the simpler of two equally good models should be used. Collinearity between variables has not been analyzed by the studies, even though some of the presented models have well over 100 input variables (see Table 5).
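A simple screening for collinearity could look as follows. This is a minimal sketch, not a method used in any of the reviewed studies: it flags pairs of input variables whose absolute Pearson correlation exceeds a threshold. The variable names, the data values, and the threshold of 0.9 are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.9):
    """Return variable pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical input variables for six heats (values are illustrative only).
heats = {
    "power_on_time_min": [42, 45, 40, 50, 47, 44],
    "tap_to_tap_time_min": [60, 64, 57, 71, 67, 62],  # nearly tracks power-on time
    "oxygen_m3": [3100, 2900, 3300, 3000, 3250, 2950],
}

print(collinear_pairs(heats))
```

Pairwise correlation only detects linear dependence between two variables at a time; dependencies involving three or more variables would require other diagnostics, such as variance inflation factors.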
The relation between the number of data points and the number of input variables is, in the field of statistical modeling, known as the curse of dimensionality [55]. As the number of input variables increases for a given model, the number of data points needs to increase to cover all possible combinations of values. This is easily conceptualized by a $p$-dimensional hypercube, where $p$ is the number of input variables. As $p$ increases, the number of data points needs to increase exponentially to cover all possible outcomes of the input variables, i.e., to fill the hypercube. However, in reality, covering the whole $p$-dimensional space with samples is not relevant because some combinations of values will not be valid from a domain perspective. For example, consider an EAF where the EE demand ranges over 30-45 MWh and the charged scrap amount ranges over 80-100 t. Combinations with values below 30 MWh or above 45 MWh, or with less than 80 t or more than 100 t of charged scrap, are not relevant. Nevertheless, the curse of dimensionality is a concept that should be considered when creating any statistical model.
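The exponential growth can be made concrete with a short calculation. The sketch below assumes that each input variable is divided into five equally sized bins (an arbitrary choice) and that 10,000 heats are available, and then reports the largest fraction of the resulting grid cells that could contain at least one heat.

```python
def grid_cells(bins_per_variable: int, n_variables: int) -> int:
    """Number of cells when each of p variables is split into equal bins."""
    return bins_per_variable ** n_variables


n_heats = 10_000  # assumed number of available heats
bins = 5          # assumed resolution per input variable

for p in (2, 5, 10, 20):
    cells = grid_cells(bins, p)
    # Upper bound on the share of cells that can hold at least one heat.
    max_coverage = min(1.0, n_heats / cells)
    print(f"p={p:>2}: {cells:.3e} cells, at most {max_coverage:.2%} covered")
```

Under these assumptions, ten input variables already produce close to ten million cells, so the available heats can occupy only a small fraction of the input space even before domain constraints are considered.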

Model Validation
The split of data between the training and test phases is important for two main reasons. First, the number of training heats gives an indication of the amount of experience retained by the model. Second, the number of test heats indicates how statistically significant the test results are. Ideally, one should use as many test heats as possible, but the optimal training/test split should be selected with the above two concepts in mind. Only three studies explicitly state the training/test split and use test data on at least one of their models [47–50]. This limits the comparability between the models that have used out-of-sample testing, due to the uncertainty in the number of test heats.
Out-of-sample testing is defined as testing on heats that are not part of the training data set used to create the model. Many out-of-sample tests have been done using Köhle's models on heats from furnaces other than those used to create the models [35,40,41,47]. Arguably, this approach is not feasible since the data emitted from any EAF system are closely connected to the specific EAF design, production strategies, and delay patterns in the steel plant. One should therefore not expect a model created using data from EAF A to perform well on data from EAF B. Out-of-sample testing is not applied consistently in the models reported in the literature, as fewer than half of the reported performance results are out-of-sample tests for both linear and nonlinear approaches. This means that the results from the models that have not been tested on out-of-sample heats cannot be used as an indication of the practical utility of the model in a potential steel plant setting.
The test heats should be from future production heats relative to the training heats if the model will be used to predict on heats from the same furnace. Should the model be used in production, it will only predict future heats. Arguably, a model can be considered useless if it has not yet been validated on data from future heats. Hence, validating the model only on historical heats is useless from a practical applicability standpoint. Only two studies explicitly state that the validation data are from future heats for at least one of the reported models [47–49]. This further limits the number of models whose performance is eligible for comparison as well as for evaluation of practical usefulness.
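A split that respects this requirement can be sketched as follows. The example sorts heats by tapping date and reserves the most recent fraction for testing, so that every test heat lies in the future relative to every training heat. The record layout (`heat_id`, `tap_date`, `ee_mwh`) and the 20% test fraction are hypothetical choices for illustration.

```python
from datetime import date


def chronological_split(heats, test_fraction=0.2):
    """Split heats so that every test heat is later than every training heat.

    `heats` is a list of dicts with a "tap_date" key (hypothetical schema).
    """
    ordered = sorted(heats, key=lambda h: h["tap_date"])
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


# Ten hypothetical heats logged out of order.
heats = [
    {"heat_id": i, "tap_date": date(2019, 1, day), "ee_mwh": 38.0 + i % 3}
    for i, day in enumerate([5, 1, 9, 3, 7, 2, 10, 4, 8, 6])
]

train, test = chronological_split(heats, test_fraction=0.2)
print(len(train), "training heats up to", train[-1]["tap_date"])
print(len(test), "test heats from", test[0]["tap_date"])
```

A random shuffle before splitting, which is the default in many modeling toolkits, would mix future heats into the training set and therefore would not test the model under the conditions it faces in production.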

Data Treatment
Treating data by statistical means takes very little account of the specific application. The treatment can be divided into cleaning (removing data) or repairing (replacing missing data). Many statistical cleaning algorithms are created only with outlier detection in mind. The distribution of the variable should be considered carefully before applying statistical cleaning frameworks, because some frameworks only work well on certain types of distributions. Only two studies report statistical data treatment. However, it is not specified which type of algorithm was used [50,51]. Domain data treatment, as opposed to statistical data treatment, aims to rid the dataset of, or repair, corrupt data or data that are not part of regular operations. Corrupt data, in the context of this article, are data points that satisfy one or more of the following criteria:
1. Practically impossible within the upper and lower bounds of the furnace operation
2. Physically or chemically impossible
3. Unlikely from a process standpoint
4. Erroneously logged in the system

Non-regular operations include, but are not limited to:

1. Trial heats for calibrating energy consumption for new scrap
2. Heats involved in longer maintenance stops
3. Heats with unusually long delays or tap-to-tap times

Only five studies indicate that this type of treatment was used as part of the data cleaning process [26,32,33,42,44–47,51]. However, specific descriptions of the cleaning strategies are left out in two of these studies [26,44–47].
Explicitly stating the number of cleaned data points in each cleaning step is not only critical to validate the practical usability of the model, but also important when comparing its performance to other models. Data cleaning methods affect the performance of the model because data points are removed that would otherwise be part of the model training. In the statistical cleaning strategy, these data points are often outliers, and it can be easily understood that the outliers have a large impact on the standard deviation of error and the min/max errors of the model if they are included. Vagueness with regard to data treatment heuristics and the number of data points cleaned introduces uncertainties with regard to the practical operating span of the model.
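Reporting the number of heats removed in each step requires little extra effort if the cleaning rules are applied sequentially and counted. The following is a minimal sketch of such a procedure; the rules, limits, and field names (`ee_mwh`, `ttt_min`, `trial`) are hypothetical and would have to be replaced by furnace-specific criteria.

```python
def clean_heats(heats, rules):
    """Apply cleaning rules in order and record how many heats each removes.

    `rules` is a list of (description, keep_predicate) pairs.
    """
    report = []
    remaining = list(heats)
    for description, keep in rules:
        kept = [h for h in remaining if keep(h)]
        report.append((description, len(remaining) - len(kept)))
        remaining = kept
    return remaining, report


# Hypothetical bounds; real limits must come from the furnace in question.
rules = [
    ("EE within operating bounds (30-45 MWh)", lambda h: 30 <= h["ee_mwh"] <= 45),
    ("tap-to-tap time below 120 min", lambda h: h["ttt_min"] < 120),
    ("not a trial heat", lambda h: not h["trial"]),
]

heats = [
    {"ee_mwh": 38.2, "ttt_min": 62, "trial": False},
    {"ee_mwh": 0.0, "ttt_min": 60, "trial": False},    # erroneously logged
    {"ee_mwh": 41.0, "ttt_min": 240, "trial": False},  # long maintenance delay
    {"ee_mwh": 36.5, "ttt_min": 58, "trial": True},    # calibration trial
    {"ee_mwh": 39.9, "ttt_min": 65, "trial": False},
]

cleaned, report = clean_heats(heats, rules)
for description, removed in report:
    print(f"{description}: removed {removed}")
print(f"kept {len(cleaned)} of {len(heats)} heats ({len(cleaned) / len(heats):.0%})")
```

Because the same list of rules can be applied to new heats before prediction, the report also documents the operating span within which the model can be expected to be valid.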
It is seldom reported in the literature how many data points were removed by each type of cleaning, statistical and domain-based. Only three studies explicitly state the number of data points that were cleaned for at least one of the models [32,33,38,39,47], and a total of two studies give the exact percentage of cleaned data points out of the total number of data points [32,33,42]. Furthermore, one can question the practical applicability of a model that is trained on only 52% of the available dataset [42]. The model will only be able to predict roughly half of the future heats, given that the future heats come from the same distribution. This severely limits the practical utility of the model.

Model Transparency
Transparent models are models that reveal to what extent each input variable affects the prediction. For the EAF models predicting the EE demand, it is of particular interest from a metallurgical domain perspective to ascertain that the model is weighting the input variables in line with established process experiences and metallurgical knowledge. Only when this is verified can the model be regarded as usable in practice, given that its performance is satisfactory.
MLR models are straightforwardly transparent, because their coefficients and variables combined can be explicitly written out as a mathematical equation. However, not all of the MLR models presented in the literature are explicitly written out as equations. This removes the transparency of the model, since the influence of the input variables cannot be verified.
Inspecting the coefficients enables metallurgists to verify the model's connection to metallurgical theories and process experiences. The authors of [56] compared the coefficients in Köhle's models, K1, K3, and K4, with values derived from data partly based on experimental data from other sources. Discrepancies were found for most of the coefficients, although the magnitudes agreed well with the derived values from external sources.
Most nonlinear model frameworks, such as artificial neural network (ANN) models, are black-box models and therefore not transparent. Interpretable machine learning algorithms can be used on nonlinear models to better explain the predictions. Feature importance shows the importance of each variable relative to the other variables as an aggregate measure over all predictions [57]. Shapley Additive Explanations (SHAP) reveals the contribution from each input variable on the output variable for each specific prediction [58].
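The idea behind an aggregate feature importance measure can be sketched with permutation importance, which is one common variant and is assumed here for illustration; it is not necessarily the exact formulation of [57], and SHAP is not shown. Each input column is shuffled in turn, and the resulting increase in prediction error is taken as the importance of that variable. The "black-box" model below is a hypothetical stand-in function, not a trained ANN, and the feature names are invented.

```python
import random


def mean_abs_error(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Increase in mean absolute error when one input column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_abs_error(model, shuffled, targets) - baseline)
    return importances


# Stand-in for a trained black-box model (hypothetical): the prediction
# depends nonlinearly on feature 0, weakly on feature 1, not at all on feature 2.
def black_box(row):
    return 0.002 * row[0] ** 2 + 0.05 * row[1]


data_rng = random.Random(1)
rows = [(data_rng.uniform(80, 100), data_rng.uniform(0, 20), data_rng.uniform(0, 1))
        for _ in range(200)]
targets = [black_box(r) for r in rows]

scores = permutation_importance(black_box, rows, targets, 3)
for name, score in zip(["scrap_t", "delay_min", "unused"], scores):
    print(f"{name}: {score:.3f}")
```

A metallurgist can compare such a ranking with process experience: an input that is known to matter but receives an importance near zero, or the reverse, indicates that the model may not be using its inputs in a physically reasonable way.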
Interpretable machine learning algorithms have never been used on nonlinear models reported in the literature within the scope of this review. Response graphs were used in one study to investigate the error span of an ANN model along intervals of predicted values as well as the effect of burner energy on the predicted EE consumption [47]. However, response graphs only reveal the response of the output variable to one input variable and do not show the quantitative contribution of the input variable to the output variable.

Model Performance
Arguably, one should use the adjusted-$R^2$ formula if the number of predictors is large, because each added predictor slightly increases the $R^2$ value given that the number of data points is fixed [59]. The formula can be written as follows:

$$R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}$$

where $R^2$ is the standard R-square, $n$ is the number of data points, and $p$ is the number of input variables.
However, an adjusted-$R^2$ value has not been reported in the literature, which introduces uncertainties when comparing models with different numbers of variables.
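The effect of the correction can be illustrated numerically. The sketch below uses the standard adjusted-$R^2$ expression, $1 - (1 - R^2)(n - 1)/(n - p - 1)$, and applies it to a hypothetical case with 500 heats and an unadjusted $R^2$ of 0.80; the numbers are chosen for illustration only.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n data points and p input variables."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 data points")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same raw R^2, same hypothetical 500 heats, different numbers of inputs.
for p in (5, 50, 150):
    print(f"p={p:>3}: adjusted R^2 = {adjusted_r2(0.80, n=500, p=p):.3f}")
```

With few input variables the adjustment is negligible, whereas for a model with well over 100 input variables the same unadjusted value corresponds to a noticeably lower adjusted value.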
The authors of [38,39] reported the correlation coefficient instead of the $R^2$ value. However, they did not specify which type of correlation formula was used. Hence, we assume that Pearson's correlation was used. It is defined as follows:

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$

where $\mathrm{cov}(X, Y)$ is the covariance of $X$ and $Y$, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$, respectively.
The correlation coefficient, $\rho_{X,Y}$, measures the linear relationship between two variables, while the coefficient of determination, $R^2$, measures the explained variance of a model by comparing the true values with the predicted values. The metrics can therefore not be used interchangeably.
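A small numerical example shows why the two metrics are not interchangeable. The sketch below uses hypothetical EE values and a model whose predictions are all 5 MWh too high: the predictions are perfectly correlated with the true values, yet the coefficient of determination, computed as one minus the ratio of the residual to the total sum of squares, is negative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical EE values (MWh/heat); every prediction is 5 MWh too high.
y_true = [36.0, 38.0, 40.0, 42.0, 44.0]
y_pred = [t + 5.0 for t in y_true]

print(f"Pearson correlation: {pearson(y_true, y_pred):.3f}")
print(f"R^2:                 {r_squared(y_true, y_pred):.3f}")
```

A reported correlation coefficient therefore says nothing about a systematic offset in the predictions, which is exactly the kind of error that matters when the model is used to plan the energy input of a heat.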
The error metrics for mean values reported in the literature underpinning this study are the regular error, the absolute error, and the Root Mean Squared Error (RMSE). Considering $y_i$ as the true value and $\hat{y}_i$ as the predicted value, with $i \in \{1, 2, \ldots, n\}$, the metrics are defined as follows.

Regular:

$$\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)$$

Absolute:

$$\frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

RMSE:

$$\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

Errors for single data points are represented using the regular or absolute error metrics. The regular error metric should be preferred over the absolute error metric, since underestimated predictions are different from overestimated predictions in a practical context. An underestimated EE prediction means that too little EE is estimated for the heat, while an overestimated prediction means that too much energy is estimated for the heat. Hence, for single data points, the absolute metric does not reveal the true effect of the error in the way the regular error metric does.
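The three mean error metrics can be computed side by side as in the following minimal sketch. The sign convention, error = true value minus predicted value, is an assumption made here, and the EE values are hypothetical.

```python
import math


def error_metrics(y_true, y_pred):
    """Mean (signed) error, mean absolute error and RMSE, with error = y - y_hat."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mean_abs_error": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Hypothetical EE values in MWh/heat: two under- and two overestimates.
y_true = [40.0, 40.0, 40.0, 40.0]
y_pred = [38.0, 39.0, 41.0, 42.0]

metrics = error_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example the under- and overestimates cancel in the regular mean error, while the absolute error and the RMSE remain clearly above zero. This is why a study must state which metric type it reports before its mean errors can be compared with those of other models.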
As shown in Table 6, only three studies report error metric types on all or some of the models [47,50,52]. This brings about uncertainties when comparing the mean, standard deviation, and min and max errors of the models.
The unit kWh/t enables unbiased comparisons between furnaces since it expresses the consumed energy per ton of molten steel produced. Measuring the error as MWh/heat introduces vagueness into the evaluation if the yield cannot be accurately estimated. A rough estimate can be obtained by dividing the value by the stated furnace capacity. This has been done for the models in the studies by Baumert et al., which reported all errors in MWh/heat [47–49], for the sake of comparison with other models that have been tested on future heats or on heats from other furnaces.
The reported spans of the performance metrics for linear and nonlinear models that have been tested on future single heats from the same furnace are shown in Table 7. Performances on heats from new furnaces using Köhle's model, K4, were also included due to its frequent usage in the literature. The reported results are sparse, and the only performance metrics that can be compared are the mean and the standard deviation of the error. The mean error of the nonlinear models is far superior to that of the linear models. The worst standard deviation of error of the nonlinear models is close to the best standard deviation of the linear models. This provides further evidence that nonlinear models are preferable over linear models for predicting the EE demand of an EAF.
Only one of the batches of heats from another furnace applied to Köhle's model with energy loss has reported $R^2$ values. The lack of $R^2$ values leads us to doubt the practical utility of the models, because the other performance metrics do not indicate the goodness of fit.

Furnace Types
Information about the furnace types and furnace Key Performance Indicators (KPI) governing the data for the models in the literature is shown in Table 8. Due to the wide variety of EAF designs, practices, raw materials, auxiliaries, etc., some challenges are introduced when comparing different models with regard to both performance and choice of input variables. For example, the amount of burner gas will be more important as an input variable for a furnace where the exothermic energy from the burner amounts to a larger proportion of the total energy consumption than for a furnace where the burner gas is sparsely used. This also means that the input variables have to appropriately reflect the dynamics of the furnace design and operating strategies. Due to the lack of information regarding operating strategies and designs for most of the furnaces, it is not possible to accurately verify the choice of input variables governing the models. See Table 4 for the input variables, and Table 8 for the furnace types.
In general, the furnace information is more complete for the studies using linear models as opposed to those using nonlinear models. All post-2010 papers lack furnace KPI such as tap-to-tap times, capacity, and maximum power output. This could indicate some recent "decoupling" between the fields of statistical modeling and process metallurgy.

Conclusions
The aim of this review was to structure and to bring clarity to existing statistical models predicting the EE consumption of an EAF in light of challenges and considerations that are imposed by statistical models. The review was mainly divided into either linear or nonlinear model types, and has compiled input variables, modeling procedures, model performances, and furnace types. The main conclusions of the review may be summarized as follows:

• Fourteen out of 15 studies have used linear models and about 8 out of 15 studies have used nonlinear models. Of the linear models, MLR is the most common model, and, of the nonlinear models, ANN is the most commonly used model.

• Out of a total of 27 reported models in the literature, 13 are linear and 14 are nonlinear.

Input Variables
• Only twelve out of the 27 reported models use the power-on time as an input variable, despite its close connection to the EE consumption of the heat.

• Delay times are used as input variables in only three studies [42,47–49], despite their importance in accounting for irregularities in the EAF and in upstream/downstream processes, all of which increase the time to produce a given heat. Increased production times naturally increase energy losses by radiation, water cooling, and convection.

• Variables representing chemical energy, such as burner and oxygen lancing, are frequently used. Additives such as lime, dolomite, and carbon, and the scrap composition are seldom used, even though they account for 20-50% of the total energy requirement.

Modeling Procedure
• All reported nonlinear models lack transparency regarding which input variables have the largest influence on the EE consumption. Even though the input variables are specified, it is impossible to conclude whether the model is using the input variables reasonably in line with established science. However, interpretable machine learning algorithms such as feature importance and SHAP can be used to make nonlinear models more transparent.

• Complete specificity in data treatment is lacking for all but one study [32,33]. Information about the data treatment is either completely omitted, unspecified with respect to statistical or domain-specific heuristics, or vaguely presented with respect to the number of cleaned data points. This limits the practical usage because the instances where the model is valid are unknown.

• A model must be evaluated on future heats for practical applicability. This is because a model used in the industrial process will be used to predict the performance of future heats. Only two studies have evaluated models on future heats from the same EAF that generated the training data used to create the models [47–49]. The possibility to sensibly compare the specific model performance with the performances of other models is thus severely limited.

Model Performance
• Reporting on model performance is not consistent. Some studies use MWh/heat instead of kWh/t liquid steel. Only two studies present all model performance metrics: mean errors, standard deviation errors, min/max errors, and $R^2$ values [36,37]. This makes model comparisons and evaluations difficult.

• Comparisons cannot be made between models that have not reported their performances on either future single heats from the same furnace or single heats from another furnace. This is because any model will always be biased towards the data it is trained on. Seven out of 27 models reported in the literature meet these requirements [40,47–49]. None of these models reports all performance metrics: mean, standard deviation, min/max errors, and $R^2$ values.

• For the models that have been validated on future single heats from the same EAF, or single heats from another EAF, it was found that the nonlinear model type outperformed the linear model type. This indicates that nonlinear models are favorable over linear models when creating statistical models predicting the EE consumption. Furthermore, these results agree well with the inherent nonlinearity of the EAF process.

• The type of furnaces studied varies significantly with respect to EAF design, capacity, power output, tap-to-tap time, and steel type. No clear coherency can be found and all post-2010 papers lack important information on the EAF design governing the data. Without this information, it is difficult to compare models between furnaces or verify the chosen input variables using metallurgical process expertise.
As concluding remarks, based on the findings in the current study, three recommendations are given for future attempts in statistical modeling applied in steel industry contexts:

1. Proficiency in both metallurgy and statistical modeling is crucial if one aims to create a statistical model that is relevant in practice. Previous process knowledge is necessary to obtain meaningful process models. Processes and models cannot be independently developed.
2. The statistical model must be able to predict the performance of future heats with an accuracy that satisfies the requirements on the process as specified by the process engineers. Any reported model performance must therefore be on predictions on future data relative to the training data.
3. The models also have to be robust. The implications are twofold: First, the effects of the input variables should be in line with what is expected from physics and chemistry. The effects of the input variables on the output variable for nonlinear statistical models can be analyzed by interpretable machine learning algorithms. Second, the models have to be robust to values outside their scope of training. This means that the data cleaning algorithms used on the training data must be applied to new data before being fed into the model.

Nomenclature
$E_{El,loss}$: Total energy lost in electrical system and arc transfer
$T_{CS}$: The temperature of the cooling panels
$T_{EAF}$: The temperature of the surface area subject to radiation losses
$T_{s}$: Temperature of ingoing material and gas at the start of the EAF process
$T_{Tap}$: Temperature of the steel at tapping
$T_{Offgas}$: Temperature of the off-gas leaving the EAF through the off-gas system
$T_{H2O}$: Temperature of the cooling water
$T_{H}$: The temperature of the surface area subject to convection losses
$T_{Amb}$: The temperature of the air surrounding the EAF
$m_{Steel}$: Mass of ingoing metallic material
$m_{Slag}$: Mass of ingoing oxidic material
$\dot{m}_{Dust}$: Mass flow of dust in the off-gas system
$c_{Steel}$: The heat capacity of steel at constant pressure
$c_{Slag}$: The heat capacity of slag at constant pressure
$c_{Dust}$: The heat capacity of dust at constant pressure
$c_{Gas}$: The heat capacity of EAF ambient gas at constant pressure
$c_p(\mathrm{reactants})$: The heat capacity of reactants at constant pressure
$c_p(\mathrm{products})$: The heat capacity of products at constant pressure
$\Delta H^{o}_{298}(\mathrm{reactants})$: Standard heat of formation for reactants at 298 K
$\Delta H^{o}_{298}(\mathrm{products})$: Standard heat of formation for products at 298 K
$\Delta H_{Melt,steel}$: Heat of fusion for steel
$\Delta H_{Melt,slag}$: Heat of fusion for slag
$k$: Conductivity of the cooling panels
$h$: Heat transfer coefficient of the EAF ambient gas
$\varepsilon$: Emissivity factor of the radiating surface area of the EAF
$y_i$: True value of the output variable for data point $i$
$\hat{y}_i$: Predicted value of the output variable for data point $i$

Table 1. Ingoing and outgoing energy terms governing the energy balance equation of the Electric Arc Furnace (EAF) process.
$E_{Chem}$: … $c_p(\mathrm{products}) - \sum c_p(\mathrm{reactants})\,dT$; $E_{Chem} \propto T_{s};\ T_{Tap};$ composition
$E_{Bu}$ (total energy input from burner): $E_{Bu} = \eta_{Fuel}\, h_{Fuel}\, V_{Fuel}$; $E_{Bu} \propto V_{Fuel};\ \eta_{Fuel}$
$E_{Conv}$: $E_{Conv} = h A_{H} (T_{H} - T_{Amb}) \cdot t_{TTT}$; $E_{Conv} \propto T_{H};\ T_{Amb};\ t_{TTT}$
$E_{El,loss}$ (energy lost in electrical system and arc transfer): $E_{El,loss} = (1 - \eta_{Arc})(1 - \eta_{El}) \cdot E_{El}$; $E_{El,loss} \propto t_{PON}$

Table 3. Studies where new statistical models were created or where new data were applied on previously reported models. Each study can contain references to multiple articles if the results or details about the model(s) are spread over multiple sources. Some studies have used both linear and nonlinear models.

Table 4. Input variables used in each study creating new statistical models. The models are labeled in alphabetical order with respect to the study where they are created. Köhle's models are labeled K1-K4 for distinction purposes because they are the only models that are used in multiple studies. * Defined as time from first power on to start of tapping. ** Average power (MW). *** Total oxygen input (m³). + Main and secondary oxygen. ++ Number of heats since a longer production stop. +++ Except preheating energy. ⊕ Indirect input variable due to being part of the energy variable calculations. # From another statistical model. ## Not specified which energy variables were used.

Table 5. Modeling procedures for models reported in the literature. Each entry represents a new model or new heats applied to a pre-existing model. Blank entries mean either unclear information or that the modeling step was not carried out. Specifications are marked and explained further. * Average of each furnace. ** Monthly average. + Calculated from reference. ++ Complementary details from [48]. ⊕ Not applicable due to applying new data to an already trained model.

Table 7. Performance metric value spans for linear and nonlinear models that have been tested on future single heats from the same furnace. The models were A, B, C, D, E from the first study by Baumert et al. [47], and D from the second study by Baumert et al. [48,49]. See Table 6. Köhle's model with energy loss, K4, is the only model to which single heats from other furnaces have been applied. It was included in a separate column due to its frequent usage in the literature. * Only one reported value.

Table 8. Furnace information for all model results as shown in Table 6. Each label represents a new model or heats tested on another model.