2.1. Modelling Approach
One of the first widely known empirical models for predicting the electric energy demand of EAFs was developed by Köhle et al. in the 1990s through statistical analysis of average production values from 14 furnaces. The Köhle model was later improved and extended to post-combustion and alternative ferrous materials such as hot briquetted iron (HBI) or direct reduced iron (DRI) using 5000 single heats from 5 different furnaces. An updated formula for the specific electric energy demand, published in 2005, is given in Table 1 and Equation (1) [11]. In contrast to later models, the coefficients are not only fitted by linear regression but in most cases also correspond to values found in thermodynamic analyses of the arc furnace process [12,13].
While the results given by the Köhle model are in good agreement with the average electric energy demand of the furnaces, results for single heats can differ significantly, as will be shown in the results section. The formula is, however, still used for benchmarking the operation of arc furnaces [14,15]. The Köhle model was also adapted to an almost 100% DRI EAF operation at Mittal Steel Lázaro Cárdenas [16]. In order to predict the energy demand of single heats more reliably, a number of models based on more complex algorithms have been developed in the last decade [4,17,18,19,20]. In this study, three different kinds of regression models are used to predict the electric energy demand of the EAF. However, proper adjustment of the applied models is a wide area of research, and a description of all possible settings is beyond the scope of this paper. Therefore, the basics of the applied methods are described briefly in the following, and the choice of hyperparameters is justified.
For regression, the measured data is first standardized by subtracting the mean value of every predictor and dividing by its standard deviation. Calculation of the so-called z-score is shown in Equation (2). Through standardization, variables with varying scales and different units of measurement are brought to the same scale and can contribute equally to the result. This might also increase the training speed of the models. On the other hand, standardization gives equal weight to data with comparatively small variance and may thus excessively incorporate noise into the calculation. Furthermore, information on the mean and standard deviation of the explanatory variables is lost.
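The standardization step can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names, the choice of the population standard deviation, and the oxygen values are assumptions made for the example.

```python
from statistics import mean, pstdev


def fit_standardizer(column):
    """Return (mean, std) of one predictor, estimated on the training data."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (assumption)
    if sigma == 0.0:
        raise ValueError("a constant predictor cannot be standardized")
    return mu, sigma


def standardize(column, mu, sigma):
    """Apply z = (x - mean) / std to every value of the predictor."""
    return [(x - mu) / sigma for x in column]


# Hypothetical oxygen consumption per heat (made-up numbers).
oxygen = [2000.0, 2200.0, 2400.0, 2600.0, 2800.0]
mu, sigma = fit_standardizer(oxygen)
z = standardize(oxygen, mu, sigma)
```

Keeping `mu` and `sigma` from the training set allows the same transformation to be applied to validation heats and to be inverted later, which mitigates the loss of scale information mentioned above.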
For optimization of the model parameters, the mean squared error (MSE) between the total demand of electric energy and the model prediction is minimized. Calculation of the MSE from the measured electrical energy demand, the calculated value of the electrical energy demand, and the number of data points is shown in Equation (3). In other works [18,19], the specific electric energy demand per ton of produced steel is used for analysis. In the context of this study, the application of the models to the prediction of future heats shall be investigated. The mass of tapped steel is, however, unknown prior to tapping. Hence, tuning of the model parameters is performed using the absolute demand of electric energy in this study.
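In its standard form, the MSE used as the optimization target can be written as below. The symbols are notational assumptions introduced for this illustration: $E_i$ denotes the measured absolute electric energy demand of heat $i$, $\hat{E}_i$ the value calculated by the model, and $n$ the number of data points.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( E_{i} - \hat{E}_{i} \right)^{2}
```

Because the target is the absolute rather than the specific energy demand, no division by the tap weight appears in this expression.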
Although Köhle performs a nonlinear transformation on some variables, for example by dividing by the mass of tapped steel, the base model remains a multiple linear regression (MLR). Furthermore, Köhle only used data available for all furnaces and abstained from standardization of the data. MLR is one of the earliest and most basic methods of supervised learning, that is, the mapping of an input to an output based on a set of training examples. In MLR, the predicted response is calculated as a linear combination of the explanatory variables, as stated in Equation (4). In doing so, it is assumed that the relationship between the true response and the explanatory variables is linear and that the explanatory variables are not correlated. In the case of the arc furnace, however, both assumptions are violated. Thermal radiation increases with the fourth power of the melt’s temperature, and energy loss through cooling of the furnace therefore increases at later stages of the process, when the melt’s temperature is higher and the furnace walls are no longer shielded by scrap. Other mechanisms such as slag foaming can further impact the overall energy demand nonlinearly [21]. Nevertheless, due to their simplicity and low computational demand, MLRs are still commonly used. Within this work, MLR is used as a benchmark for the nonlinear model types.
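A multiple linear regression of this kind can be fitted by ordinary least squares. The sketch below uses NumPy on synthetic data; the three predictors, the coefficients, and the noise level are invented for illustration and do not correspond to any furnace in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, standardized predictors (hypothetical, e.g. scrap mass, oxygen, power-on time).
X = rng.standard_normal((200, 3))
true_coef = np.array([30.0, -5.0, 12.0])
true_intercept = 400.0
y = true_intercept + X @ true_coef + rng.normal(0.0, 1.0, size=200)

# Append a column of ones so the intercept is estimated together with the coefficients.
A = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimizes the squared error between y and A @ beta.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

y_pred = A @ beta
mse = float(np.mean((y - y_pred) ** 2))
```

Here `beta[0]` is the fitted intercept and `beta[1:]` are the coefficients of the explanatory variables. Because the synthetic response is truly linear, the fit recovers the coefficients closely, which would not be expected for the nonlinear effects described above.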
Limitations of linear models, such as their inability to account for interactions between the input variables, gave rise to the popularity of ANNs for estimation of the electric energy demand of EAFs [17,18,19]. A network in which information only moves forward through the layers without feedback is called a feedforward network or multilayer perceptron. These networks are the quintessential deep learning models [22]. The structure of a simple feedforward network with only one hidden layer is displayed in Figure 1. At each neuron, the values from the previous layer are multiplied by a set of weights and a bias is added. The resulting value is passed to the subsequent layer through an activation function. In the past, the hyperbolic tangent or the logistic function were frequently used as activation functions. However, these sigmoid activation functions are strongly sensitive only when the input is close to 0; for high or low values the function quickly saturates, which impairs gradient-based learning [22]. In modern applications of ANNs, linear units are therefore often utilized. For prediction of the energy demand, an exponential linear unit (ELU) was chosen. The output of this threshold operation is given by Equation (5).
Within this work, the neural network applied for estimation of the energy demand contains 2 hidden layers featuring n and […] neurons, respectively, where n is the number of explanatory variables. During training of the model, the weights and biases of the ANN are tuned by minimizing the loss function (MSE) and backpropagating the error to each neuron in each layer. For optimization, stochastic gradient descent with momentum was used with an initial learning rate of 0.01.
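The forward pass through such a network can be illustrated with a few lines of plain Python. This is only a sketch: the scale parameter `alpha=1.0` of the exponential linear unit is an assumed default, and the layer sizes and weights are hypothetical and far smaller than in the model described above.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sum plus bias, then an optional activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        s = sum(w * v for w, v in zip(w_row, inputs)) + b
        outputs.append(activation(s) if activation else s)
    return outputs


# Hypothetical tiny network: 2 inputs -> 2 hidden neurons (ELU) -> 1 linear output.
hidden = dense([1.0, -2.0], [[0.5, 0.25], [-1.0, 0.5]], [0.0, 0.0], elu)
output = dense(hidden, [[1.0, 1.0]], [0.1])
```

Unlike the sigmoid functions, the unit does not saturate for large positive inputs, while negative inputs are bounded below by `-alpha`.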
Another approach to supervised learning is the application of Gaussian processes. A Gaussian process is defined as a collection of random variables, any finite number of which have a joint multivariate normal distribution. It is a generalization of the Gaussian distribution to functions with a continuous domain and is fully specified by a mean function and a covariance function, as stated in Equation (6) [23]. In consequence, the Gaussian process is a nonparametric model. Rather than calculating parameters such that a given class of functions (e.g., linear functions) fits the data, the prior distribution contains all functions defined by the chosen mean and covariance function.
By incorporating the observations from the training data, functions which do not pass through the data points (or do not pass closely in the case of noisy data) are removed from the infinite set in order to form the posterior distribution. As a result, the posterior uncertainty in the vicinity of the observations is reduced. This is also called conditioning of the Gaussian prior distribution on the observations. In Figure 2a, three samples from the prior distribution are shown. The posterior distribution after observation of five data points is depicted in Figure 2b. The underlying (unknown) function is a polynomial of the third degree. Making a prediction using the Gaussian process ultimately amounts to drawing samples from its posterior distribution.
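Conditioning a Gaussian prior on observations can be sketched in NumPy as follows. The example assumes a zero-mean prior and, for simplicity, a squared exponential kernel with unit length scale; the polynomial, the five observation points, and the small noise term are hypothetical choices and are not taken from Figure 2.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared exponential covariance between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-6):
    """Posterior mean and covariance of a zero-mean GP conditioned on the observations."""
    K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = sq_exp_kernel(x_train, x_test)
    K_ss = sq_exp_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    cov = K_ss - v.T @ v
    return mean, cov


def cubic(x):
    """Hypothetical third-degree polynomial acting as the unknown function."""
    return 0.5 * x**3 - x


x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = cubic(x_train)
x_test = np.linspace(-2.5, 2.5, 11)

mean, cov = gp_posterior(x_train, y_train, x_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
```

At the five observed inputs the posterior standard deviation `std` collapses to roughly the noise level, whereas it grows again outside the observed range, which is the reduction of uncertainty near the observations described above.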
That being said, the predictive performance of Gaussian processes depends exclusively on the chosen kernel [24]. For prediction of the electric energy demand, the Matérn covariance function given in Equation (7) was chosen. In contrast to other popular kernels, such as the infinitely differentiable squared exponential kernel (Gaussian function) displayed in Equation (8), its shape is rather rough. Strong smoothness, however, is argued to be unrealistic for the modelling of physical processes [25].
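The difference between the two kernel families can be made concrete with their closed forms. The sketch below implements the Matérn kernel for the two commonly used smoothness values 3/2 and 5/2 next to the squared exponential kernel; which smoothness value was used in this study is not restated here, so both variants, the unit length scale, and the unit signal variance are assumptions for illustration.

```python
import math


def matern32(r, length_scale=1.0, signal_var=1.0):
    """Matérn kernel with smoothness 3/2 as a function of the distance r."""
    s = math.sqrt(3.0) * abs(r) / length_scale
    return signal_var * (1.0 + s) * math.exp(-s)


def matern52(r, length_scale=1.0, signal_var=1.0):
    """Matérn kernel with smoothness 5/2 as a function of the distance r."""
    s = math.sqrt(5.0) * abs(r) / length_scale
    return signal_var * (1.0 + s + s * s / 3.0) * math.exp(-s)


def squared_exponential(r, length_scale=1.0, signal_var=1.0):
    """Infinitely differentiable squared exponential (Gaussian) kernel."""
    return signal_var * math.exp(-0.5 * (r / length_scale) ** 2)


# Correlation of two inputs that are one length scale apart.
correlations = (matern32(1.0), matern52(1.0), squared_exponential(1.0))
```

At short distances the Matérn kernels decay faster than the squared exponential kernel, so functions drawn from the corresponding process are less smooth.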
The performance of the models on the validation data is evaluated using the adjusted coefficient of determination, as well as the mean absolute error (MAE), the standard deviation of the result (SD) and the relative standard deviation (RSD). These values are calculated as shown in Equations (9)–(12), which also involve the mean values of the measured and calculated electric energy demand. The relative standard deviation (coefficient of variation) is utilized in order to illustrate the extent of variability in relation to the average demand of electric energy [26]. The coefficient of determination ranges from 0 to 1 and is often used as an indicator of the goodness of fit, with 1 meaning the results perfectly match the measurements.
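The evaluation measures can be sketched as follows. Since Equations (9)–(12) are not repeated here, the definitions in the code are assumptions based on common usage: the adjusted coefficient of determination corrects for the number of predictors, SD is taken as the sample standard deviation of the prediction error, and RSD relates SD to the mean measured demand. The energy values are made up.

```python
from statistics import mean, stdev


def evaluate(measured, predicted, n_predictors):
    """Return adjusted R^2, MAE, SD of the error and RSD in percent (assumed definitions)."""
    n = len(measured)
    errors = [m - p for m, p in zip(measured, predicted)]
    mean_measured = mean(measured)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    mae = mean([abs(e) for e in errors])
    sd = stdev(errors)
    rsd = 100.0 * sd / mean_measured
    return {"adj_r2": adj_r2, "mae": mae, "sd": sd, "rsd": rsd}


# Hypothetical measured and predicted energy demand of six heats in MWh.
measured = [40.0, 42.0, 38.0, 45.0, 41.0, 44.0]
predicted = [41.0, 41.0, 39.0, 44.0, 42.0, 43.0]
scores = evaluate(measured, predicted, n_predictors=2)
```

With these definitions the adjusted coefficient of determination can become negative when a model performs worse than predicting the mean, so values below 0 indicate a poor fit rather than an error in the calculation.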
2.2. Datasets of EAF Heats Used in This Study
For evaluation of the described models, process data from five electric arc furnaces for industrial steel production were used. In total, the data sets contain material consumption and furnace operation data of roughly 21,000 heats. However, the investigated furnaces differ considerably regarding their capacity and material input, as well as the measurements taken during operation. The characteristics of the furnaces are summarized in Table 2. With an average tap weight of about 80 t, EAF-A has a notably smaller capacity than the remaining furnaces. Likewise, the average tap-to-tap time of EAF-A is shorter. EAF-B, EAF-C and EAF-D have similar capacities and tap-to-tap times. The highest specific electrical energy demand is found for EAF-C. The differences in specific electric energy demand between the furnaces can in part be attributed to the differences in the charged ferrous material. For both EAF-B and EAF-C, the input material contains large quantities of DRI or HBI, while the remaining furnaces use scrap of varying quality.
Not all documented heats can be used for evaluation of the electric energy demand. In the first step, data treatment is performed on each set. The overall goal is the removal of faulty or irregular data originating from erroneous data logging or irregular operation such as trial heats, aborted heats, or equipment malfunctions. Including these heats would otherwise have a negative impact on the training of the models for regular heats, which are the main subject of the investigation. Table 2 shows the total number and the percentage of excluded heats. In the following, the decision rules applied for the removal of data are described.
When crucial data like electric energy demand or tap weight are missing, the applied regression models cannot accurately predict the electrical energy demand, and therefore the heats in question must be excluded from consideration. In addition, heats are excluded if the measurements are unreasonable. This is, for example, the case if the tap weight exceeds the maximum capacity of the furnace, or the recorded tap-to-tap time is shorter than the power-on time of the heat. Significant outliers were also removed from the data sets. These include heats with an abnormal tap-to-tap time, since they are likely to contain long power-off times as a result of production delays caused by unscheduled events or regular maintenance stops. Likewise, heats are removed if their number of buckets differs from that of the remaining heats. Finally, heats are removed if their ratio between charged ferrous material and tapped steel is below 0.75 or above 1.05. This is due to the mass of the hot heel not being measured for most of the furnaces. By keeping part of the molten steel inside the furnace after tapping, the melting rate of the subsequent heat can be increased. This results in lower thermal losses and a lower overall energy demand [27]. Moreover, in DC furnaces a hot heel is necessary for operation, as it covers the electrode in the bottom of the vessel and closes the circuit. However, when the furnace is completely emptied, the amount of energy needed for initial melting of the hot heel is unaccounted for, while the energy demand of the next heat is higher compared to regular heats. In consequence, these heats, which usually occur before and after maintenance periods, are removed from consideration. In total, between 3% and 13% of the heats were removed. EAF-C, EAF-D and EAF-E show a notably larger percentage of excluded data compared to EAF-A and EAF-B. This can mostly be attributed to unusually long heats, i.e., frequent production interruptions (270 for EAF-C and 160 for EAF-E), and to missing measurements or recorded data. For EAF-D, 775 out of 785 removed heats are missing the mass of charged material, and about 270 heats from EAF-C lack a temperature measurement of the molten steel.
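The decision rules above can be summarized in a small filter function. The sketch below is illustrative only: the record fields, the furnace capacity, the usual number of buckets, and the tap-to-tap threshold are hypothetical, and the ratio is computed as charged ferrous material divided by tapped steel, following the wording of the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Heat:
    electric_energy_mwh: Optional[float]
    tap_weight_t: Optional[float]
    charged_ferrous_t: Optional[float]
    tap_to_tap_min: float
    power_on_min: float
    buckets: int


def is_regular(heat, max_capacity_t, usual_buckets, max_tap_to_tap_min):
    """Return True if the heat passes all removal rules (thresholds are assumptions)."""
    # Rule 1: crucial data must be present.
    if None in (heat.electric_energy_mwh, heat.tap_weight_t, heat.charged_ferrous_t):
        return False
    # Rule 2: measurements must be physically reasonable.
    if heat.tap_weight_t <= 0.0 or heat.tap_weight_t > max_capacity_t:
        return False
    if heat.tap_to_tap_min < heat.power_on_min:
        return False
    # Rule 3: outliers in tap-to-tap time or number of buckets.
    if heat.tap_to_tap_min > max_tap_to_tap_min or heat.buckets != usual_buckets:
        return False
    # Rule 4: ratio between charged ferrous material and tapped steel.
    ratio = heat.charged_ferrous_t / heat.tap_weight_t
    return 0.75 <= ratio <= 1.05


heats = [
    Heat(40.0, 100.0, 98.0, 55.0, 40.0, 2),   # regular heat
    Heat(None, 100.0, 98.0, 55.0, 40.0, 2),   # missing energy measurement
    Heat(40.0, 100.0, 98.0, 35.0, 40.0, 2),   # tap-to-tap shorter than power-on time
    Heat(40.0, 100.0, 110.0, 55.0, 40.0, 2),  # ratio outside the accepted range
]
kept = [is_regular(h, max_capacity_t=120.0, usual_buckets=2, max_tap_to_tap_min=90.0)
        for h in heats]
```

Applying the rules per furnace, with thresholds chosen from the respective data set, keeps the filtering consistent with the separate treatment of the furnaces.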
Apart from the data quality and overall differences in the operation of the furnaces, the amount of data recorded during operation also differs significantly. Table 3 shows an overview of the available measurements. The electric energy demand, as well as the mass of charged ferrous material, coal and slag formers, is measured for all furnaces. For the first three furnaces, only a basic breakdown into scrap, DRI and alloying metals is given, while EAF-D and EAF-E have a detailed record of the charged scrap grades. The exact chemical composition of the input materials is, however, unknown and is likely to differ between plants and even between heats. Furthermore, the mass of charged slag formers in EAF-B and EAF-C is provided with an accuracy of 0.5 tons. This suggests that the stated mass is estimated or measured with limited accuracy only. Although this was the only obvious case, it must be noted that all measurements are associated with a degree of uncertainty, since no information on the methods and accuracy of the measurements was given.
During operation, the injected oxygen and carbon mass was measured along with the consumption of natural gas. EAF-B is the only furnace operated without natural gas burners. Moreover, in the records from EAF-D and EAF-E, oxygen input is further separated into different applications within the furnace, such as oxygen for burners, lances, or post-combustion. In the remaining data sets, oxygen input is separated only for the purpose of post-combustion, with all other flows combined into a single measurement. Power-on time as well as tap-to-tap time are measured for all furnaces, yet only EAF-A and EAF-E provide a detailed breakdown of sub-process times such as charging, melting, and tapping. Energy losses can vary considerably throughout the different stages of the melting process. Therefore, providing information on the length of the sub-processes can improve the quality of the prediction. Furthermore, the weight and temperature of the tapped steel are available. Temperature measurement is carried out shortly before tapping in order to ensure that the target temperature has been reached. The temperature and mass of the tapped steel are directly related to the energy demand for melting. As stated before, however, the mass of the hot heel which remains inside the vessel after tapping is only measured at two of the furnaces, with the method and accuracy of the measurement unknown. Lastly, at EAF-D the composition of both steel and slag is analyzed after each heat, while EAF-E has the steel composition measured at regular intervals for 98 heats in total. As can be seen from the overview of furnace characteristics in Table 2 and of the measurements in Table 3, a single regression model cannot be applied to all furnaces without drastically reducing the data sets in order to form a common denominator. Even then, the measurements are performed with different precision, and in the case of scrap grades and slag formers the classification is not necessarily uniform. In consequence, the furnaces must be considered separately, while the general design of the investigated models is maintained.
For each furnace, heats are divided into a training and a validation set. The training set contains 70% of the data and is drawn at random. Subsequent validation is performed on the remaining 30% of heats. All applied models are trained on the same data set. However, since the selection of the training data influences the model accuracy, training and validation are performed on 5 separate training samples and the median results of the regressions are discussed. Beyond that, process data can be divided into two groups: measurements available before the heat and measurements available only after the heat is finished. In the literature, the entire dataset is often used for modeling of the electric energy demand [11,17,20,28]. While those models yield better results in terms of accuracy, they cannot be applied to predict the energy demand of a future heat. In the following, the investigated regression models are applied to both the entire and the limited dataset, and the results of both approaches are compared.
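The repeated random split can be sketched as follows. The helper names, the seeds, and the toy scoring function (a mean predictor evaluated by its MAE) are hypothetical; in practice `fit_and_score` would train one of the regression models on the training heats and return its validation score.

```python
import random
from statistics import median


def split_indices(n_heats, train_fraction=0.7, seed=0):
    """Randomly split heat indices into a training and a validation set."""
    indices = list(range(n_heats))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_heats)
    return indices[:n_train], indices[n_train:]


def median_score(n_heats, fit_and_score, n_repeats=5):
    """Repeat the random split n_repeats times and return the median validation score."""
    scores = []
    for seed in range(n_repeats):
        train_idx, val_idx = split_indices(n_heats, seed=seed)
        scores.append(fit_and_score(train_idx, val_idx))
    return median(scores)


# Made-up energy demand of 100 heats in MWh.
energy = [40.0 + (i % 7) for i in range(100)]


def mean_predictor_mae(train_idx, val_idx):
    """Toy model: predict the training mean and report the MAE on the validation heats."""
    prediction = sum(energy[i] for i in train_idx) / len(train_idx)
    return sum(abs(energy[i] - prediction) for i in val_idx) / len(val_idx)


score = median_score(len(energy), mean_predictor_mae)
```

Reporting the median over several splits reduces the influence of a single favourable or unfavourable draw of the training data on the stated accuracy.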