1. Introduction
Over the last decades, the steel industry has been facing increasing demands for improved productivity and quality at optimum cost and with minimum environmental impact. These requirements have been met by advances in production methods and process models [1]. Temperature is one of the most critical variables to be controlled along the process route and a required input for almost any control model in steelmaking operations.
Steel is mostly produced at integrated facilities, starting from iron ore that is reduced in a blast furnace (BF) into a carbon-rich molten iron. This liquid iron, called hot metal, is then transferred to a basic oxygen furnace (BOF) and transformed into liquid steel by blowing oxygen and making use of scrap and other additives. Each batch of liquid steel is often called ‘a heat’. A general overview of the BF and BOF processes is provided by Ghosh [2], and a detailed description of the BOF process can be found in Miller [3].
Hot metal is usually transported from the BF to the BOF in a lined vessel called a torpedo wagon or, succinctly, a torpedo. During transport and other torpedo operations, the hot metal undergoes a temperature drop that is not easily predictable. However, an estimation of the final hot metal temperature is required to calculate the relative quantities of hot metal, scrap, and other raw materials to be loaded into the BOF [4]. Given that these materials account for a significant part of steel cost and carbon footprint, accurate forecasting of the hot metal temperature becomes critical for optimal BOF operation. Consequently, temperature control from BF to BOF has been receiving much attention. However, existing studies usually focus on individual stages of the process, such as:
- Prediction and control of hot metal temperature at BF tapping using mathematical models [5,6], measurements [7], or a combination of both [8,9,10].
- Heat transfer modelling in the torpedo car and its effects on the refractory lining [11,12], or on the molten iron temperature and metallurgy [13,14].
- Optimum cycling of the torpedo fleet, with thermal losses affecting the cost structure [15,16].
General models for predicting the temperature evolution along the complete hot metal route, such as those existing for steel ladles, are far less common [17]. Although some mechanistic models have been proposed [18,19,20], this approach does not seem reliable in real plant conditions, since many relevant phenomena are difficult to measure or even characterize.
In a previous work, the authors studied the feasibility of infrared thermometry and time series forecasting for accurate prediction of the hot metal temperature at the steel plant [21]. A combination of both methods proved to be reliable and long-term stable, as well as self-adaptive to changing production scenarios. However, this research revealed the need to improve the regressive part of the model. The authors also explored the application of artificial neural networks (ANNs) to this problem but, although the resulting accuracy was satisfactory, the long-term stability of the model could not be guaranteed [22].
The present work aims to obtain an advanced hot metal temperature forecast at the BOF. A model based on multivariate regression can take full advantage of the information provided by the exogenous predictors already available in the process databases.
Multivariate adaptive regression splines (MARS) stands out among multivariate regression techniques. Since its introduction by Friedman [23,24] in the 1990s, it has been successfully applied in a variety of fields, including life sciences, finance, industry, business, energy, and the environment. However, for the moment, the application of MARS to steelmaking remains limited to steel solidification in continuous casting and solid-state transformations in hot and cold rolling processes [25,26,27]. Despite the predictive possibilities of the MARS technique, no application to BOF steelmaking has been published yet.
Temperature forecasting with MARS has mainly focused on natural systems [28,29]. Only very recently, Krzemień [30] applied this technique to predict the temperature evolution in underground coal gasification. In that problem, the syngas chemical composition, flowrate, and temperature were used to predict the syngas temperature 1 h ahead. The absolute error obtained was less than 15 °C in 95% of the cases, for a temperature range of about 200 °C.
Some properties of MARS make it an interesting choice for this research [31]. Firstly, it is more flexible than linear regression, allowing nonlinearities and interactions between variables to be modelled. Additionally, input data are automatically partitioned, limiting the effect of the undetected outliers that can be expected in a large industrial dataset. Moreover, MARS automatically includes important variables in the model and excludes unimportant ones; this helps model stability even under changing process conditions. MARS is also able to handle fairly large training datasets at low computational cost and then provides very fast predictions. This characteristic is very valuable for a process control model that is continuously trained and repeatedly executed. Finally, MARS produces models in which the effect of each predictor can be clearly interpreted. This is true not only for additive models but also when interactions between variables are present. This strength should not be underrated, since model interpretability strongly affects process improvement and industrial users’ satisfaction.
One drawback of the method is the non-differentiability of the model at the knots. This limitation can be overcome by locally smoothing the resulting piecewise model in a post-processing phase or by using local higher-order splines [23,32]. These variants can be interesting, for example, for global sensitivity analysis [33]. However, local non-differentiability does not usually affect the predictive performance of the model.
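The piecewise-linear form behind this behaviour can be made concrete with a minimal sketch; the knot location and coefficients below are purely illustrative and do not correspond to any model in this work:

```python
import numpy as np

def hinge(x, knot, sign):
    """Mirrored-pair MARS basis: max(0, x - knot) if sign = +1,
    max(0, knot - x) if sign = -1."""
    return np.maximum(0.0, sign * (x - knot))

def toy_mars(x):
    """Toy additive MARS model with one mirrored hinge pair at the
    knot x = 0.5 (intercept and coefficients are made up)."""
    return 0.3 + 0.8 * hinge(x, 0.5, +1) - 0.4 * hinge(x, 0.5, -1)

# The model is continuous everywhere, but its slope changes at the knot:
# it is 0.4 below x = 0.5 and 0.8 above it, so the derivative jumps
# exactly at the knot location.
```

The kink at the knot is precisely the non-differentiability discussed above; a smoothing post-process would round it off without changing the fit elsewhere.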
The feasibility of the MARS technique for hot metal temperature forecasting is investigated in the present work. The effect of the model hyperparameters on prediction accuracy is assessed. Moreover, a novel approach for continuous model training is adopted to ensure long-term adaptive operation of the model [34,35]. The resulting improved temperature forecast is used as an input for the BOF charge model and leads to environmental, productivity, and cost improvements [36,37,38].
4. Discussion
As indicated before, the application of the final MARS model {S = 1, I = 1, MF = 21, d = 2, L = 4, w = 2000} to the validation dataset provided 2195 evaluations of the method from t = 10,001 to t = 12,195. The resulting basis functions and coefficients in Equation (2) at t = 11,755 are shown in Table 3. The model at this point is taken as an example to discuss its features. In this case, the model comprises 12 basis functions, including the intercept term. Considering that 21 functions were allowed in the forward phase (MF = 21), it is inferred that the nine functions with the lowest contribution to the GCV were pruned in the backward phase.
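A prediction of this form is simply the coefficient-weighted sum of the surviving basis functions, each a product of at most d hinge terms. The following schematic evaluation uses made-up knots and coefficients, not the values of Table 3:

```python
import numpy as np

def hinge(x, knot, sign):
    """max(0, x - knot) for sign = +1, max(0, knot - x) for sign = -1."""
    return np.maximum(0.0, sign * (x - knot))

# Each basis function is a product of up to d = 2 hinges; the
# (variable index, knot, sign) triples below are illustrative only.
basis = [
    [],                                # intercept term
    [(0, 0.40, +1)],                   # h(x1 - 0.40)
    [(0, 0.40, -1)],                   # h(0.40 - x1), its mirrored pair
    [(1, 0.55, +1), (3, 0.30, -1)],    # degree-2 interaction term
]
coef = np.array([0.45, 0.60, -0.25, 0.35])

def predict(x):
    """Evaluate y_hat = c0 + sum_i c_i * B_i(x) for one input vector x."""
    B = np.array([np.prod([hinge(x[v], k, s) for v, k, s in b]) if b else 1.0
                  for b in basis])
    return float(coef @ B)

y_hat = predict(np.array([0.50, 0.60, 0.70, 0.20]))
```

Note how the mirrored pair for x1 contributes only on one side of its knot at a time, which is why a pruned symmetric partner simply zeroes the model's response on that side.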
As can be deduced from Table 3, the most important basis functions are those containing x1, x2, and x4, that is, the initial hot metal temperature, the total holding time, and the empty torpedo time, respectively. The empty ladle time and the pre-treatment time are less important; in fact, the actual temperature of the previous heat, x6 = yt−1, is a better predictor. Moreover, the MARS method automatically excludes the less important variables. For this particular heat (t = 11,755), the actual temperature four heats before, x9 = yt−4, is not included in the model. This indicates that its contribution to model performance is negligible or even adverse. This is not the general case for every heat; for example, at t = 10,656 a basis function is also included for x9. In general, it is beneficial to include x9 in the model to improve its performance, as indicated in Figure 3c for the curve w = 2000 at L = 4. It can be seen that some basis functions appear in pairs, as is the case for x1, x2, and x6. Other basis functions are individual, indicating that the corresponding symmetric functions were removed in the backward phase. A basis function without its symmetric variant indicates that the involved variable is relevant only above a particular value (as for x3, x5, x7, and x8) or only in the lower part of its range (as for x4). As can be seen, the adaptive knot location of the MARS method succeeds in capturing the nonlinearities of the data using segmented linear regression.
The model equations obtained for other heats have a very similar shape, with slight changes. The most important features are always present in the model; other characteristics appear or disappear depending on the history of the previous w heats.
The actual shape of the basis functions can be better understood by combining Table 3 with the graphical representation of the model in Figure 6. The line plots located on the diagonal of the mosaic show the model output, that is, the predicted hot metal temperature ŷ, against each individual input variable xi. The rest of the input variables are kept at their mid-points (xk = 0.5 for k ≠ i).
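These diagonal profiles can be reproduced from any fitted predictor by sweeping one normalized input over [0, 1] while pinning the others at 0.5. The sketch below uses a hypothetical stand-in model; in practice the trained MARS predictor would take its place:

```python
import numpy as np

def diagonal_profile(model, i, n_inputs, n_points=50):
    """Model response along input i, with all other inputs held at 0.5."""
    grid = np.linspace(0.0, 1.0, n_points)
    X = np.full((n_points, n_inputs), 0.5)   # pin every input at mid-point
    X[:, i] = grid                           # then sweep input i only
    return grid, np.array([model(x) for x in X])

# Stand-in model for illustration; its single hinge term depends on x0.
model = lambda x: 0.2 + 0.5 * max(0.0, x[0] - 0.3)

grid, profile = diagonal_profile(model, i=0, n_inputs=5)
```

The contour plots above the diagonal follow the same recipe with two swept inputs instead of one.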
The line plot of ŷ versus x1 indicates that the effect of the initial temperature tends to dampen as it increases. This is a coherent result, since higher thermal losses are foreseen from higher initial temperatures in all the phases of the process. A similar reasoning can be applied to the empty torpedo time, x4, considering that the thermal losses are expected to be higher for shorter times, since the lining temperature is higher. The effect of the holding time, x2, and the empty ladle time, x5, was found to be almost linear within the considered ranges, and with the anticipated slope, as can be seen in the related plots.
As expected, the higher the temperature of the previous heats, the higher the expected temperature of the current heat. Moreover, this effect seems to be relevant, or more relevant, only at higher temperatures. Although it can hardly be perceived in Figure 3 (but can be confirmed in Table 3), the slope of the line plot of ŷ versus x6 is smaller for x6 < 0.64. This behavior is more evident for x7 and x8, which are relevant only above 0.58 and 0.68, respectively.
The response surfaces for all the pairwise combinations of the input variables are represented in the contour plots located above the diagonal in Figure 6. Again, the input variables outside the considered pair are kept at the mid-points of their ranges (xk = 0.5 for k ≠ i and k ≠ j). It can be seen that the adaptive knot location succeeds in representing the non-linear features of the multivariate process dataset.
Finally, the response surface around the actual value of the input vector at heat number t = 11,755 is shown in the elements below the diagonal in Figure 6. Here, the color scale gives the predicted temperature (ŷ = 0.056) and the circle marks the input vector.
The proposed model is dynamically trained for every new prediction to be made. This dynamic behavior is better appreciated in Figure S1, an animated version of Figure 6 showing the model representation from t = 10,004 to t = 12,195. Figure S1 is included in the Supplementary Materials.
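This continuous training amounts to refitting on a sliding window of the last w heats before every prediction. A schematic version follows, with an ordinary least squares learner standing in for the MARS fit of this work:

```python
import numpy as np

def rolling_forecast(X, y, w, fit, predict):
    """Before each heat t, refit on the previous w observations only,
    then make a one-step-ahead forecast of y[t]."""
    preds = []
    for t in range(w, len(y)):
        model = fit(X[t - w:t], y[t - w:t])   # sliding training window
        preds.append(predict(model, X[t]))    # forecast the next heat
    return np.array(preds)

# Stand-in learner: ordinary least squares; the MARS training of the
# paper would take its place in the actual implementation.
ols_fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
ols_predict = lambda coef, x: float(x @ coef)

# Synthetic data: y is an exact linear function of the two inputs.
X = np.column_stack([np.linspace(0.0, 1.0, 20),
                     np.linspace(1.0, 0.0, 20) ** 2])
y = 2.0 * X[:, 0] - X[:, 1]
preds = rolling_forecast(X, y, w=5, fit=ols_fit, predict=ols_predict)
```

Because each heat gets its own freshly trained model, the fitted equation drifts with the process, which is exactly the behavior animated in Figure S1.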
The good predictive performance of the proposed model can be better appreciated by comparison with previously developed models, as shown in Figure 7. A common reference model is moving average smoothing (MAS), which forecasts the temperature as the average of the w previous observations. The best MAE for this method is 0.0721, obtained for w = 5; higher w values provide no benefit. However, for w > 100, time-series autoregressive integrated moving average with exogenous predictors (ARIMAX) models can be applied, providing a MAE below 0.0690 for w > 1000. On the other hand, for w > 20, any MARS configuration provides a major improvement not only over the simple reference models but also over the more advanced ARIMAX models. The initial base configuration for MARS {S = 1, I = 1, MF = 21, d = 2, L = 0, w = 1000} gives a MAE of 0.0518, a 25% error reduction with respect to ARIMAX. The best performing MARS configuration {S = 1, I = 1, MF = 21, d = 2, L = 4, w = 2000} provides a MAE of 0.0506, representing 27% and 30% error reductions with respect to ARIMAX and MAS, respectively. This model configuration is the new benchmark for this problem.
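The MAS reference is straightforward to express in code; a minimal version, applied here to a short synthetic normalized-temperature series (not plant data):

```python
import numpy as np

def mas_forecast(y, w):
    """Moving-average smoothing (MAS): forecast y[t] as the mean of
    the w previous observations, for t = w .. len(y) - 1."""
    return np.array([y[t - w:t].mean() for t in range(w, len(y))])

def mae(y_true, y_pred):
    """Mean absolute error between observed and forecast values."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic normalized-temperature series, for illustration only.
y = np.array([0.50, 0.52, 0.48, 0.55, 0.53, 0.51, 0.54, 0.50])
err = mae(y[3:], mas_forecast(y, w=3))
```

The ARIMAX and MARS alternatives are scored with the same MAE on the same normalized scale, which makes the error reductions quoted above directly comparable.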
As demonstrated in the previous section, the introduction of lagged terms as additional predictors further reduces the MAE of the model. However, it is worth noting that the higher the number of input variables, the higher the w required to obtain optimum model performance, as can be seen in Figure 3d and Figure 7. The base configuration of MARS {S = 1, I = 1, MF = 21, d = 2, L = 0} requires at least ten previous observations of the five input variables, as illustrated in Figure 7. Configurations with additional input variables would require more than one hundred previous observations of all the inputs. This limitation poses a potential problem when some records are missing in the process database. However, a judicious implementation of the model should alleviate this problem, as indicated by the shaded region in Figure 7. This region delimits the lowest MAE that can be achieved by applying the best configuration for the data window available at execution time.
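Such a fallback can be implemented by ordering the candidate models from most to least data-hungry and picking, at execution time, the first one whose minimum window is covered by the available history. The sketch below is hypothetical: the model names and window thresholds are illustrative placeholders, not the tuned values of this work.

```python
# Candidate configurations ordered from most to least demanding;
# the (minimum window, model name) pairs are illustrative only.
CONFIGS = [
    (100, "MARS with lagged predictors"),
    (10, "MARS base configuration"),
    (1, "MAS reference"),
]

def pick_config(available_window):
    """Return the best configuration trainable with the records at hand."""
    for min_w, name in CONFIGS:
        if available_window >= min_w:
            return name
    raise ValueError("no previous observations available")
```

With this rule, missing records degrade the forecast gracefully toward the simpler references instead of leaving the charge model without an input.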