
Temperature Prediction Model for a Regenerative Aluminum Smelting Furnace by a Just-in-Time Learning-Based Triple-Weighted Regularized Extreme Learning Machine

Guangxi Key Laboratory of Intelligent Control and Maintenance of Power Equipment, School of Electrical Engineering, Guangxi University, Nanning 530004, China
* Author to whom correspondence should be addressed.
Processes 2022, 10(10), 1972; https://doi.org/10.3390/pr10101972
Submission received: 1 September 2022 / Revised: 20 September 2022 / Accepted: 23 September 2022 / Published: 30 September 2022

Abstract
In a regenerative aluminum smelting furnace, real-time liquid aluminum temperature measurements are essential for process control. However, it is often very expensive to achieve accurate temperature measurements. To address this issue, a just-in-time learning-based triple-weighted regularized extreme learning machine (JITL-TWRELM) soft sensor modeling method is proposed for liquid aluminum temperature prediction. In this method, a weighted JITL method (WJITL) is adopted for updating the online local models to deal with the process time-varying problem. Moreover, a regularized extreme learning machine model considering both the sample similarities and the variable correlations was established as the local modeling method. The effectiveness of the proposed method is demonstrated in an industrial aluminum smelting process. The results show that the proposed method can meet the requirements of prediction accuracy of the regenerative aluminum smelting furnace.

1. Introduction

Aluminum can be alloyed with various metals and is widely used in the automotive, aviation, and military industries due to its good ductility, plasticity, recyclability, and oxidation resistance. The regenerative aluminum smelting furnace is important for the aluminum smelting process, in which the real-time measurement and control of the liquid aluminum temperature influence the quality of the aluminum. However, on industrial sites, many influencing factors, such as the aging of temperature-measuring thermocouples and fluctuations in the operating voltage, complicate the real-time measurement of the liquid aluminum temperature. Hence, it is essential to develop a modeling method to predict the liquid aluminum temperature for quality improvement. The aluminum smelting process is a typical complex industrial furnace production process, and in recent decades many studies on the mechanism modeling of industrial furnaces have been performed [1,2,3]. Although the physical meaning of mechanism modeling is clear, such models involve complicated calculations for industrial furnace systems, and they may not be reliable enough, since they usually rely on simplifying assumptions. Moreover, the furnace temperature, airflow rate, etc., fluctuate greatly across working states due to the intermittent working characteristics of the regenerative aluminum smelting furnace, so the real-time updating of the model is also a problem that needs to be considered.
To overcome the shortcomings of mechanism modeling, soft sensors that make full use of industrial data have been proposed [4]. Many researchers have worked on the data-driven modeling of industrial furnaces and similar processes, using methods such as partial least squares (PLS) [5], kernel principal component regression (KPCR) [6], and kernel partial least squares (KPLS) [7], which have been successfully applied with good results. However, these methods are generally global models trained offline, and once put into application they face problems such as difficulties in model updating. Consequently, to deal with the adaptive update problem, the moving window technique [8,9], recursive models [10,11], and the just-in-time learning (JITL) strategy [12,13] are usually used as online adaptive update strategies. The JITL strategy trains an online local model to predict the query sample by selecting similar samples from the historical samples, so it is well suited to processes such as industrial furnaces with state mutations. For example, Chen et al. [14] proposed a least squares support vector machine temperature prediction model based on JITL to deal with large temperature change lags in roller kilns. Dai et al. [15] combined the moving window technique and the JITL strategy as an update strategy to select similar samples in both the time and space dimensions, and verified the effectiveness of the proposed method on an industrial kiln. In [16], a locally weighted partial least squares regression (LWPLS) model was proposed based on JITL local modeling. In LWPLS, the samples most similar to the query sample are assigned different weights and selected for local modeling; the current model is discarded when the next query sample becomes available, and a new local PLS model is established for the online update. However, LWPLS only considers the sample similarities, not the variable correlations.
The data of the aluminum smelting process often present high-dimensional characteristics, and each input variable has a different degree of influence on the liquid aluminum temperature. Hence, in addition to the sample similarities, it is necessary to consider the variable correlations [17,18,19]. Furthermore, the accuracy of the JITL strategy depends on the quality of the selected samples. However, traditional similarity measurement criteria, such as the Euclidean distance and the Mahalanobis distance, only consider the input information and ignore the output information, and thus often cannot obtain accurately similar samples. Investigating new similarity measurement criteria is therefore important for the JITL strategy.
In recent years, artificial intelligence algorithms, such as long short-term memory networks (LSTM) [20,21,22] and the extreme learning machine (ELM) [23,24,25,26], have also been used in soft sensor modeling. The basic assumption of LSTM is that process data are sampled at an even, unified frequency; this condition is very difficult to meet for process data measurements in industrial processes, especially for quality variables. Hence, LSTM is unsuitable for some processes with irregular sampling frequencies. ELM is a single-hidden-layer neural network with low algorithmic complexity, which does not need backpropagation to solve iteratively, and it has been used in the temperature prediction of regenerative aluminum smelting processes. Huang et al. [27] proposed an ELM furnace temperature prediction model based on kernel principal component analysis and showed that ELM performs better than the traditional BP neural network. Liu et al. [28] proposed an ELM model optimized by the restricted Boltzmann machine (RBM) to address the random initialization of the input weights and biases in ELM. Moreover, ELM has a fast learning speed and is suitable as an online prediction model. For example, Li et al. [29] built a local online ELM model combined with a JITL strategy, allowing the online prediction of polyethylene terephthalate (PET) viscosity without relying on time-consuming laboratory analysis procedures. However, this ELM-based online prediction model considers neither sample similarities nor variable correlations, which is unreasonable in local modeling. Moreover, the original ELM runs the risk of overfitting; hence, the regularized extreme learning machine (RELM) [30] was proposed to solve the overfitting problem.
Although some research has been carried out on ELM, there is little discussion of sample similarities and variable correlations in RELM, especially for temperature prediction. Based on the above discussion, a JITL-based triple-weighted regularized extreme learning machine (JITL-TWRELM) soft sensor modeling method is proposed to solve these problems. Compared with the traditional data-driven modeling methods described above, the proposed method not only allows real-time updating of the model but also obtains more accurate local modeling samples through the WJITL strategy, which uses the correlation information between the input and output variables in the sample selection stage. Meanwhile, in the local modeling stage, the proposed method overcomes the shortcoming of traditional local modeling, which considers only the sample similarities, by additionally analyzing the variable correlations and highlighting the influences of different variables on the output. The remainder of this article is structured as follows. Firstly, the regenerative aluminum smelting furnace is briefly introduced. Secondly, the regularized extreme learning machine (RELM), sample weighted regularized extreme learning machine (SWRELM), and variable weighted regularized extreme learning machine (VWRELM) are introduced. Then, the JITL-based triple-weighted regularized extreme learning machine (JITL-TWRELM) is described. Next, the flexibility and effectiveness of the proposed method are validated on the industrial aluminum smelting process. Finally, we present the conclusions.

2. Related Methods

Since ELM runs the risk of overfitting, a regularization method is used to address this problem. To account for sample similarities and variable correlations, the sample weighted regularized extreme learning machine (SWRELM) and the variable weighted regularized extreme learning machine (VWRELM) are introduced, respectively. Three related methods are discussed next. To aid the derivation of the relevant equations, the symbol definitions used in this paper are shown in Table 1.

2.1. RELM

As shown in Figure 1, the structure of ELM consists of three parts: the input layer, hidden layer, and output layer [31]. The core idea of ELM is to randomly select the input weights and hidden layer biases of the network; the output weights between the hidden layer and output layer are then obtained by minimizing the loss function via the Moore–Penrose generalized inverse. Owing to the particularity of the single-hidden-layer structure, ELM has a faster learning speed, requires minimal human interference, and is easier to implement than traditional networks. However, the original ELM model only considers the empirical risk minimization (ERM) principle, which tends to result in overfitting. To overcome this deficiency, the regularized extreme learning machine (RELM) was proposed based on both the empirical risk minimization and structural risk minimization (SRM) principles, and it has been shown to achieve better generalization performance than ELM.
It is assumed that the $n$th historical input variable vector and the output variable are denoted as $\mathbf{x}_n = [x_{n1}, x_{n2}, \ldots, x_{nm}]$ and $t_n$, respectively, where $m$ is the number of input variables. $(\mathbf{x}_n, t_n)$ is the $n$th historical sample composed of $\mathbf{x}_n$ and $t_n$. The output function of the RELM with $L$ hidden layer neurons can be represented as
$$\sum_{i=1}^{L} \beta_i\, g(\boldsymbol{\omega}_i \mathbf{x}_j^T + b_i) = t_j, \quad j = 1, \ldots, N \tag{1}$$
where $\beta_i$ is the output weight of the $i$th hidden layer unit; $\boldsymbol{\omega}_i = [\omega_{i1}, \ldots, \omega_{im}]$ and $b_i$ are the input weight vector and the bias connecting the input layer and the $i$th hidden layer unit, respectively; $\mathbf{x}_j = [x_{j1}, x_{j2}, \ldots, x_{jm}]$ is the input variable vector; $t_j$ denotes the output corresponding to $\mathbf{x}_j$; and $N$ is the number of training samples. $g(\cdot)$ is the activation function, usually set as the sigmoid function. Rewriting Equation (1) in matrix form gives
$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T} \tag{2}$$
where
$$\mathbf{H} = \left[h(\mathbf{x}_1^T)^T, \ldots, h(\mathbf{x}_N^T)^T\right]^T = \begin{bmatrix} g(\boldsymbol{\omega}_1 \mathbf{x}_1^T + b_1) & \cdots & g(\boldsymbol{\omega}_L \mathbf{x}_1^T + b_L) \\ \vdots & \ddots & \vdots \\ g(\boldsymbol{\omega}_1 \mathbf{x}_N^T + b_1) & \cdots & g(\boldsymbol{\omega}_L \mathbf{x}_N^T + b_L) \end{bmatrix} \tag{3}$$
$$\boldsymbol{\beta} = [\beta_1, \ldots, \beta_L]^T \tag{4}$$
$$\mathbf{T} = [t_1, \ldots, t_N]^T \tag{5}$$
Since $\boldsymbol{\omega}_i$ and $b_i$ are randomly given, to obtain the output weight vector $\boldsymbol{\beta}$, the optimization problem can be formulated as
$$\min_{\boldsymbol{\beta}} \; \frac{1}{2}\|\boldsymbol{\beta}\|^2 + \frac{C}{2}\|\boldsymbol{\xi}\|^2, \quad \text{s.t.} \;\; h(\mathbf{x}_j^T)\boldsymbol{\beta} = t_j + \xi_j, \; j = 1, \ldots, N \tag{6}$$
where $C$ represents the regularization coefficient, which balances the empirical risk and the structural risk, and $\boldsymbol{\xi} = [\xi_1, \ldots, \xi_N]^T$ is the training error vector. By constructing the Lagrange function, the solution of Equation (6) is
$$\boldsymbol{\beta} = \begin{cases} \left(\mathbf{H}^T\mathbf{H} + \dfrac{\mathbf{I}_L}{C}\right)^{-1}\mathbf{H}^T\mathbf{T}, & L < N \\[6pt] \mathbf{H}^T\left(\mathbf{H}\mathbf{H}^T + \dfrac{\mathbf{I}_N}{C}\right)^{-1}\mathbf{T}, & L > N \end{cases} \tag{7}$$
where $\mathbf{I}_L \in \mathbb{R}^{L \times L}$ and $\mathbf{I}_N \in \mathbb{R}^{N \times N}$ are identity matrices.
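The RELM training procedure above can be summarized in a few lines of NumPy: fix the input weights and biases at random, build the hidden-layer output matrix, and solve the ridge-regularized linear system of Equation (7). This is a minimal illustrative sketch, not the authors' code; the function and variable names are ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_relm(X, t, L=20, C=150.0, rng=None):
    """Train an RELM: random (omega, b), closed-form output weights (Eq. (7))."""
    rng = np.random.default_rng(rng)
    N, m = X.shape
    omega = rng.uniform(-1, 1, size=(L, m))  # random input weights, never updated
    b = rng.uniform(-1, 1, size=L)           # random hidden biases
    H = sigmoid(X @ omega.T + b)             # N x L hidden-layer output matrix
    if L < N:   # first case of Eq. (7): solve an L x L system
        beta = np.linalg.solve(H.T @ H + np.eye(L) / C, H.T @ t)
    else:       # second case: solve an N x N system instead
        beta = H.T @ np.linalg.solve(H @ H.T + np.eye(N) / C, t)
    return omega, b, beta

def predict_relm(X, omega, b, beta):
    return sigmoid(X @ omega.T + b) @ beta
```

The two branches return the same weights mathematically; picking the smaller system is purely a computational choice.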

2.2. SWRELM

Not all samples contribute equally to the output; the original RELM treats all samples as equally important and does not consider the differences between them. Thus, to obtain a more realistic result, the sample weight matrix $\boldsymbol{\Omega}_s = \mathrm{diag}(\Omega_{s1}, \ldots, \Omega_{sN})$ is added to Equation (6), which is expressed as
$$\min_{\boldsymbol{\beta}_S} \; \frac{1}{2}\|\boldsymbol{\beta}_S\|^2 + \frac{C}{2}\|\boldsymbol{\Omega}_s\boldsymbol{\xi}\|^2, \quad \text{s.t.} \;\; h(\mathbf{x}_j^T)\boldsymbol{\beta}_S = t_j + \xi_j, \; j = 1, \ldots, N \tag{8}$$
The Lagrange function can be represented as follows:
$$L(\boldsymbol{\beta}_S, \boldsymbol{\xi}, \boldsymbol{\lambda}) = \frac{1}{2}\|\boldsymbol{\beta}_S\|^2 + \frac{C}{2}\|\boldsymbol{\Omega}_s\boldsymbol{\xi}\|^2 - \sum_{j=1}^{N}\lambda_j\left(h(\mathbf{x}_j^T)\boldsymbol{\beta}_S - t_j - \xi_j\right) = \frac{1}{2}\|\boldsymbol{\beta}_S\|^2 + \frac{C}{2}\|\boldsymbol{\Omega}_s\boldsymbol{\xi}\|^2 - \boldsymbol{\lambda}(\mathbf{H}\boldsymbol{\beta}_S - \mathbf{T} - \boldsymbol{\xi}) \tag{9}$$
where $\boldsymbol{\lambda} = [\lambda_1, \ldots, \lambda_N]$ denotes the Lagrange multiplier vector. According to the KKT conditions, taking the derivatives of Equation (9) and setting them to zero, we have
$$\frac{\partial L}{\partial \boldsymbol{\beta}_S} = 0 \;\Rightarrow\; \boldsymbol{\beta}_S^T = \boldsymbol{\lambda}\mathbf{H} \tag{10}$$
$$\frac{\partial L}{\partial \boldsymbol{\xi}} = 0 \;\Rightarrow\; C\boldsymbol{\xi}^T\boldsymbol{\Omega}_s^2 + \boldsymbol{\lambda} = 0 \tag{11}$$
$$\frac{\partial L}{\partial \boldsymbol{\lambda}} = 0 \;\Rightarrow\; \mathbf{H}\boldsymbol{\beta}_S = \mathbf{T} + \boldsymbol{\xi} \tag{12}$$
With Equations (11) and (12), the Lagrange multiplier vector λ can be expressed as
$$\boldsymbol{\lambda} = -C(\mathbf{H}\boldsymbol{\beta}_S - \mathbf{T})^T\boldsymbol{\Omega}_s^2 \tag{13}$$
Similarly, with Equations (10) and (13), the expression of the output weight vector of the sample weighted regularized extreme learning machine (SWRELM) is
$$\boldsymbol{\beta}_S = \left(\mathbf{H}^T\boldsymbol{\Omega}_s^2\mathbf{H} + \frac{\mathbf{I}_L}{C}\right)^{-1}\mathbf{H}^T\boldsymbol{\Omega}_s^2\mathbf{T}, \quad L < N \tag{14}$$
Equation (14) is suitable when the number of modeling samples is greater than the number of hidden neurons; in this case, $\boldsymbol{\beta}_S$ can also be computed faster [32].
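Equation (14) reduces to a single weighted least-squares solve once $\mathbf{H}$ and the sample weights are available. The following sketch (illustrative names, not the paper's code) assumes $\mathbf{H}$ has already been computed as in Equation (3):

```python
import numpy as np

def swrelm_output_weights(H, T, sample_weights, C=150.0):
    """Equation (14): beta_S = (H^T W^2 H + I_L/C)^{-1} H^T W^2 T,
    where W = diag(sample_weights) is the sample weight matrix Omega_s."""
    L = H.shape[1]
    W2 = np.diag(sample_weights ** 2)          # Omega_s squared
    A = H.T @ W2 @ H + np.eye(L) / C
    return np.linalg.solve(A, H.T @ W2 @ T)
```

With all sample weights equal to one, this collapses to the plain RELM solution of Equation (7), which is a convenient sanity check.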

2.3. VWRELM

The original RELM treats all input variables with equal importance. However, not all input variables have the same effect on the output variable; some are more strongly correlated with it than others. Thus, to reflect the differences among input variables and obtain better quality-related features, a variable contribution method based on the Pearson correlation coefficient is adopted. On this basis, the variable weighted regularized extreme learning machine (VWRELM) is proposed. The Pearson correlation coefficient is defined as
$$\rho = \frac{E(xt) - E(x)E(t)}{\sqrt{E(x^2) - E^2(x)}\sqrt{E(t^2) - E^2(t)}} \tag{15}$$
where $E(x)$ and $E(t)$ are the expectations of a single input variable and the output variable, respectively, and $\rho$ represents the degree of correlation between the two variables; a larger $|\rho|$ indicates a stronger correlation. As a result, the variable contribution can be defined through $\rho$. For training samples $(\mathbf{x}_n, t_n)$, $n = 1, \ldots, k$, where each input sample $\mathbf{x}_n$ has $m$ dimensions, the contribution of each variable can be defined as
$$v_i = \frac{|\rho_i|}{\sum_{j=1}^{m}|\rho_j|}, \quad i = 1, \ldots, m \tag{16}$$
where $\rho_i$ represents the Pearson correlation coefficient between the $i$th input variable and the output variable. The variable contribution matrix can be written as
$$\mathbf{V} = \mathrm{diag}(v_1, \ldots, v_m) \tag{17}$$
Hence, taking the variable contributions as the variable weights and applying them to the input sample $\mathbf{x}_n$, the weighted input sample can be expressed as
$$\mathbf{x}_n^v = \mathbf{x}_n\mathbf{V} = \mathbf{x}_n\,\mathrm{diag}(v_1, \ldots, v_m) = (x_{n1}v_1, \ldots, x_{nm}v_m) \tag{18}$$
where $\mathbf{x}_n^v$ represents the input sample weighted by the variable weights. As Equation (18) shows, each dimension of the input sample is given a different weight, reflecting the differences between variables. With variable weighting, Equation (3) can be rewritten as
$$\mathbf{H}^V = \left[h((\mathbf{x}_1^v)^T)^T, \ldots, h((\mathbf{x}_N^v)^T)^T\right]^T = \begin{bmatrix} g(\boldsymbol{\omega}_1(\mathbf{x}_1^v)^T + b_1) & \cdots & g(\boldsymbol{\omega}_L(\mathbf{x}_1^v)^T + b_L) \\ \vdots & \ddots & \vdots \\ g(\boldsymbol{\omega}_1(\mathbf{x}_N^v)^T + b_1) & \cdots & g(\boldsymbol{\omega}_L(\mathbf{x}_N^v)^T + b_L) \end{bmatrix} \tag{19}$$
When $L < N$, the output weight vector is
$$\boldsymbol{\beta}^V = \left((\mathbf{H}^V)^T\mathbf{H}^V + \frac{\mathbf{I}_L}{C}\right)^{-1}(\mathbf{H}^V)^T\mathbf{T} \tag{20}$$
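The Pearson-based variable contributions of Equations (15)–(16) are straightforward to compute; a minimal sketch (names are ours, and constant input columns are assumed absent, since their correlation is undefined):

```python
import numpy as np

def variable_weights(X, t):
    """Variable contributions v_i = |rho_i| / sum_j |rho_j| (Eqs. (15)-(16)).
    X is N x m; t is the length-N output vector. Columns of X must be
    non-constant, otherwise the correlation coefficient is undefined."""
    m = X.shape[1]
    rho = np.array([np.corrcoef(X[:, i], t)[0, 1] for i in range(m)])
    return np.abs(rho) / np.sum(np.abs(rho))   # nonnegative, sums to 1
```

Applying the weights is then just a row-wise scaling, `X * variable_weights(X, t)`, which realizes Equation (18).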

3. The Proposed JITL-TWRELM Model

In the previous analysis, the RELM, SWRELM, and VWRELM models were established. However, in a multi-sample, multivariate prediction model, different samples and variables contribute differently to the predicted output, especially in the aluminum smelting process. Table 2 summarizes the shortcomings of the three methods: both sample similarities and variable correlations should be taken into account in RELM. Hence, to obtain a better model, a JITL-based triple-weighted regularized extreme learning machine combined with the weighted JITL strategy (WJITL) is proposed.

3.1. Weighted Similarity Measurement Criterion

The original Euclidean distance is usually used as a similarity measurement criterion, expressed as
$$d_{on} = \sqrt{(\mathbf{x}_q - \mathbf{x}_n)(\mathbf{x}_q - \mathbf{x}_n)^T} \tag{21}$$
where $\mathbf{x}_q \in \mathbb{R}^{1 \times m}$ is the current query sample, $\mathbf{x}_n \in \mathbb{R}^{1 \times m}$ is the $n$th historical sample, and $d_{on}$ indicates the Euclidean distance between them. The more similar the historical sample is to the query sample, the smaller the distance $d_{on}$. However, Equation (21) only uses the input information of the historical and query samples; the output information is not taken into consideration. Moreover, the calculation of the Euclidean distance can be regarded as an accumulation over each dimension of the sample, and the importance of each dimension may differ, with some dimensions contributing more to the distance than others. Hence, inspired by Equation (15), the connections between the input variables and the output variable are established through correlation analysis, and a weighted Euclidean distance is defined as the weighted similarity measurement criterion:
$$d_{on}^w = \sqrt{(\mathbf{x}_q - \mathbf{x}_n)\boldsymbol{\Omega}_v(\mathbf{x}_q - \mathbf{x}_n)^T} \tag{22}$$
where $\boldsymbol{\Omega}_v = \mathrm{diag}(\rho_1, \ldots, \rho_m)$. Then, the sample weight is expressed as
$$\Omega_{sn} = \exp\!\left(-\frac{d_{on}^w}{\varphi^2}\right) \tag{23}$$
where $\varphi$ is the adjustment parameter, which controls how quickly the weight value decays with the sample distance. For brevity, the JITL strategy that applies this weighted similarity measurement criterion is called WJITL.
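Equations (22)–(23) can be sketched as two small functions. One detail to note: for the square root in Equation (22) to stay real, the diagonal weights must be nonnegative, so this sketch uses $|\rho_i|$ on the diagonal; that absolute value, like the function names, is our assumption for illustration:

```python
import numpy as np

def weighted_distance(x_q, x_n, rho):
    """Weighted Euclidean distance, Eq. (22), using |rho| on the diagonal
    so the value under the square root is nonnegative."""
    d = x_q - x_n
    return np.sqrt(d @ (np.abs(rho) * d))   # diag product without a full matrix

def sample_weight(d_w, phi=0.3):
    """Eq. (23): similar samples (small distance) get weights near 1;
    phi controls how fast the weight decays with distance."""
    return np.exp(-d_w / phi ** 2)
```

A sample identical to the query gets distance 0 and weight 1; weights shrink monotonically as the weighted distance grows.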

3.2. JITL-TWRELM

A JITL-based triple-weighted regularized extreme learning machine (JITL-TWRELM) soft sensor method, combined with the WJITL strategy, was established to simultaneously incorporate sample weights and variable weights. The detailed derivation steps are as follows.
For each query sample, $N$ ($N < H$) samples are selected from the $H$ historical samples $(\mathbf{x}_n, t_n)$, $n = 1, \ldots, H$, for local modeling. First, the Pearson correlation coefficients between the input variables and the output variable of all historical samples are calculated to obtain the correlation coefficient matrix
$$\boldsymbol{\Omega}_v^g = \mathrm{diag}(\rho_1^g, \ldots, \rho_m^g) \tag{24}$$
To distinguish it from the subsequent derivation, we call $\boldsymbol{\Omega}_v^g$ the global correlation coefficient matrix, where $\rho_i^g$, $i = 1, \ldots, m$, is the global correlation coefficient. The weighted Euclidean distance between the query sample and the historical samples can then be obtained by Equation (25).
$$d_{on}^{tw} = \sqrt{(\mathbf{x}_q - \mathbf{x}_n)\boldsymbol{\Omega}_v^g(\mathbf{x}_q - \mathbf{x}_n)^T}, \quad n = 1, \ldots, H \tag{25}$$
We sort $d_{on}^{tw}$, $n = 1, \ldots, H$, in ascending order, and the first $N$ samples are selected as modeling samples. The sample weight matrix is obtained as
$$\boldsymbol{\Omega}_s^t = \mathrm{diag}\!\left(\exp\!\left(-\frac{d_{o1}^{tw}}{\varphi^2}\right), \ldots, \exp\!\left(-\frac{d_{oN}^{tw}}{\varphi^2}\right)\right) = \mathrm{diag}(\Omega_{s1}^t, \ldots, \Omega_{sN}^t) \tag{26}$$
Then, the Pearson correlation coefficient of N local modeling samples is calculated, and the local correlation coefficient matrix is obtained as
$$\boldsymbol{\Omega}_v^l = \mathrm{diag}(\rho_1^l, \ldots, \rho_m^l) \tag{27}$$
where $\boldsymbol{\Omega}_v^l$ is used as the local variable weight matrix for the local modeling samples and the query sample:
$$\mathbf{X}^w = \mathbf{X}\boldsymbol{\Omega}_v^l = \left[\mathbf{x}_n^w\right], \quad n = 1, \ldots, N \tag{28}$$
$$\mathbf{x}_q^w = \mathbf{x}_q\boldsymbol{\Omega}_v^l \tag{29}$$
where $\mathbf{X} \in \mathbb{R}^{N \times m}$ consists of the local modeling samples, and $\mathbf{X}^w$ and $\mathbf{x}_q^w$ are the variable-weighted local modeling samples and the variable-weighted query sample, respectively. Thus, the new local modeling dataset $(\mathbf{x}_n^w, t_n)$, $n = 1, \ldots, N$, is used to build the local model. The optimization problem for the output weight vector is established as Equation (30):
$$\min_{\boldsymbol{\beta}^t} \; \frac{1}{2}\|\boldsymbol{\beta}^t\|^2 + \frac{C}{2}\|\boldsymbol{\Omega}_s^t\boldsymbol{\xi}\|^2, \quad \text{s.t.} \;\; h((\mathbf{x}_j^w)^T)\boldsymbol{\beta}^t = t_j + \xi_j, \; j = 1, \ldots, N \tag{30}$$
The output matrix of the hidden layer is
$$\mathbf{H}^t = \begin{bmatrix} g(\boldsymbol{\omega}_1(\mathbf{x}_1^w)^T + b_1) & \cdots & g(\boldsymbol{\omega}_L(\mathbf{x}_1^w)^T + b_L) \\ \vdots & \ddots & \vdots \\ g(\boldsymbol{\omega}_1(\mathbf{x}_N^w)^T + b_1) & \cdots & g(\boldsymbol{\omega}_L(\mathbf{x}_N^w)^T + b_L) \end{bmatrix} \tag{31}$$
According to Equations (9)–(14), the output weight vector of JITL-TWRELM is
$$\boldsymbol{\beta}^t = \left((\mathbf{H}^t)^T(\boldsymbol{\Omega}_s^t)^2\mathbf{H}^t + \frac{\mathbf{I}_L}{C}\right)^{-1}(\mathbf{H}^t)^T(\boldsymbol{\Omega}_s^t)^2\mathbf{T}, \quad L < N \tag{32}$$
Finally, the prediction output of the query sample is
$$t_q^t = \sum_{i=1}^{L}\beta_i^t\, g\!\left(\boldsymbol{\omega}_i(\mathbf{x}_q^w)^T + b_i\right) \tag{33}$$
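Putting Equations (24)–(33) together, the whole single-query pipeline fits in one function. The sketch below is illustrative only: the names are ours, and absolute correlation values are used in the diagonal weight matrices (an assumption made here so the weighted distances stay nonnegative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pearson_weights(X, t):
    """|rho_i| for each column of X against t (used for Eqs. (24) and (27))."""
    return np.abs(np.array(
        [np.corrcoef(X[:, i], t)[0, 1] for i in range(X.shape[1])]))

def jitl_twrelm_predict(X_hist, t_hist, x_q, N=200, L=20, C=150.0,
                        phi=0.3, rng=0):
    """One-query JITL-TWRELM sketch, Eqs. (24)-(33)."""
    rng = np.random.default_rng(rng)
    m = X_hist.shape[1]
    # global correlation weights and weighted distances (Eqs. (24)-(25))
    w_g = pearson_weights(X_hist, t_hist)
    diffs = X_hist - x_q
    d = np.sqrt(np.sum(diffs ** 2 * w_g, axis=1))
    idx = np.argsort(d)[:N]                    # N most similar samples
    X_loc, t_loc = X_hist[idx], t_hist[idx]
    s = np.exp(-d[idx] / phi ** 2)             # sample weights, Eq. (26)
    # local variable weighting of samples and query (Eqs. (27)-(29))
    w_l = pearson_weights(X_loc, t_loc)
    Xw, xqw = X_loc * w_l, x_q * w_l
    # triple-weighted RELM output weights (Eqs. (31)-(32))
    omega = rng.uniform(-1, 1, size=(L, m))
    b = rng.uniform(-1, 1, size=L)
    H = sigmoid(Xw @ omega.T + b)
    s2 = s ** 2
    beta = np.linalg.solve(H.T @ (s2[:, None] * H) + np.eye(L) / C,
                           H.T @ (s2 * t_loc))
    return sigmoid(omega @ xqw + b) @ beta     # prediction, Eq. (33)
```

Per Step 7 of the procedure described below this section, the model built here is discarded once the next query sample arrives and a new one is fitted, which is what makes the scheme just-in-time.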

4. Industrial Case

4.1. Process Description of the Regenerative Aluminum Smelting Furnace

An industrial regenerative aluminum smelting furnace and its internal structure are shown in Figure 2a and Figure 2b, respectively. The regenerative aluminum smelting furnace consists of a furnace chamber, regenerative burners (each including a burner and a ceramic sphere accumulator), a reversing valve, a flue gas pipe, etc. The regenerative burners are arranged in pairs, with two opposite burners forming a group (A and B). Normal-temperature air from the blower enters burner B through the reversing valve and is heated as it flows through the hot ceramic sphere accumulator, reaching a temperature close to that of the furnace chamber (generally 80% to 90% of the furnace chamber temperature). The heated high-temperature air enters the furnace chamber and rolls up the surrounding flue gas to form a thin, oxygen-poor, high-temperature airflow with an oxygen content lower than 21%. The mixture of the oxygen-poor high-temperature air and the injected flue gas is then ignited to smelt the aluminum material. At the same time, the high-temperature flue gas passes through burner A, its heat is stored in the cold ceramic sphere accumulator, and the flue gas is discharged through the flue gas pipe at a temperature lower than 150 °C. When the stored heat reaches saturation, the reversing valve switches, burners A and B exchange their combustion and heat-storage roles, and the cycle repeats, saving energy and reducing emissions.

4.2. Model Establishment

To construct the model for the prediction of the liquid aluminum temperature, 12 secondary variables were chosen as the input variables, as shown in Table 3. These input variables were measured by sensors; the measurement ranges and errors of the sensors are shown in Table 4. The sampling interval of each sampling point was five minutes. A total of 4400 data samples were collected for modeling, of which 4000 samples were used as historical data for training and 400 samples for model testing. To better test the effectiveness of the proposed method, two groups of data (D1 and D2) from different periods were used as the testing datasets, with 200 samples in each group.
The flowchart of JITL-TWRELM model is shown in Figure 3. To validate the performance of JITL-TWRELM, the six methods listed below were employed for comparison.
  • Method 1: JITL-RELM (it applies the original JITL strategy and original RELM).
  • Method 2: JITL-SWRELM (it applies the original JITL strategy and sample weights on RELM).
  • Method 3: WJITL-RELM (it applies the WJITL strategy and original RELM).
  • Method 4: JITL-VWRELM (it applies the original JITL strategy and local variable weights on RELM).
  • Method 5: JITL-DWRELM (it applies the WJITL strategy and sample weights on RELM).
  • Method 6: JITL-TWRELM (it applies the WJITL strategy, sample weights, and local variable weights on RELM).
The detailed step-by-step procedure of the proposed method is as follows.
Step 1: Prepare the input and output variables of the historical samples and perform the standardization.
Step 2: Determine the number N of training samples selected from the total historical samples, the parameter φ for the sample weight calculation, the hidden neuron number L, and the regularization coefficient C of the regularized extreme learning machine.
Step 3: Analyze the global correlation between the input variables and output variables of all historical samples. The global correlation coefficient matrix Ω v g is calculated for the sample similarity measurement.
Step 4: Calculate the weighted Euclidean distances between the current query samples and the training samples; N samples closest to the current query sample are selected as local modeling samples.
Step 5: Analyze the local correlation between input variables and output variables of the local modeling samples. The local correlation coefficient matrix Ω v l is determined.
Step 6: The JITL-TWRELM model is established, and the output of the current query sample is predicted.
Step 7: Before the next query sample arrives, the previous model is discarded and a new model is constructed based on the next query sample, enabling real-time updating of the model.
To evaluate the performance of the proposed method, four indices are used: the mean absolute error ($MAE$), root mean squared error ($RMSE$), mean absolute percentage error ($MAPE$), and coefficient of determination ($R^2$), defined as follows:
$$MAE = \frac{1}{N_T}\sum_{i=1}^{N_T}\left|y_i - \hat{y}_i\right| \tag{34}$$
$$RMSE = \sqrt{\frac{1}{N_T}\sum_{i=1}^{N_T}\left(y_i - \hat{y}_i\right)^2} \tag{35}$$
$$MAPE = \frac{1}{N_T}\sum_{i=1}^{N_T}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{36}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N_T}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N_T}\left(y_i - \bar{y}\right)^2} \tag{37}$$
where $N_T$ denotes the number of samples used for testing, $y_i$ and $\hat{y}_i$ denote the actual and predicted output values, respectively, and $\bar{y}$ denotes the mean value of the actual output variable. A good prediction model should have small $MAE$, $RMSE$, and $MAPE$, and a large $R^2$.
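The four indices of Equations (34)–(37) translate directly into NumPy; a minimal sketch (function name is ours):

```python
import numpy as np

def evaluation_indices(y, y_hat):
    """MAE, RMSE, MAPE, and R^2 as defined in Equations (34)-(37).
    y and y_hat are the actual and predicted output vectors."""
    err = y - y_hat
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = np.mean(np.abs(err / y))            # assumes y has no zeros
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y - np.mean(y)) ** 2)
    return mae, rmse, mape, r2
```

A perfect prediction yields $(0, 0, 0, 1)$, and a uniform offset of one unit yields $MAE = RMSE = 1$, which makes the function easy to sanity-check.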
Before establishing the JITL-TWRELM model, four parameters need to be determined. By trial-and-error experiments on dataset D1, $N$ was set to 200, which gives good prediction accuracy without increasing the computational burden. Similarly, the parameters $\varphi$ and $L$ were set to 0.3 and 20, respectively. Table 5 shows the prediction accuracy of the model under different regularization coefficients; the model performs best when $C = 150$.

4.3. Results and Discussion

To reduce the effect of randomness on the results, we took the average of ten tests as the final result. The prediction error indices of the six methods on the two groups of testing samples are shown in Table 6. Taking testing dataset D1 as an example, the proposed method (method 6) performed better than the other five methods on all four indices. Despite using the JITL strategy, the original method (method 1) had the worst performance on all indices, as neither sample weights nor variable weights were used; methods 2, 3, 4, and 5 all achieved higher prediction accuracies than method 1. Method 2 emphasizes the importance of the samples and introduces sample weights to reflect the effects of different samples on the output. In contrast to the original JITL strategy, method 3 uses the weighted similarity measurement criterion, so samples more similar to the query sample were selected to set up the local model, resulting in a more accurate prediction. Different from the previous methods, method 4 considers the local variable weights before establishing the model; the variable weights improve the influence of output-related variables and reduce that of irrelevant variables in feature extraction [33]. Although methods 2 to 4 achieve good improvements in prediction accuracy, each considers only one type of weighting strategy, such as individual sample weights or variable weights. Method 5 therefore combines the sample weights with the WJITL strategy, and its $R^2$ is improved from 0.89453 to 0.97690 compared with method 1. Building on methods 4 and 5, method 6 has the smallest $MAE$, $RMSE$, and $MAPE$, and the highest $R^2$ among all methods; its $R^2$ is improved from 0.97690 to 0.98764 compared with method 5. Correspondingly, method 6 also has good prediction accuracy on D2, where its $R^2$ reached 0.97427, which is 0.072 higher than that of method 1.
To more intuitively demonstrate the performances of these six methods, the detailed prediction results for each method on D1 and D2 are shown in Figure 4 and Figure 5, in which (a–f) show the prediction results of the six methods, respectively. It is easy to see that the prediction of JITL-TWRELM matches well with the curve of the actual measured furnace temperature, while the prediction curve of JITL-RELM cannot track the real output curve for some samples. In addition, although the other four methods show certain improvements, they still do not achieve the desired effects. In summary, the flexibility and effectiveness of the proposed method are validated.
The liquid aluminum temperature of the regenerative smelting furnace is generally controlled by feedback: a thermocouple is set in the furnace chamber, and if the detected temperature is lower than the set value, the regenerative burner starts to work. On a real industrial site, the measurement performance of the thermocouple used for the liquid aluminum temperature is often affected by voltage fluctuations and the aging of the protective jacket, and aged thermocouples need to be replaced frequently, increasing costs. The method proposed in this paper only requires the establishment of a historical database on the industrial site; whenever a new query sample arrives, modeling samples are selected from the historical database, and a prediction of the liquid aluminum temperature is obtained. As can be seen from Table 6, the $MAE$s of the proposed method 6 are 14.7273 and 14.8733 for the two test sets, respectively. Compared with the temperature range and measurement error of the thermocouple in Table 4, the accuracy of the proposed soft sensor model is close to that of the actual sensor, with an error close to 2% of the maximum of the measurement range, while being more advantageous in efficiency and cost. Therefore, the method proposed in this paper is significant for reducing production costs and improving product quality.

5. Conclusions

This paper mainly deals with the estimation of the liquid aluminum temperature in the regenerative aluminum smelting furnace. A JITL-TWRELM soft sensor modeling method is proposed. In this method, both the sample similarities and the variable correlations are considered in RELM to deal with the differences between samples and variables. Each modeling sample is assigned different weights according to the similarity calculation, and each dimension of the sample is also assigned a corresponding weight according to the correlation analysis, which improves the accuracy of the modeling compared with the original RELM. Furthermore, a weighted similarity measurement criterion is proposed for JITL to select similar samples for local modeling. Compared with the original JITL strategy, more similar modeling samples are selected for each query sample, enhancing the accuracy and reliability of the local modeling dataset. The flexibility and effectiveness of JITL-TWRELM were validated through the industrial aluminum smelting process. The industrial applications show that the proposed method can effectively deal with the nonlinear and time-varying problems in the regenerative aluminum smelting process and achieve a higher accuracy of temperature prediction compared with the other five methods.
At present, the model must be updated once for every query sample, even though adjacent query samples often do not require such frequent updates. Selectively updating the model would improve modeling efficiency; developing such a selective update strategy will therefore be the focus of future work.
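One plausible form of such a strategy, sketched here purely as an illustration (the trigger condition and threshold are hypothetical, not part of the paper), is to rebuild the local model only when the new query drifts sufficiently far from the query that produced the current model:

```python
import numpy as np

class SelectiveUpdater:
    """Hypothetical selective-update wrapper: reuse the current local
    model while successive queries stay close together, and trigger
    the expensive JITL rebuild only on sufficient drift."""

    def __init__(self, build_model, threshold=0.5):
        self.build_model = build_model   # callable: x_q -> prediction function
        self.threshold = threshold       # drift distance that forces a rebuild
        self._last_query = None
        self._model = None
        self.rebuilds = 0                # counter, for inspecting the savings

    def predict(self, x_q):
        x_q = np.asarray(x_q, dtype=float)
        if (self._model is None or
                np.linalg.norm(x_q - self._last_query) > self.threshold):
            self._model = self.build_model(x_q)   # expensive JITL modeling step
            self._last_query = x_q
            self.rebuilds += 1
        return self._model(x_q)
```

The trade-off is between modeling cost and accuracy: a larger threshold saves more rebuilds but lets the local model grow stale.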

Author Contributions

Data curation, Y.L.; methodology, J.D.; supervision, J.D.; validation, X.C.; writing—original draft, X.C.; writing—review and editing, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Figure 1. The structure of ELM.
Figure 2. (a) An industrial regenerative aluminum smelting furnace; (b) the internal structure of the regenerative aluminum smelting furnace.
Figure 3. The flowchart of JITL-TWRELM model.
Figure 4. The detailed prediction results of six methods on D1; (a) method 1; (b) method 2; (c) method 3; (d) method 4; (e) method 5; (f) method 6.
Figure 5. The detailed prediction results of six methods on D2; (a) method 1; (b) method 2; (c) method 3; (d) method 4; (e) method 5; (f) method 6.
Table 1. Definition of symbols in this paper.
Symbol | Definition
x_n, t_n | the nth historical input and output variable vectors
β_i | the output weight of the ith hidden layer unit
β, β_S, β_V, β_t | the output weight vectors in RELM, SWRELM, VWRELM, and JITL-TWRELM
T | the output vector of RELM
ω_i, b_i | the input weight and bias connecting the input layer and the ith hidden layer unit
t_j, t_q^t | the output corresponding to x_j; the output of the query sample in JITL-TWRELM
N | the number of training samples
C | the regularization coefficient
ξ | the training error vector
H, H_V, H_t | the hidden layer output matrices in RELM, VWRELM, and JITL-TWRELM
Ω_sn, Ω_s, Ω_s^t | the sample weight of the nth sample; the sample weighted matrix; the sample weighted matrix in JITL-TWRELM
λ | the Lagrange multiplier vector
ρ | the Pearson correlation coefficient
E(x), E(t) | the expectations of the single input variable and the output variable
v_i | the contribution of each variable
V | the variable contribution matrix
x_n^v, x_n^w | the variable weighted input sample; the variable weighted local modeling sample
d_on, d_on^w, d_on^tw | the original Euclidean distance; the weighted Euclidean distance; the weighted Euclidean distance in JITL-TWRELM
x_q, x_q^w | the query sample; the variable weighted query sample in JITL-TWRELM
Ω_v, Ω_v^l, Ω_v^g | the correlation coefficient matrix; the local correlation coefficient matrix; the global correlation coefficient matrix
φ | the adjusted parameter
X, X^w | the local modeling sample matrix; the variable weighted local modeling sample matrix in JITL-TWRELM
Table 2. Shortcomings of the three methods.
Method | Shortcomings
RELM | Neither sample similarities nor variable correlations are considered, and the model cannot be updated in real time.
SWRELM | Only the sample similarities are considered, not the variable correlations, and the model cannot be updated in real time.
VWRELM | Only the variable correlations are considered, not the sample similarities, and the model cannot be updated in real time.
Table 3. Input variables for the model in the aluminum smelting process.
Input | Variable
1 | Material temperature
2 | Furnace pressure
3 | 12# combustion airflow
4 | 12# combustion air pressure difference
5 | 34# combustion airflow
6 | 34# combustion air temperature
7 | 34# combustion air pressure difference
8 | 34# gas air-fuel ratio
9 | B1# exhaust gas temperature
10 | B2# exhaust gas temperature
11 | B3# exhaust gas temperature
12 | B4# combustion air temperature
Table 4. Sensor measurement range and error.
Sensor Type | Measurement Range | Measurement Error
Pressure meter | 0–15,000 Pa | 1%
Flow meter | 0–15 m³/h | 1.5%
Thermocouple | 0–1300 °C | 1%
Table 5. Comparison of the modeling accuracy with C.
C | MAE | RMSE | MAPE | R²
140 | 15.0884 | 18.6443 | 0.020354 | 0.98666
150 | 14.7273 | 17.9456 | 0.019897 | 0.98764
160 | 15.3797 | 19.5019 | 0.020879 | 0.98541
170 | 15.5217 | 20.4519 | 0.021023 | 0.98395
180 | 15.8944 | 20.1194 | 0.021629 | 0.98447
190 | 16.0543 | 19.8318 | 0.021839 | 0.98491
200 | 17.1959 | 22.2015 | 0.023265 | 0.98109
Table 6. The indices of the six methods of two groups of testing datasets.
Dataset | Method | MAE | RMSE | MAPE | R²
D1 | JITL-RELM | 43.0278 | 52.4279 | 0.062519 | 0.89453
D1 | JITL-SWRELM | 38.1265 | 47.9605 | 0.052589 | 0.91174
D1 | WJITL-RELM | 32.7444 | 42.7606 | 0.044443 | 0.92984
D1 | JITL-VWRELM | 38.0149 | 46.1055 | 0.054197 | 0.91843
D1 | JITL-DWRELM | 20.7980 | 24.5347 | 0.029768 | 0.97690
D1 | JITL-TWRELM | 14.7273 | 17.9456 | 0.019897 | 0.98764
D2 | JITL-RELM | 26.7981 | 36.1509 | 0.035878 | 0.90223
D2 | JITL-SWRELM | 26.3624 | 33.526 | 0.03657 | 0.91174
D2 | WJITL-RELM | 27.4121 | 34.5595 | 0.044443 | 0.91065
D2 | JITL-VWRELM | 24.3605 | 31.4511 | 0.032461 | 0.92600
D2 | JITL-DWRELM | 16.2472 | 22.2734 | 0.021632 | 0.96289
D2 | JITL-TWRELM | 14.8733 | 18.5463 | 0.019646 | 0.97427