Article

Multi-Output Soft Sensor with a Multivariate Filter That Predicts Errors Applied to an Industrial Reactive Distillation Process

1
Process Control Laboratory, Institute of Automation and Control Process FEB RAS, 5 Radio Str., Vladivostok 690041, Russia
2
Department of Automation Engineering, Technical University of Ilmenau, 99084 Ilmenau, Germany
3
Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(16), 1947; https://doi.org/10.3390/math9161947
Submission received: 20 June 2021 / Revised: 11 August 2021 / Accepted: 13 August 2021 / Published: 15 August 2021

Abstract

The paper deals with the problem of developing a multi-output soft sensor for the industrial reactive distillation process of methyl tert-butyl ether production. Unlike the existing soft sensor approaches, this paper proposes using a soft sensor with filters to predict model errors, which are then taken into account as corrections in the final predictions of outputs. The decomposition of the problem of optimal estimation of time delays is proposed for each input of the soft sensor. Using the proposed approach to predict the concentrations of methyl sec-butyl ether, methanol, and the sum of dimers and trimers of isobutylene in the output product in a reactive distillation column was shown to improve the results by 32%, 67%, and 9.5%, respectively.

1. Introduction

As the size and complexity of industrial systems increase, there is a growing need to accurately measure most process variables. Unfortunately, not all variables can be accurately measured using online hard sensors. For certain variables, such as concentration or density, accurate measurements can only be obtained by manually taking samples and analyzing them in a laboratory. One solution to this problem is the development of soft sensors, which use the easy-to-measure variables in a model that predicts the hard-to-measure variables [1].
All soft sensor systems consist of a process model that takes the easy-to-measure variables and provides an estimate of the hard-to-measure variables. These models can be constructed using methods ranging from linear regression to principal component analysis and support vector machines. Although the main focus has been on the development of the soft sensor models [2,3,4,5], advanced soft sensor systems also have a bias update term that can take any slowly sampled information to update the soft sensor prediction [1]. This bias update term is normally designed as some function of the difference between the predicted and measured values [6]. The measured values are often sampled very slowly and with considerable time delay, so between updates the most recently available bias value is used. When such a system is properly designed, it can provide good tracking of the process, i.e., the predicted and measured values are close to each other.
Recently, it has been suggested that instead of only using the available slowly sampled data for updating the bias term, it should be possible to also model the historical errors and use them to predict the future errors [7]. It has been shown that such an approach can improve the overall performance of the soft sensor system. However, there still remain issues with how best to model and implement this predictive bias update term. Furthermore, there are issues with incorporating time delays into this approach since they will greatly increase the size of the required search space.
Therefore, this paper will examine the development of a predictive bias update term for a nonlinear system using dimension reduction. The proposed approach will be tested using data from an industrial reactive distillation column that produces methyl tert-butyl ether (MTBE).

2. Background

Consider the soft sensor system shown in Figure 1, where $u_t$ is the input, $y_t$ is the measured (true) output, $\hat{y}_{m,t}$ is the predicted soft sensor value, $\hat{y}_{\alpha,t}$ and $\hat{y}_{\beta,t}$ are intermediate soft sensor values, $G_p$ is the true process, $\hat{G}_p$ is the soft sensor process model, and $G_B$ is the bias update term. The purpose of the bias update term is to take the information from the measured values and correct the output of the soft sensor system. The need for this correction comes primarily from the unknown disturbances and the inherent plant-model mismatch.
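The conventional bias update with a held value between laboratory samples can be illustrated with a minimal sketch. It assumes a purely additive bias equal to the latest available error, which is only one possible choice of $G_B$; the function name `bias_corrected_predictions` and the toy numbers are hypothetical and not taken from the plant.

```python
import numpy as np

def bias_corrected_predictions(model_pred, lab_values, lab_available):
    """Additive bias update that holds the last bias between lab samples.

    model_pred    : raw soft sensor predictions, one per time step
    lab_values    : laboratory measurements (read only where available)
    lab_available : boolean mask, True where a lab value arrived
    """
    bias = 0.0
    corrected = np.empty(len(model_pred), dtype=float)
    for t in range(len(model_pred)):
        # Correct with the most recent bias known before time t.
        corrected[t] = model_pred[t] + bias
        if lab_available[t]:
            # A new lab value arrived: refresh the bias from the latest error.
            bias = lab_values[t] - model_pred[t]
    return corrected

# Toy data (assumed): constant model output, two lab samples.
pred = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
lab = np.array([1.5, 0.0, 0.0, 1.2, 0.0])
mask = np.array([True, False, False, True, False])
corrected = bias_corrected_predictions(pred, lab, mask)
```

Between the two laboratory samples the correction stays frozen, which is the behavior that a predictive bias term tries to improve on.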
Another approach to this problem is to re-arrange the bias update term so that it contains a predictive model that can predict the errors between the measured and predicted values. This re-arrangement is shown in Figure 2, where the predicted value from the soft sensor is corrected based on the modeled errors of the system. The question becomes how to design this model so that the best predictions can be obtained.
For prediction of time series, the Box-Jenkins methodology is traditionally used, according to which the time series model is found in the class of autoregressive-moving average (ARMA) models, i.e., is considered a rational algebraic function of the backward shift operator. The flexibility of the ARMA class makes it possible to find parsimonious models, i.e., the adequacy of the evaluated model is achieved with a small number of estimated parameters. Since this property is especially important for empirical models, the Box-Jenkins methodology is widely used to solve various practical problems. This approach is adopted in this paper.
In industrial processes, where it is desired to implement the model on programmable logic controller (PLC) units, the complexity of the model $\hat{G}_p$ can be an issue. Therefore, this paper will consider a simple model for $\hat{G}_p$ of the form
$$y_t = b_0 + b x_t + e_t \tag{1}$$
where $b_0$ and $b$ are the parameters to be estimated and $x_t$ is the input(s). Model (1) can be improved by taking into account possible delays of the output variables relative to the inputs. Consider the following model for a multi-output soft sensor
$$y_{t,m} = b_m u_m(t, \tau_m) + e_{t,m} \tag{2}$$
where $t = 1, 2, \ldots, n$ and $m = 1, 2, 3$ (the number of outputs is set by the industrial production team and reflects the key quality indices of the MTBE product). The vector $b_m = (b_{m,1}, b_{m,2}, \ldots, b_{m,10})$ is a row vector of unknown coefficients; $\tau_m = (\tau_{m,1}, \tau_{m,2}, \ldots, \tau_{m,10})$ is a row vector of unknown time delays; $u_m(t, \tau_m) = (u_{t,m,1}, u_{t,m,2}, \ldots, u_{t,m,10})^T$; and $u_{t,m,k}$ is the measurement of the $x_k$ value at time $t - \tau_{m,k}$ with $k = 1, 2, \ldots, 10$. Please note that it has been assumed here that the maximal time delay is 10 samples, which is justified from the point of view of the industrial process dynamics. However, it can easily be extended to arbitrary values.
Solving model (2) by minimizing the mean squared error (MSE) gives the estimates $\hat{b}_m$ and $\hat{\tau}_m$ of the unknown parameters. The MSE depends not only on the coefficients $b_m$, but also on the delays $\tau_m$, i.e.,
$$D_{e,m}(b_m, \tau_m) = \frac{1}{n}\sum_{t=1}^{n}\left\{y_{t,m} - b_m u_m(t, \tau_m)\right\}^2, \quad m = 1, 2, 3 \tag{3}$$
Thus,
$$(\hat{b}_m, \hat{\tau}_m) = \arg\min_{b_m,\,\tau_m} D_{e,m}(b_m, \tau_m). \tag{4}$$
Please note that if $D_{e,m}(b_m^*, \tau_m^*) = \min_{b_m,\,\tau_m} D_{e,m}(b_m, \tau_m)$, then $D_{e,m}(b_m^*, \tau_m^*) = \min_{b_m} D_{e,m}(b_m, \tau_m^*)$.
Consequently,
$$\min_{b_m,\,\tau_m} D_{e,m}(b_m, \tau_m) = \min_{\tau_m}\left\{\min_{b_m} D_{e,m}(b_m, \tau_m)\right\} = \min_{\tau_m} D_{e,m}(\hat{b}_m, \tau_m) \tag{5}$$
Furthermore, the estimates $\hat{b}_m$ are found using standard regression analysis, which gives
$$\hat{b}_m = \left\{(U_m^T U_m)^{-1} U_m^T Y_m\right\}^T, \quad m = 1, 2, 3 \tag{6}$$
where $Y_m$ is the $m$-th column of the matrix $Y$ and $U_m$ is a matrix with dimension $n \times 10$ whose $t$-th row is $u_m(t, \tau_m)^T$.
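The nested minimization in Equation (5) can be sketched as an outer search over candidate delays with an inner least squares fit for each candidate. The sketch below assumes integer delays, an exhaustive outer search, and a synthetic two-input example; the helper names `lagged_matrix` and `fit_with_delays` are hypothetical. An exhaustive search is only feasible for a handful of inputs, since the number of candidates grows exponentially with the number of inputs.

```python
import itertools
import numpy as np

def lagged_matrix(x, delays, max_delay):
    """Build U whose row for time t holds x[t - delay_k, k] for each input k."""
    n = x.shape[0]
    rows = np.arange(max_delay, n)
    return np.column_stack([x[rows - d, k] for k, d in enumerate(delays)])

def fit_with_delays(x, y, max_delay):
    """Outer search over integer delays, inner OLS fit as in Equation (6)."""
    best_mse, best_delays, best_b = np.inf, None, None
    y_cut = y[max_delay:]
    for delays in itertools.product(range(max_delay + 1), repeat=x.shape[1]):
        U = lagged_matrix(x, delays, max_delay)
        b, *_ = np.linalg.lstsq(U, y_cut, rcond=None)
        mse = np.mean((y_cut - U @ b) ** 2)
        if mse < best_mse:
            best_mse, best_delays, best_b = mse, delays, b
    return best_mse, best_delays, best_b

# Synthetic example (assumed): two inputs with true delays 1 and 3.
rng = np.random.default_rng(0)
x = rng.standard_normal((300, 2))
y = np.zeros(300)
y[3:] = 2.0 * x[2:-1, 0] - 1.0 * x[:-3, 1]

mse, delays, b = fit_with_delays(x, y, max_delay=3)
```

For noise-free data such as this, the search recovers the delays and coefficients used to generate the output.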
Since all variables are measured at discrete moments in time, gradient descent methods cannot be directly applied to minimize the objective function $D_{e,m}(\hat{b}_m, \tau_m)$ with respect to $\tau_m$. However, this difficulty can be avoided by calculating $D_{e,m}$ for arbitrary values of the elements of the vector $\tau_m$ by interpolating between the nearby nodes of the discrete grid. Interpolation in a search space of large dimension is a difficult problem, so properties of the algorithm such as transparency and relative simplicity come to the fore. Therefore, polynomial interpolation is preferred in this situation.
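As a one-dimensional illustration of interpolating the objective between grid nodes, the sketch below fits a parabola through the objective values at three neighboring integer delays and returns the location of its minimum. The degree of the polynomial and the treatment of several delays at once are not specified in the text, so the quadratic, single-delay form used here is an assumption, and `parabolic_minimum` is a hypothetical helper.

```python
def parabolic_minimum(d, f_minus, f_zero, f_plus):
    """Minimum of the parabola through (d-1, f_minus), (d, f_zero), (d+1, f_plus)."""
    curvature = f_minus - 2.0 * f_zero + f_plus
    if curvature <= 0.0:
        # The parabola has no interior minimum: keep the grid node.
        return float(d)
    return d + 0.5 * (f_minus - f_plus) / curvature

# Objective values sampled from (tau - 2.3)^2 + 1 at tau = 1, 2, 3 (assumed example).
tau_star = parabolic_minimum(2, 2.69, 1.09, 1.49)
```

For an exactly quadratic objective, as in this example, the fractional minimizer is recovered exactly; for a real objective the result is only an approximation near the best grid node.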

2.1. Error Modeling

If the $e_{t,m}$ error were known at time $t - 1$, then using Equation (2), it would be possible to predict the $y_{t,m}$ variable with absolute accuracy. Unfortunately, the $e_{t,m}$ error is not known in advance, but it can be predicted using any statistical patterns found in the sequence $e_{1,m}, e_{2,m}, \ldots$. This error prediction can be used as a correction to model (2) as shown in Figure 2, therefore improving the prediction accuracy of the $y_{t,m}$ output variable. To evaluate a predictive model for the sequence $e_{1,m}, e_{2,m}, \ldots$, let us consider the class of ARMA models. Let us introduce the predicted process as the output of an invertible linear filter, called a shaping filter, driven by white noise, i.e., a process with a constant spectral density. In this case, the transfer function of the shaping filter is considered a rational algebraic function of the backward shift operator, i.e.,
$$e_t = \frac{\prod_{l=1}^{N_n}\left(1 - H_l q^{-1}\right)}{\prod_{k=1}^{N_d}\left(1 - G_k q^{-1}\right)}\,\varepsilon_t \tag{7}$$
where $\varepsilon_t$ and $e_t$ are values of the input and output processes of the shaping filter at time $t$; $N_n$ is the order of the moving average; $N_d$ is the order of the autoregressive component; $H_l$, $G_k$ are constants (generally speaking, complex-valued); and $q^{-1}$ is the backshift operator. The stationarity and invertibility conditions, which are necessary to predict the $e_t$ process, are [8]
$$|G_k| < 1, \quad k = 1, \ldots, N_d; \qquad |H_l| < 1, \quad l = 1, \ldots, N_n \tag{8}$$
The flexibility of the ARMA class provides the possibility of finding parsimonious models, i.e., the adequacy of the constructed model is achieved with a relatively small number of estimated parameters. Since this property is especially important for empirical models, the models with the structure given in Equation (7) and their variants are widely used for solving practical problems.
The filter for predicting the $e_t$ process can be found using the prediction error method (PEM) [9]. Expanding the brackets in Equation (7) gives
$$e_t = \frac{1 - \theta_1 q^{-1} - \cdots - \theta_{N_n} q^{-N_n}}{1 - \eta_1 q^{-1} - \cdots - \eta_{N_d} q^{-N_d}}\,\varepsilon_t \tag{9}$$
where $\theta_l$ and $\eta_k$ are the model parameters. It is assumed that the polynomials in the numerator and denominator have no common roots, since otherwise it would be possible to cancel the common factors in the numerator and denominator of Equation (7).
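The link between the expanded coefficients of Equation (9) and the constants $G_k$, $H_l$ of Equation (7) can be checked numerically. Multiplying $1 - c_1 q^{-1} - \cdots - c_N q^{-N}$ by $q^N$ shows that the constants are the roots of $q^N - c_1 q^{N-1} - \cdots - c_N$. The sketch below uses `numpy.roots` for this; the helper names and the example coefficients are assumptions made for illustration.

```python
import numpy as np

def filter_constants(coeffs):
    """Constants c_i in prod(1 - c_i q^-1) for 1 - c1 q^-1 - ... - cN q^-N."""
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.size == 0:
        return np.array([])
    return np.roots(np.concatenate(([1.0], -coeffs)))

def is_stationary_and_invertible(theta, eta):
    """Check the conditions of Equation (8) for the model of Equation (9)."""
    G = filter_constants(eta)    # autoregressive (denominator) constants
    H = filter_constants(theta)  # moving-average (numerator) constants
    stationary = bool(np.all(np.abs(G) < 1.0))
    invertible = bool(np.all(np.abs(H) < 1.0))
    return stationary, invertible

# Assumed example: AR part with complex constants (1 +/- j)/2, MA constant 2.
result = is_stationary_and_invertible(theta=[2.0], eta=[1.0, -0.5])
```

In this example the autoregressive constants have modulus of about 0.71, so the process is stationary, while the moving-average constant lies outside the unit circle, so the filter is not invertible.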
The PEM function finds the parameter values that minimize the predictive MSE of the $e_t$ process for given polynomial orders $(N_n, N_d)$ and initial estimates of the parameters $\theta_l$ and $\eta_k$. Suitable orders of the polynomials can be chosen based on sample estimates of the spectral density of the considered process. Recall that the frequency response of the shaping filter is the value of Equation (7) on the circle of unit radius centered on the origin, and the spectral density $S(\omega)$ of the output process $e_t$ is equal to the product of the variance of the input process and the squared modulus of the frequency response, i.e., [10]
$$S(\omega) = \sigma_\varepsilon^2\,\frac{\prod_{l=1}^{N_n}\left(1 - H_l e^{-j\omega}\right)}{\prod_{k=1}^{N_d}\left(1 - G_k e^{-j\omega}\right)} \cdot \frac{\prod_{l=1}^{N_n}\left(1 - \bar{H}_l e^{j\omega}\right)}{\prod_{k=1}^{N_d}\left(1 - \bar{G}_k e^{j\omega}\right)}, \tag{10}$$
where $\sigma_\varepsilon^2$ is the variance of the random process $\varepsilon_t$, and $\bar{H}_l$ and $\bar{G}_k$ are the complex conjugates of the constants $H_l$ and $G_k$. Furthermore, since we desire that our filter be invertible, it follows that for the model
$$\varepsilon_t = \frac{\prod_{k=1}^{N_d}\left(1 - G_k q^{-1}\right)}{\prod_{l=1}^{N_n}\left(1 - H_l q^{-1}\right)}\,e_t \tag{11}$$
the $e_t$ process is invertible if the absolute values of all the $H_l$ constants are less than one. Similarly, if the absolute values of all the $G_k$ constants are less than one, then the $e_t$ process is stationary [8]. Thus, although multiple processes can have the same spectral density, there is only one that is both stationary and invertible.
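The statement that several processes share one spectral density can be illustrated numerically. The sketch below evaluates Equation (10) in the expanded-coefficient form of Equation (9) and compares two first-order moving-average models, one invertible and one not; the function name `arma_spectrum` and the chosen coefficients are assumptions.

```python
import numpy as np

def arma_spectrum(omega, theta, eta, sigma2=1.0):
    """Spectral density sigma2 * |numerator|^2 / |denominator|^2 at omega."""
    omega = np.asarray(omega, dtype=float)
    z = np.exp(-1j * omega)  # value of q^-1 on the unit circle
    num = 1.0 - sum(c * z ** (l + 1) for l, c in enumerate(theta))
    den = 1.0 - sum(c * z ** (k + 1) for k, c in enumerate(eta))
    return sigma2 * np.abs(num) ** 2 / np.abs(den) ** 2

omega = np.linspace(0.0, np.pi, 50)
# Invertible MA(1): H = 0.5 with unit input variance.
s_invertible = arma_spectrum(omega, theta=[0.5], eta=[])
# Non-invertible MA(1): H = 2 with input variance 0.25.
s_mirrored = arma_spectrum(omega, theta=[2.0], eta=[], sigma2=0.25)
# AR(1) with G = 0.5, evaluated at omega = 0 and omega = pi.
s_ar = arma_spectrum([0.0, np.pi], theta=[], eta=[0.5])
```

The two moving-average spectra coincide at every frequency, so the spectrum alone cannot distinguish them; the invertibility condition selects the model with $|H| < 1$.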
Once the general model has been obtained, we can rewrite it as an infinite impulse response model, i.e.,
$$e_t = \varepsilon_t + \sum_{k=1}^{\infty} \psi_k \varepsilon_{t-k} \tag{12}$$
where $\psi_k$ are the impulse response coefficients. Since the series converges for a stationary model [8], only a finite number of terms in Equation (12) is needed in practice. Furthermore, we note that
$$e_{t-i} = \varepsilon_{t-i} + \sum_{k=1}^{\infty} \psi_k \varepsilon_{t-i-k} \tag{13}$$
which implies that for any positive $i$ the random variables $\varepsilon_t$ and $e_{t-i}$ are uncorrelated (since the process $\varepsilon_t$ is white noise). Therefore, successively multiplying both sides of Equation (12) by the values of the corresponding process at delays $i$ and taking expectations, we obtain equations for finding the initial estimates of the parameters that involve the covariances of the errors for different lags [10]. Since the true covariances are not known, they need to be replaced by the sample estimates. This method of estimating the coefficients does not lead to large errors as long as the absolute values of the constants of model (7) are not too close to the boundary of the unit circle centered on the origin. Thus, it is possible to design the required filter.

2.2. Filter Design

Let $e_t = (e_{t,1}, e_{t,2}, \ldots, e_{t,N})^T$ be an $N$-dimensional stationary process of the soft sensor's errors whose shaping filter transfer matrix is $F_0(q^{-1})$, i.e.,
$$e_t = F_0(q^{-1})\,\varepsilon_t \tag{14}$$
where $q^{-1}$ is the backshift operator; $\varepsilon_t = (\varepsilon_{t,1}, \varepsilon_{t,2}, \ldots, \varepsilon_{t,N})^T$ is an $N$-dimensional vector of white noise; and $F_0(q^{-1}) = [f_{km}(q^{-1})]$ is an $N \times N$ matrix function whose entries $f_{km}(q^{-1})$ are the rational transfer functions from $\varepsilon_{t,m}$ to $e_{t,k}$. Thus, it is desired to construct the filter that will predict $e_{t+1}$ given the past values.
Let $P(q^{-1})$ be the desired one-step-ahead predictor transfer matrix, $\hat{e}_{t+1} = P(q^{-1})e_t$ the prediction of the vector $e_{t+1}$ at time $t$, and $\tilde{\varepsilon}_{t+1} = e_{t+1} - \hat{e}_{t+1}$ the error of the prediction obtained with the aid of the filter $P(q^{-1})$. Then
$$\tilde{\varepsilon}_t = e_t - \hat{e}_t = e_t - q^{-1}\hat{e}_{t+1} = e_t - q^{-1}P(q^{-1})e_t = \left[I_N - q^{-1}P(q^{-1})\right]e_t \tag{15}$$
where $I_N$ is the identity matrix of order $N$. Consequently, the filter in the square brackets transforms the initial series into the prediction error series. If the random vector $\tilde{\varepsilon}_t$ includes components correlated with those of the vector $\tilde{\varepsilon}_{t-j}$ at some $j > 0$, we can predict the errors $\tilde{\varepsilon}_t$ using the known previous errors. Using those predictions as corrections to the $\hat{e}_t$ that were obtained, we could improve the accuracy of the predictions. Hence, in order to maximize the predictor accuracy, we must find a $P(q^{-1})$ such that the errors $\tilde{\varepsilon}_t$ are uncorrelated with the errors $\tilde{\varepsilon}_{t-j}$ at any $j > 0$, with some nonzero correlation between the components of $\tilde{\varepsilon}_t$ (i.e., at $j = 0$) being admissible. In other words, the time series $\tilde{\varepsilon}_t$ must be $N$-dimensional white noise. Consequently, $I_N - q^{-1}P(q^{-1}) = F_0^{-1}(q^{-1})$, from which it follows that $P(q^{-1}) = q\left[I_N - F_0^{-1}(q^{-1})\right]$.
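As a worked illustration of the relation $P(q^{-1}) = q\left[I_N - F_0^{-1}(q^{-1})\right]$, consider the scalar case $N = 1$ with a first-order autoregressive shaping filter. The coefficient $\eta$ with $|\eta| < 1$ is an assumed example value and is not taken from the plant data.

$$
\begin{aligned}
F_0(q^{-1}) &= \frac{1}{1 - \eta q^{-1}}, \qquad F_0^{-1}(q^{-1}) = 1 - \eta q^{-1},\\
P(q^{-1}) &= q\left[1 - \left(1 - \eta q^{-1}\right)\right] = q\,\eta\,q^{-1} = \eta,\\
\hat{e}_{t+1} &= \eta\,e_t, \qquad \tilde{\varepsilon}_t = e_t - \eta\,e_{t-1} = \varepsilon_t .
\end{aligned}
$$

In this case the predictor is a constant gain, and the prediction error coincides with the white noise driving the shaping filter, as the derivation requires.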
Thus, the predictor transfer matrix $P(q^{-1})$ can be expressed through the transfer matrix of the shaping filter $F_0(q^{-1})$. The matrix $F_0(q^{-1})$ can be found from
$$G(q^{-1}) = F_0(q^{-1})\,F_0^T(q), \tag{16}$$
where $G(q^{-1}) = [g_{km}(q^{-1})]$ and $g_{km}(q^{-1})$ is the $q$-transform of the statistical estimate of the cross-covariance function of the time series $e_{t,k}$ and $e_{t,m}$ (in particular, when $m = k$, $g_{mm}$ is a $q$-transform of the sample covariance function, i.e., the autocovariance generating function (AGF) of the time series $e_{t,m}$).
The algorithm for finding $F_0(q^{-1})$ is simplified by decomposing it into $N$ stages. At the $k$th stage, a shaping filter $F_k(q^{-1})$ of the $k$-dimensional process $(e_{t,1}, e_{t,2}, \ldots, e_{t,k})^T$ is found. At this stage, the filter $F_{k-1}(q^{-1})$, found at the $(k-1)$th stage, is used to transform the matrix $G_k(q^{-1}) = F_k(q^{-1})F_k^T(q)$ so that its transform contains nonzero elements in only one row, one column, and on the main diagonal. This technique substantially simplifies the procedure of spectral factorization (finding the matrix function $F_k(q^{-1})$) [11].
The proposed approach allows us to identify the vector time series transfer matrix without resorting to a complicated state-space representation. This advantage is used to obtain an adequate model with relatively few estimated parameters for the initial time series shaping filter $F_0(q^{-1})$. Simultaneously, the model for the transfer matrix of the inverse filter $F_0^{-1}(q^{-1})$, which transforms the initial time series into white noise, is also found.
The algorithm for constructing both the shaping filter $F_0(q^{-1})$ and its inverse $F_0^{-1}(q^{-1})$ is described in [11]. Based on this algorithm, the sequence of prediction errors $\tilde{\varepsilon}_t$ should be $N$-dimensional white noise. In practice, however, the true characteristics of the original process are not known; only their estimates are available, and these contain inevitable statistical errors. As a result, the properties of the sequence $\tilde{\varepsilon}_t$ can differ significantly from those of white noise. Thus, to verify the optimality of the resulting model $P(q^{-1})$ of the predictive filter, a criterion is needed to test the hypothesis that the process $\tilde{\varepsilon}_t$ is $N$-dimensional white noise. To construct such a criterion, we can transform the process $\tilde{\varepsilon}_t$ in such a way that its spectral density matrix is diagonal. Such a transformation is achieved by means of a rotation of axes in the $N$-dimensional variable space $\tilde{\varepsilon}_1, \tilde{\varepsilon}_2, \ldots, \tilde{\varepsilon}_N$ [12]. Since the variances of these variables can be made equal to each other by normalization, without loss of generality, we suppose that the spectral density matrix of the noise $\tilde{\varepsilon}_t$ is the $N \times N$ identity matrix $I_N$.
Consider a univariate sequence $\xi_k = \tilde{\varepsilon}_{t-j,m}$, where $k = jN + m$. Please note that each pair $(j, m)$ determines one $k$ and each $k$ determines one pair $(j, m)$. Consequently, $\tilde{\varepsilon}_t$ is multivariate white noise if and only if $\xi_k$ is univariate white noise. It is known that the spectral density of univariate white noise is constant [8,13]. Thus, testing the hypothesis that $\tilde{\varepsilon}_t$ is multivariate white noise is reduced to testing the hypothesis on the constancy of the spectral density of a univariate sequence. This hypothesis can be tested using Kolmogorov's criterion [14].
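One classical way to test the constancy of a spectral density with a Kolmogorov-type bound is the cumulative periodogram check described in Box-Jenkins style texts. The sketch below is offered as an illustration of the idea and is not necessarily the exact procedure used by the authors. It assumes the asymptotic 5% constant 1.36, interleaves the components by a row-major reshape (the ordering in time is reversed relative to $\xi_k$ above, which does not affect whiteness), and uses the hypothetical name `cumulative_periodogram_test`.

```python
import numpy as np

def cumulative_periodogram_test(series, crit=1.36):
    """Kolmogorov-type check that the normalized cumulative periodogram is linear."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = x.size
    m = (n - 1) // 2
    # Periodogram at the Fourier frequencies strictly between 0 and pi.
    power = np.abs(np.fft.rfft(x)) ** 2
    power = power[1:m + 1]
    cumulative = np.cumsum(power) / np.sum(power)
    uniform = np.arange(1, m + 1) / m
    stat = float(np.max(np.abs(cumulative - uniform)))
    return stat, bool(stat <= crit / np.sqrt(m))

rng = np.random.default_rng(0)
white = rng.standard_normal((1000, 2))  # two-dimensional white noise (assumed data)
xi = white.reshape(-1)                  # xi[j * N + m] = white[j, m]
white_stat, white_ok = cumulative_periodogram_test(xi)

# A strongly autocorrelated series for comparison.
shocks = rng.standard_normal(500)
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.9 * ar[t - 1] + shocks[t]
ar_stat, ar_ok = cumulative_periodogram_test(ar)
```

For the autocorrelated series the statistic is far above the bound, because most of the power sits at low frequencies, whereas for the interleaved white noise it stays small.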
Please note that only a time series containing prediction errors is used as the initial information for constructing a predictor with the proposed approach. Information about the model with which the predictions were obtained is not used. Therefore, this approach is applicable to any predictive model that involves errors, regardless of the specific properties of the model used.

2.3. Summary of the Proposed Approach

Thus, the proposed procedure for developing the model can be summarized as follows:
Step 1: Create an initial sample $u_t$, $y_t$, $t = 1, 2, \ldots, K$. If the plant is already functioning, then the initial sample consists of the historical values of $u_t$, $y_t$. Otherwise, the initial sample is formed during the trial period of the plant. The initial sample is divided into training and testing datasets.
Step 2: Based on the data included in the training sample, the coefficients and delays of the model given by Equation (2) are estimated via solving optimization problem (4).
Step 3: Based on the data included in the training sample, the errors for the model and the corresponding sample spectrum of errors are calculated.
Step 4: Based on the sample spectrum, the order of the ARMA model is selected in order to predict the unknown future error given the known current and past errors.
Step 5: The least squares method is used to find the values of the ARMA model parameters.
Step 6: The ARMA model obtained is used as the predictive filter F(q−1) in the feedback loop of the compensator (bias update term) as shown in Figure 2.
Step 7: If the resulting soft sensor improves the accuracy of the prediction for the test sample then it can be recommended for practical use.
Please note that the obtained predictive filter model can be recommended for further use on the same plant from whose data it was built. The approach itself can be expected to succeed if the sequence of errors of the plant is a stationary (or nearly stationary) process. In addition, the applicability of this approach can be extended to plants for whose errors an invertible transformation can be found that turns the sequence of errors into a stationary process. The quality of the developed model should be checked on a test sample that was not used for model training.
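The steps above can be sketched end to end on synthetic data. The sketch makes several simplifying assumptions that are not part of the procedure itself: a single output, no input delays, a purely autoregressive error model of fixed order 1 instead of an order chosen from the sample spectrum, and a measured error that is available at every step. All names and numbers are hypothetical.

```python
import numpy as np

def fit_ar_least_squares(e, order):
    """Least squares fit of e[t] = sum_k eta[k] * e[t-k-1] + eps[t]."""
    regressors = np.column_stack(
        [e[order - k - 1:len(e) - k - 1] for k in range(order)]
    )
    eta, *_ = np.linalg.lstsq(regressors, e[order:], rcond=None)
    return eta

def predict_errors(e, eta):
    """One-step-ahead predictions e_hat[t] from errors known up to t-1."""
    order = len(eta)
    e_hat = np.zeros_like(e)
    for t in range(order, len(e)):
        e_hat[t] = sum(eta[k] * e[t - k - 1] for k in range(order))
    return e_hat

# Step 1: synthetic sample with autocorrelated model errors, split 400/200.
rng = np.random.default_rng(1)
n, split = 600, 400
x = rng.standard_normal((n, 3))
shocks = 0.1 * rng.standard_normal(n)
e_true = np.zeros(n)
for t in range(1, n):
    e_true[t] = 0.8 * e_true[t - 1] + shocks[t]
y = x @ np.array([1.0, -2.0, 0.5]) + e_true

# Step 2: fit the static model on the training part.
b, *_ = np.linalg.lstsq(x[:split], y[:split], rcond=None)
# Step 3: model errors on the training and testing parts.
e_train = y[:split] - x[:split] @ b
e_test = y[split:] - x[split:] @ b
# Steps 4-5: fix the order (assumed to be 1) and fit by least squares.
eta = fit_ar_least_squares(e_train, order=1)
# Steps 6-7: correct the test predictions and compare the MSE.
y_corrected = x[split:] @ b + predict_errors(e_test, eta)
mse_plain = np.mean(e_test ** 2)
mse_corrected = np.mean((y[split:] - y_corrected) ** 2)
```

Because the synthetic errors are strongly autocorrelated, the corrected predictions have a lower test MSE than the uncorrected ones; for errors close to white noise the correction would bring little or no benefit.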

3. Industrial Application of the Proposed Method

Industrial methyl tert-butyl ether (MTBE) production occurs in a reactive distillation unit, as shown in Figure 3. The feed containing isobutylene and methanol (MeOH) enters the column. The distillate (D) is a lean butane-butylene fraction with a certain amount of MeOH. The raffinate is the heavy product MTBE that is withdrawn from the bottom part of the column. Table 1 shows the main process variables for the industrial unit. The goal is to develop a soft sensor for the prediction of the concentrations of methyl sec-butyl ether (MSBE), MeOH, and the sum of dimers and trimers of isobutylene (DIME) in the bottom product MTBE.
The measured values of the output $y_m$ and input $x_k$ variables at time $t$ are denoted as $y_{t,m}$ and $x_{t,k}$, with $m = 1, 2, 3$; $k = 1, 2, \ldots, 10$; and $t = 1, 2, \ldots, n$. The existing measurements may be used to develop a predictive model of the form
$$y_t = b_0 + b x_t + e_t, \quad t = 1, 2, \ldots, n \tag{17}$$
where $y_t = (y_{t,1}, y_{t,2}, y_{t,3})^T$; $x_t = (x_{t,1}, x_{t,2}, \ldots, x_{t,10})^T$; $b$ is a matrix of the model parameters $[b_{mk}]$ of dimension $3 \times 10$; $b_0 = (b_1, b_2, b_3)^T$ is a vector of the constant biases; $e_t = (e_{t,1}, e_{t,2}, e_{t,3})^T$ is a vector of the residuals; and the superscript $T$ denotes the transpose. Since Equation (17) can be rewritten as
$$(y_t - \bar{y}) = b\,(x_t - \bar{x}) + e_t \tag{18}$$
where $\bar{y} = \frac{1}{n}\sum_{t=1}^{n} y_t$ and $\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t$, the expectations of all the elements of the vectors $y_t$, $x_t$, and $e_t$, as well as the bias vector $b_0$, may be considered to be equal to zero without loss of generality.
Although the elements of matrix $b$ are unknown, they are easily estimated using the ordinary least squares (OLS) method, which gives [10]
$$\hat{b} = \left\{(X^T X)^{-1} X^T Y\right\}^T \tag{19}$$
where $X = [x_{t,k}]$; $Y = [y_{t,m}]$; $m = 1, 2, 3$; $k = 1, 2, \ldots, 10$; and $t = 1, 2, \ldots, n$.
For the training sample containing n = 400 measurements, the following estimates were obtained:
$$\bar{x} = (51.8154 \;\; 1.8747 \;\; 52.1154 \;\; 3.0859 \;\; 51.9866 \;\; 0.7580 \;\; 60.7100 \;\; 66.4516 \;\; 136.3077 \;\; 64.5725)^T$$
$$\bar{y} = (0.5440 \;\; 0.1461 \;\; 0.0595)^T$$
$$\hat{b} = \begin{pmatrix}
0.0151 & 0.2383 & 0.0342 & 0.1401 & 0.0476 & 1.7361 & 0.0430 & 0.0012 & 0.1019 & 0.0388\\
0.0173 & 0.0794 & 0.0281 & 0.1191 & 0.0171 & 2.9800 & 0.0093 & 0.0072 & 0.1333 & 0.0353\\
0.0080 & 0.1118 & 0.0061 & 0.0537 & 0.0134 & 0.3490 & 0.0215 & 0.0011 & 0.0467 & 0.0098
\end{pmatrix}.$$
The estimated MSE vector for the model (17) is (0.0094 0.0095 0.0021)T, while the vector of sample estimates of variances of the output variables is (0.0321 0.0184 0.0047)T.
Let $R_m^2$ be a sample estimate of the coefficient of determination, i.e., the estimate of the fraction of variance of the dependent variable $y_m$ explained by model (18), i.e.,
$$R_m^2 = 1 - \frac{D_{e,m}}{D_m} \tag{20}$$
where $D_m$ is a sample estimate of the variance of the output variable $y_m$, $D_{e,m}$ is the mean squared value of the $e_{t,m}$ errors, and $m = 1, 2, 3$. This gives $R_1^2 = 0.7061$, $R_2^2 = 0.4822$, and $R_3^2 = 0.5467$.
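As a quick arithmetic check, the coefficients of determination can be recomputed from the MSE and variance vectors reported above. The inputs below are the published four-digit values, so the results are expected to match the reported $R_m^2$ only up to rounding.

```python
# Reported values, rounded to four digits in the text.
mse = [0.0094, 0.0095, 0.0021]        # D_{e,m}: MSE of the model errors
variance = [0.0321, 0.0184, 0.0047]   # D_m: sample variances of the outputs
reported_r2 = [0.7061, 0.4822, 0.5467]

# R_m^2 = 1 - D_{e,m} / D_m for each output.
r2 = [1.0 - d_e / d for d_e, d in zip(mse, variance)]
max_gap = max(abs(a - b) for a, b in zip(r2, reported_r2))
```

The recomputed values agree with the reported ones to within 0.01, which is consistent with the rounding of the inputs.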
Assuming a sampling time of one hour, the estimate of the delay vector $\hat{\tau}_1$ for predicting the output variable $y_1$ is
$$\hat{\tau}_1 = (4.83 \;\; 0 \;\; 2.00 \;\; 5.00 \;\; 1.83 \;\; 0 \;\; 2.00 \;\; 0.83 \;\; 1.00 \;\; 2.00)$$
and the estimate of the coefficient vector is equal to
$$\hat{b}_1 = (0.0002 \;\; 0.1341 \;\; 0.0360 \;\; 0.0064 \;\; 0.0451 \;\; 2.3289 \;\; 0.0519 \;\; 0.0029 \;\; 0.0819 \;\; 0.0442)$$
with $D_{e,1}(\hat{b}_1, \hat{\tau}_1) = 0.0091$.
Similarly, for variables $y_2$ and $y_3$, we obtain
$$\hat{\tau}_2 = (0.33 \;\; 0.33 \;\; 1.67 \;\; 4.50 \;\; 0.50 \;\; 0.67 \;\; 0.33 \;\; 0.50 \;\; 0.50 \;\; 1.67)$$
$$\hat{b}_2 = (0.0263 \;\; 0.1481 \;\; 0.0315 \;\; 0.1947 \;\; 0.0168 \;\; 3.4223 \;\; 0.0064 \;\; 0.0092 \;\; 0.1513 \;\; 0.0385); \quad D_{e,2}(\hat{b}_2, \hat{\tau}_2) = 0.0088$$
$$\hat{\tau}_3 = (4.17 \;\; 0 \;\; 0.83 \;\; 4.33 \;\; 0.83 \;\; 0.50 \;\; 2.00 \;\; 0.67 \;\; 0.83 \;\; 1.00)$$
$$\hat{b}_3 = (0.0021 \;\; 0.0811 \;\; 0.0070 \;\; 0.0016 \;\; 0.0130 \;\; 0.3795 \;\; 0.0259 \;\; 0.0015 \;\; 0.0455 \;\; 0.0098); \quad D_{e,3}(\hat{b}_3, \hat{\tau}_3) = 0.0020$$
The sample estimates of the coefficient of determination for predicting the output variable $y_m$ with delays, denoted by $R_{L,m}^2$, are $R_{L,1}^2 = 0.7160$, $R_{L,2}^2 = 0.5200$, and $R_{L,3}^2 = 0.5726$.
The effect of delay accounting was evaluated on a test sample containing 167 measurements. As a result, the MSE of the predictions of output variables y1, y2 and y3 decreased by 23%, 10%, and 3%, respectively.
Now, let us consider modeling the error term. The spectral densities of the errors $e_{t,1}$ and $e_{t,3}$, shown in Figure 4 and Figure 5, have a maximum inside the interval [0, 0.5], which indicates that the denominator of the spectral density function $S(\omega)$ contains a factor $(1 - Ge^{-j\omega})$ with a complex-valued constant $G$. Since the sampling time is equal to 12 h, the frequency unit 1/(12 h) is used instead of Hz. However, for the practical application of the filter given by Equation (9), it is necessary that all the coefficients be real [8]. Therefore, the denominator of the density $S(\omega)$ must contain a factor $(1 - \bar{G}e^{-j\omega})$ along with the factor $(1 - Ge^{-j\omega})$. If the frequency response models for the $e_{t,1}$ and $e_{t,3}$ processes are limited to these two factors (assuming the numerator is equal to one), then the corresponding spectral density of the second-order autoregressive process approximates well the sample estimates of the spectrum of the $e_{t,1}$ and $e_{t,3}$ processes at different values of $G$. However, the insufficiently rapid decrease of the spectral density in the high-frequency region justifies including in the denominator of the model one more factor with a real value of the constant $G$.
In Figure 6, which shows the spectral density for the et, 2 errors, the sample spectrum of this time series resembles the spectrum of a first-order autoregressive process [15,16,17]. However, we note that the stochastic process is not uniquely determined by its spectral density [8]. Therefore, as previously mentioned, we need to include two additional constraints that the resulting model be invertible and realizable. This will ensure that we have a unique model.
Based on the theoretical properties of the process, the error models are
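$$\begin{aligned}
e_{t,1} - \eta_{11}e_{t-1,1} - \eta_{21}e_{t-2,1} - \eta_{31}e_{t-3,1} &= \varepsilon_{t,1}\\
e_{t,2} - \eta_{12}e_{t-1,2} &= \varepsilon_{t,2}\\
e_{t,3} - \eta_{13}e_{t-1,3} - \eta_{23}e_{t-2,3} - \eta_{33}e_{t-3,3} &= \varepsilon_{t,3}
\end{aligned} \tag{21}$$
where $\eta$ are the parameters to be determined. These parameters can be found using the approach presented in Section 2.1 by multiplying the finite impulse response model by the delayed errors and taking the expectations. For example, for $e_1$, this gives
$$\gamma_i = \eta_{11}\gamma_{i-1} + \eta_{21}\gamma_{i-2} + \eta_{31}\gamma_{i-3}, \quad i = 1, 2, 3 \tag{22}$$
where $\gamma_i = \operatorname{cov}(e_{t,1}, e_{t-i,1}) = \gamma_{-i}$.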
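The linear system for the initial guesses can be sketched as follows. The three equations for $i = 1, 2, 3$ form a symmetric Toeplitz system in the autocovariances, which can be solved directly once sample estimates are substituted. The function names are hypothetical, and the autocovariance sequence used in the example is an assumed one with $\gamma_i = 0.5^i$, not the plant data.

```python
import numpy as np

def sample_autocovariance(e, max_lag):
    """Biased sample autocovariances gamma_0 ... gamma_max_lag."""
    e = np.asarray(e, dtype=float) - np.mean(e)
    n = e.size
    return np.array([np.dot(e[:n - i], e[i:]) / n for i in range(max_lag + 1)])

def solve_autoregressive_equations(gamma):
    """Solve gamma_i = sum_k eta_k * gamma_|i-k| for i = 1..p, p = len(gamma) - 1."""
    order = len(gamma) - 1
    R = np.array([[gamma[abs(i - k)] for k in range(order)] for i in range(order)])
    return np.linalg.solve(R, np.asarray(gamma[1:], dtype=float))

# Assumed autocovariances of a first-order process with coefficient 0.5.
eta_init = solve_autoregressive_equations([1.0, 0.5, 0.25, 0.125])

# With data, the same solver would be fed sample estimates instead:
# eta_init = solve_autoregressive_equations(sample_autocovariance(errors, 3))
```

For this autocovariance sequence the solution is a first coefficient of 0.5 and zero higher-order coefficients, as expected for a first-order process fitted with a third-order model.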
For the process et, 1, the estimates of the coefficients η11, η21 and η31 are, respectively, equal to 0.4131, −0.0093, and −0.0528. These values were used as the initial guesses passed to the PEM function. As a result of calculations, the model parameters were found to be: η11 = 0.4175, η21 = 0.03234, η31 = −0.07026. The initial value of coefficient η12 is 0.3748 and its final value is η12 = 0.3758.
Similarly, using Equation (22), the initial guesses were η13 = 0.5142, η23 = −0.0507, and η33 = −0.0207 to give final values of η13 = 0.5151, η23 = −0.02676, and η33 = −0.03246.
The performance of the predictive filter models obtained from the analysis of the training dataset is validated using the testing sample. Figure 7, Figure 8 and Figure 9 compare the predictions against the true values, where the solid line shows the true $e_{t,m}$ errors and the dashed line their predicted values for $m = 1, 2,$ and $3$. At each time point $t$ on the x-axis, the plots show the corresponding error $e_{t,m}$ and the predicted error $\hat{e}_{t,m}$ computed at $t - 1$.
Figure 10, Figure 11 and Figure 12 compare the performance of the soft sensors with the proposed filter for error prediction and with a traditional method, in which an adaptive bias term is calculated based on the moving window (MW) approach [18]. It can be seen that the filter provides better tracking of the process values. This improves the accuracy of the overall soft sensor system, reducing the MSE of the output variables $y_1$, $y_2$, and $y_3$ by 32%, 67%, and 9.5%, respectively.

4. Conclusions

This paper proposed a new approach to handling the bias update term in a soft sensor system. Rather than purely using available samples, the new bias update term seeks to predict what the errors will be in the future. Tests of this approach on a reactive distillation column show that the approach can handle the errors well. However, the predictive filters used only work for areas without serious disturbances or outliers.
Therefore, it makes sense to consider more complex models for the predictive filters including models with an additional component in the form of some flow, for example, Poissonian flow, of events (outliers). If the flow of outliers is added to the process model then the intensity of this flow needs to be estimated. In this case, the number of outliers in the training dataset should be sufficient to estimate the intensity of the flow of outliers with acceptable accuracy.

Author Contributions

Conceptualization, all; methodology, Y.A.W.S., A.T., V.K.; software, A.T.; validation, A.T., Y.A.W.S.; formal analysis, all; resources, F.Y., V.K.; writing—original draft preparation, Y.A.W.S., A.T.; writing—review and editing, all; funding acquisition, F.Y., V.K., Y.A.W.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by RFBR and NSFC (grant numbers 21-57-53005 and 62111530057) and National Science and Technology Innovation 2030 Major Project (grant No.2018AAA0101604) of the Ministry of Science and Technology of China.

Data Availability Statement

Data can be obtained by contacting the authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shardt, Y.A.W. Data Quality Assessment for Closed-Loop System Identification and Forecasting with Application to Soft Sensors; University of Alberta Press: Edmonton, AB, Canada, 2012; Available online: https://era.library.ualberta.ca/items/8382f12a-8960-4508-9ede-0679e021394b (accessed on 2 January 2021).
  2. Bakirov, R.; Gabrys, B.; Fay, D. Multiple adaptive mechanisms for data-driven soft sensors. Comput. Chem. Eng. 2017, 96, 42–54.
  3. Funatsu, K. Process control and soft sensors. In Applied Chemoinformatics: Achievements and Future Opportunities; Engel, T., Gasteiger, J., Eds.; Wiley-VCH: Weinheim, Germany, 2018; pp. 571–584.
  4. Kim, S.; Kano, M.; Hasebe, S.; Takimi, A.; Seki, T. Long-term industrial applications of inferential control based on just-in-time soft-sensors: Economical impact and challenges. Ind. Eng. Chem. Res. 2013, 52, 12346–12356.
  5. Torgashov, A.; Skogestad, S. The use of first principles model for evaluation of adaptive soft sensor for multicomponent distillation unit. Chem. Eng. Res. Des. 2019, 151, 70–78.
  6. Griesing-Scheiwe, F.; Shardt, Y.A.; Pérez-Zuñiga, G.; Yang, X. Soft Sensor Design for Restricted Variable Sampling Time. J. Process Control 2020, 92, 310–318.
  7. Klimchenko, V.V.; Samotylova, S.A.; Torgashov, A.Y. Feedback in a predictive model of a reactive distillation process. J. Comput. Syst. Sci. Int. 2019, 58, 637–647.
  8. Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; Wiley: Hoboken, NJ, USA, 2016.
  9. Ljung, L. System Identification; Prentice Hall: Englewood Cliffs, NJ, USA, 1987.
  10. Shardt, Y.A.W. Statistics for Chemical and Process Engineers; Springer: Berlin/Heidelberg, Germany, 2015.
  11. Klimchenko, V.V. Decomposition of the multi-dimensional time series identification problem. Autom. Remote Control 2008, 69, 845–857.
  12. Anderson, T.W. An Introduction to Multivariate Statistical Analysis, 3rd ed.; John Wiley: New York, NY, USA, 2003.
  13. Brillinger, D.R. Time Series: Data Analysis and Theory; SIAM: Philadelphia, PA, USA, 2001.
  14. Marsaglia, G.; Tsang, W.W.; Wang, J. Evaluating Kolmogorov's Distribution. J. Stat. Softw. 2003, 8, 1–4.
  15. Hoff, J.C. A Practical Guide to Box-Jenkins Forecasting; Lifetime Learning Publications: Belmont, CA, USA, 1983.
  16. Hannan, E.J.; Deistler, M. Statistical Theory of Linear Systems; John Wiley and Sons: New York, NY, USA, 1988.
  17. Marple, S.L. Digital Spectral Analysis, 2nd ed.; Courier Dover Publications: Chicago, IL, USA, 2019.
  18. Kadlec, P.; Grbic, R.; Gabrys, B. Review of adaptation mechanisms for data-driven soft sensors. Comput. Chem. Eng. 2011, 35, 1–24.
Figure 1. Soft sensor system of interest [1].
Figure 2. Bias update term as a predictive model with feedback; the legend identifies the plant and the predictive model.
Figure 3. Reactive distillation unit of MTBE production.
Figure 4. Sample spectrum of the process e_{t,1}.
Figure 5. Sample spectrum of the process e_{t,3}.
Figure 6. Sample spectrum of the process e_{t,2}.
Figure 7. Prediction of the process e_{t,1}.
Figure 8. Prediction of the process e_{t,2}.
Figure 9. Prediction of the process e_{t,3}.
Figure 10. Estimation of ym1.
Figure 11. Estimation of ym2.
Figure 12. Estimation of ym3.
Table 1. Soft sensor input and output variables.

| Description of Process Variable | Notation | SS Variable |
|---|---|---|
| Feed flowrate, m³/s | FIR-1 | x1 |
| MeOH flowrate to Rx, m³/s | FIR-2 | x2 |
| Reflux flowrate, m³/s | FIR-3 | x3 |
| MeOH flowrate to P-Rx, m³/s | FIR-4 | x4 |
| Bottoms flowrate from Rx, m³/s | FIR-5 | x5 |
| Bottom pressure, MPa | PIR-1 | x6 |
| Temperature in P-Rx, K | TIR-1 | x7 |
| Temperature in Rx, K | TIR-2 | x8 |
| Bottom temperature, K | TIR-3 | x9 |
| Vapor flow temperature from C-1, K | TIR-4 | x10 |
| Concentration of MSBE in MTBE, wt.% | - | y1 |
| Concentration of MeOH in MTBE, wt.% | - | y2 |
| Concentration of DIME in MTBE, wt.% | - | y3 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Klimchenko, V.; Torgashov, A.; Shardt, Y.A.W.; Yang, F. Multi-Output Soft Sensor with a Multivariate Filter That Predicts Errors Applied to an Industrial Reactive Distillation Process. Mathematics 2021, 9, 1947. https://doi.org/10.3390/math9161947
