Dynamic Soft Sensor Development for Time-Varying and Multirate Data Processes Based on Discount and Weighted ARMA Models

Abstract: To solve the soft sensor modeling (SSMI) problem in a nonlinear chemical process with dynamic time variation and multirate data, this paper proposes a dynamic SSMI method based on an autoregressive moving average (ARMA) model of weighted process data with discount (DSSMI-AMWPDD) and optimization methods. To capture the sustained influence of auxiliary variable data on the dominant variables, the ARMA model structure is adopted. To reduce the complexity of the model, the dynamic weighting model is combined with the ARMA model. To address the weights of auxiliary variable data with different sampling frequencies, a calculation method for AMWPDD is proposed using assumptions that are suitable for most sequential chemical processes. The proposed method can obtain a discount factor value (DFV) of auxiliary variable data, realizing the dynamic fusion of chemical process data. Particle swarm optimization (PSO) is employed to optimize the soft sensor model (SSM) parameters. To address the poor convergence of PSO, ω-dynamic PSO (ωDPSO) is used to improve the PSO convergence via the dynamic fluctuation of the inertia weight. A continuous stirred tank reactor (CSTR) simulation experiment was performed. The results show that the proposed DSSMI-AMWPDD method can effectively improve the SSM prediction accuracy for a nonlinear time-varying chemical process. The proposed AMWPDD can reflect the dynamic change of the chemical process and improve the accuracy of SSM data prediction. The proposed ωDPSO method has faster convergence speed and higher convergence accuracy; thus, these models correlate with the concept of symmetry.


Introduction
To reflect the dynamic change of a chemical process, this paper proposes the DSSMI-AMWPDD method, which can effectively improve the prediction accuracy of the SSM for nonlinear time-varying chemical processes. The proposed model is related and complementary to the concept of symmetry, and the research direction is consistent with the scope of Symmetry, so it can serve as a reference for scholars in related fields. In chemical production, major process variables such as product quality are characterized by a slow sampling rate and time delay [1]. To keep the main process variables stable, it is necessary to estimate them from process variables that are easy to acquire. Therefore, soft sensor modeling (SSMI) is of great significance. Since most chemical processes lack clear mechanistic principles but have strong nonlinear and dynamic time-varying characteristics, the use of data-driven methods to establish an industrial soft sensor model (SSM) [2][3][4] has become the focus of research. In particular, how to establish a suitable nonlinear dynamic model has become an important research topic.
Related modeling methods are generally divided into four types: multipoint input modeling [5][6][7], dynamic weighting modeling [8,9], feedback network modeling [10,11], and multimodel structure modeling [12][13][14]. Among these, multipoint input modeling is simple, easy to implement, and able to fully reflect the process characteristics. However, fully reflecting the dynamic characteristics of the process requires a large number of high-dimensional input variables, which increases the number and complexity of the internal parameters and results in ill-conditioned models. Dynamic weighting modeling uses dynamic weighting to form new input variables, which reduces the number of model input nodes and lowers the model complexity, making the method simple and easy to apply. However, the dynamic weights and historical data (HD) lengths are difficult to determine. Feedback network modeling updates the input through the delay link of the feedback loop and updates the structure or structural parameters to approximate the object function. However, the model suffers from poor stability, large deviations, convergence failures, and an inability to fully reflect the dynamic information of the process; in addition, it is not commonly used because its training process is complex. A representative multimodel structure is the Wiener structure, in which linear dynamic and nonlinear static submodels are built to describe the dynamic characteristics of a system. It provides a good approximation, but the dual-model architecture is complex and difficult to identify.
The use of the above methods leads to inaccurate SSM data prediction, either because the sample data for modeling do not fully reflect the dynamic characteristics of the process or because it is difficult to perform the modeling and determine the parameters. To improve the prediction accuracy of SSM data and simplify the dynamic soft sensor modeling structure, this paper proposes an autoregressive moving average (ARMA) model of weighted process data with discount (AMWPDD) structure, which has better flexibility in actual time series data fitting [15] and is simple and easy to implement. Under assumptions suitable for most sequential chemical processes, the discount factor (DF) is introduced for the auxiliary variable HD of a chemical process. Additionally, a DF calculation method and the corresponding constraints are proposed. Weights are assigned to the auxiliary variable HD of the chemical process through the calculated DFV, and the auxiliary variable HD is then fused, which resolves the problem that the weights for the HD are not easy to determine. Constraining the sum of the DFVs to 1 reflects the integrity of the auxiliary variable HD. The problem of determining the weights of auxiliary variable HD of different lengths can be solved by the exponential addition of DFVs according to the length of the auxiliary variable HD. Therefore, the dynamic fusion of the process data is realized, the sample data quality of the SSM modeling is enhanced, and the SSM prediction accuracy is improved. As chemical processes often present significant dynamics and delays [11], the study of DSSMI-AMWPDD is important.
A least squares support vector machine (LSSVM) has been proposed to deal with data regression tasks, and its success has been demonstrated in some supervised learning cases [16]. However, the LSSVM parameters affect the training performance of the model and are difficult to determine, so an intelligent optimization algorithm is generally used to realize the nonlinear robust identification of the LSSVM [17]. Particle swarm optimization (PSO) [18] has attracted much attention because it is easy to implement and has few parameters to adjust [19][20][21][22][23]. However, PSO may be trapped in local optima, and its convergence performance is very weak in later iterations [24]. To address these problems, this paper proposes an improved particle swarm method that makes the inertia weight fluctuate periodically through dynamic adjustment, improving the search precision and the convergence of the algorithm.

Problem Statement
The chemical process is a sequential production process. Generally, the input variables can be sampled continuously, whereas only sparse samples of the output variables are available [25].
Only a small number of process parameters are used, while past data containing a large amount of dynamic information are ignored [12], as shown in Figure 1. The problem studied in this paper is how to rationally integrate the dynamic characteristics of a chemical process into the SSMI data, convert dynamic process data into static data, establish an SSM, and optimize the model parameters so as to improve the SSM prediction accuracy.

AMWPDD Sample Data Processing
Adding a large number of historical inputs not only increases the number and complexity of the parameters of the model but also causes the ill conditioning of the model due to the excessively high dimensionality of the input variables.
In consideration of the sustained influence of the auxiliary variable data on the dominant variables in the chemical process and the corresponding difficulty in determining the degree of such influence, and to reduce the number of input nodes and lower the complexity of the model, the ARMA model structure is used to establish the input data vector using historical inputs and historical outputs.
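The conventional ARMA(p, q) model is as follows [7]:

$$\varphi(B)\, y_t = \theta(B)\, \varepsilon_t$$

in which p and q are the orders of the model, φ is the AR operator of order p, θ is the MA operator of order q, B is the backshift operator, and $\varepsilon_t$ is a white noise term.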
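As a minimal sketch of how an ARMA(p, q) recursion generates a series, the following Python snippet assumes the common difference-equation convention y_t = Σ φ_i y_{t-i} + ε_t + Σ θ_j ε_{t-j} (the sign convention for θ differs between textbooks). The function name `simulate_arma` and the coefficient values are illustrative and are not taken from this paper.

```python
import numpy as np

def simulate_arma(phi, theta, n, seed=0):
    """Simulate n points of an ARMA(p, q) series driven by Gaussian white noise.

    phi   : AR coefficients (length p), applied to past outputs
    theta : MA coefficients (length q), applied to past noise terms
    Assumed convention: y_t = sum_i phi_i*y_{t-i} + eps_t + sum_j theta_j*eps_{t-j}
    """
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    eps = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(n):
        # Terms that would reach before the start of the series are skipped.
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y[t] = ar + eps[t] + ma
    return y, eps

y, eps = simulate_arma(phi=[0.5], theta=[0.4], n=200)
```

The AR part carries the sustained influence of past outputs, while the MA part carries the influence of past disturbances.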
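The multipoint input ARMA model used in this paper is shown in Figure 2.

Figure 2. Multipoint input ARMA model, with inputs $y(t_{k-1})$, $u(t_{k-T})$, …, $u(t_{k-1})$ and output $y(t_k)$.

The generalized function of the multipoint input ARMA model shown in Figure 2 is expressed as:

$$y(t_k) = f\big(y(t_{k-1}),\, u(t_{k-T}),\, \dots,\, u(t_{k-1})\big)$$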
Adding the output of the previous batch as an effective input to the modeling sample data can reduce the number of historical input nodes, which reduces the number of model parameters and the complexity of the model [26] and lowers the risk of an ill-conditioned model. However, the multipoint input ARMA model shown in Figure 2 still has numerous historical input nodes, which leave the model structure incomplete and the model complexity high.
To solve these problems, we combine the dynamic weighting model [27] with the ARMA model to realize the dynamic weighted fusion of the input nodes and hence reflect the dynamic characteristics of the process, as shown in Figure 3. Its generalized function is expressed as:

The dynamic weight directly reflects the dynamic characteristics of the process and affects the accuracy of the overall model. However, it is difficult to obtain accurate values of the dynamic weights. To solve this problem, this paper proposes an assumption that is suitable for most sequential chemical processes, based on a detailed site investigation of the chemical process.
If the value of the input node changes in different time periods, the further away an input point is from the output time, the less influence the value of the input node has on the change in the value of the output node.
In other words, among the values of an input node sampled at different times, a value sampled further from the output time has a lower degree of influence on the output than a value sampled closer to it. Based on this assumption, this paper introduces DF λ and uses the discount method [28,29] to apply dynamic discount weights to the auxiliary variable values at different sampling time points in the same batch. In this way, the impact of recent sample data on the model is enhanced, the role of past samples is reduced, the input node values from tk to tk+T are fused, and the problem that the weights of the dynamic weighting model are difficult to determine is solved.
The structure of the AMWPDD proposed in this paper is shown in Figure 4.

Figure 4. Structure of the AMWPDD, with inputs $y(t_{k-1})$, $u(t_{k-T})$, …, $u(t_k)$ and a nonlinear static model.

Its generalized function is as follows:

Here, the process input data are obtained by DF weighted fusion. However, the transition time T of the input variable of the chemical process, that is, the HD length, is difficult to determine. To ensure the data integrity of the batch input and output variables, this paper proposes a calculation method of DF λ in combination with the above assumption.
The numerical constraints on DF λ are given in Equation (10), and the calculation formula of DF λ is given in Equation (11). From Equations (10) and (11), one can obtain the dynamic DF values $\lambda_i$, $i = 1, 2, \cdots, T$, for different transition times T, which realizes the dynamic calculation of the DF λ value and yields more accurate data fusion weights.
Then, the sample data set S for SSM modeling is constructed. Each sample consists of the j input variables at time $t_k$, the output variable $y(t_k)$ at time $t_k$, and the output variable $y(t_{k-1})$ at time $t_{k-1}$, where $k = 1, 2, \cdots, M$ indexes the sampling times at which the system outputs the M sample points.
On the basis of the sample data set S for the SSM, the SSM is established by the SSMI method.
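The discount-weighted fusion and the construction of the sample set can be illustrated with a short Python sketch. Because Equations (10) and (11) are not reproduced here, the weights below are a hypothetical stand-in: normalized geometric weights that sum to 1 and grow toward the most recent sample, which is consistent with the stated assumption but is not necessarily the formula used in this paper. The names `discount_weights` and `build_samples` and the parameter `lam` are illustrative.

```python
import numpy as np

def discount_weights(T, lam):
    """Hypothetical discount weights for T historical points, oldest first.

    Weight i is proportional to lam**(T - 1 - i) with 0 < lam < 1, so the most
    recent point receives the largest weight; the weights are normalized to sum to 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie in (0, 1)")
    raw = lam ** np.arange(T - 1, -1, -1)
    return raw / raw.sum()

def build_samples(u, y_times, y, T, lam):
    """Fuse the T fast-rate inputs up to each labeled instant and append the previous label.

    u       : array (n_steps, n_inputs) of fast-rate auxiliary variables
    y_times : row indices of u at which the slow-rate output is available
    y       : outputs observed at y_times
    Returns (X, target); each row of X is [fused inputs..., previous output].
    """
    w = discount_weights(T, lam)
    X, target = [], []
    for k in range(1, len(y_times)):
        t = y_times[k]
        if t - T + 1 < 0:
            continue  # not enough history before this label
        window = u[t - T + 1 : t + 1]   # shape (T, n_inputs), oldest row first
        fused = w @ window              # discount-weighted fusion of each input
        X.append(np.concatenate([fused, [y[k - 1]]]))
        target.append(y[k])
    return np.array(X), np.array(target)
```

Each labeled output then yields one static sample whose inputs summarize the preceding T fast-rate measurements, so the dimension of the model input does not grow with the HD length.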

LSSVM-Based SSMI
The least squares support vector machine (LSSVM) is a machine learning method proposed by Suykens [30] for solving function estimation problems. It offers good calculation speed, convergence precision, and generalization performance and is well suited to the small-sample SSMI of chemical processes [31].
The LSSVM model [32] is:

$$y(x) = \omega^{T} \varphi(x) + b$$

where $\varphi(\cdot)$ is a nonlinear transformation function, ω is an adjustable weight vector, and b is an offset.
The objective function of LSSVM [33] is as follows:

$$\min_{\omega, b, e} J(\omega, e) = \frac{1}{2}\omega^{T}\omega + \frac{C}{2}\sum_{i=1}^{N} e_i^{2} \quad \text{s.t.} \quad y_i = \omega^{T}\varphi(x_i) + b + e_i, \quad i = 1, 2, \cdots, N$$

where $x_i \in R^{n}$ refers to the input vector, $y_i \in R$ represents the corresponding output, N is the number of training samples, $e_i$ is the difference between the system output value and actual value, $C \ge 0$ represents a regularization parameter used to minimize the estimation error and control the function smoothness, $\varphi(\cdot)$ refers to a nonlinear mapping from the input space to the feature space, ω is the system weight, b is an offset, and s.t. indicates a constraint condition.
The Lagrangian function of the optimization problem is solved using the Karush-Kuhn-Tucker (KKT) conditions, and the LSSVM model for function estimation can be expressed as:

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b$$

where $\alpha_i$ are the Lagrange multipliers and K is the kernel function. Since the radial basis kernel function has been widely used [34], it is chosen in this paper:

$$K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^{2}}{2\sigma^{2}}\right)$$

After the kernel function is determined, the error term penalty parameter C and the kernel function parameter σ² affect the regression performance of the LSSVM method. However, they are difficult to determine. To ensure the optimal regression performance of the LSSVM, this paper uses PSO to optimize the error term penalty parameter C and the kernel function parameter σ².
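As a minimal sketch of LSSVM function estimation, the KKT conditions reduce training to a single linear system in the offset b and the Lagrange multipliers α. The Python code below assumes the standard dual form, with a radial basis kernel written as exp(−‖x − x′‖² / (2σ²)); the function names and the parameter values in the usage lines are illustrative and are not taken from this paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Radial basis kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

def lssvm_fit(X, y, C, sigma2):
    """Solve the LSSVM dual system  [0 1^T; 1 K + I/C] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / C
    rhs = np.concatenate([[0.0], y])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # offset b, multipliers alpha

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    """Evaluate y(x) = sum_i alpha_i K(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Usage on a toy one-dimensional regression problem.
X = np.linspace(0.0, 1.0, 20)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
b, alpha = lssvm_fit(X, y, C=1e3, sigma2=0.05)
y_hat = lssvm_predict(X, b, alpha, 0.05, X)
```

The two hyperparameters C and σ² enter only through the matrix K + I/C, which is why a search over this pair, such as the PSO-based one described in the text, requires one linear solve per candidate.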

Model Parameter Optimization Based on ωDPSO
To improve the prediction accuracy of the SSM, the optimization objective is set to minimize the sum of the squared errors between the actual sample output $y_i$ and the predicted sample output $\hat{y}_i$. The optimization objective function is as follows:

$$\min J = \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^{2}$$

where N is the number of samples.
PSO [18] is an intelligent algorithm that simulates the predation behavior of bird flocks and fish schools.
In PSO, each particle has its own position, velocity, and fitness with respect to the optimization target. PSO randomly sets a certain number of particles, initializes their positions and velocities, and completes the optimization process by iteration. The iterative process generally tracks two optimal values: the Personal Best Value (pbest) and the Global Best Value (gbest); pbest represents the optimal fitness of the particle itself, and gbest refers to the optimal fitness of all particles. The particle updates its position and velocity by tracking these two extreme values (pbest, gbest), and the update formulas are as follows:

$$v_{id}(t+1) = \omega\, v_{id}(t) + c_1 \cdot rand() \cdot \big(p_{id} - x_{id}(t)\big) + c_2 \cdot rand() \cdot \big(p_{gd} - x_{id}(t)\big) \tag{19}$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{20}$$

where $v_{id}$, $x_{id}$, and $p_{id}$ represent the velocity, position, and personal best value, respectively, of particle i in iteration t; $p_{gd}$ is the global best value; rand() is a random number in [0, 1]; $c_1$ and $c_2$ are learning factors, which represent the weights of the stochastic acceleration terms that push each particle toward the pbest and gbest locations; ω is the inertia weight; and t is the number of iterations.

Since the inertia weight governs the exploitation and exploration ability of the particle, it affects the convergence of the algorithm [35,36]. A larger ω value helps the particle jump out of a local optimum and favors global optimization; a smaller ω value favors local optimization and accelerates the convergence of the algorithm. When the search process is nonlinear and highly complex, a linearly decreasing inertia weight does not effectively reflect the actual search process [37]. Moreover, the linearly decreasing inertia weight strategy based on the number of iterations has a weak local search in early iterations, so a particle might miss the optimal value even when it is close to it; in late iterations, the global search ability is weak, and the algorithm easily falls into a local optimum.
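The velocity and position updates of standard PSO can be sketched in a few lines of Python. This is a minimal illustration with a constant inertia weight; the function name `pso_minimize`, the default parameter values, and the clipping of positions to the search bounds are assumptions made for the example and are not specified in this paper.

```python
import numpy as np

def pso_minimize(f, lower, upper, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using standard PSO with a constant inertia weight w."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, dim))   # positions
    v = np.zeros_like(x)                                      # velocities
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull toward pbest + pull toward gbest.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # Position update, kept inside the search box (an added assumption).
        x = np.clip(x + v, lower, upper)
        val = np.array([f(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

# Usage: minimize the two-dimensional sphere function.
best_x, best_val = pso_minimize(lambda p: float(np.sum(p**2)), [-5, -5], [5, 5])
```

In the soft sensor setting, `f` would map a candidate pair (C, σ²) to the sum of squared prediction errors of the resulting LSSVM model.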
In this paper, an ωDPSO is proposed, which improves the convergence of the algorithm by dynamically adjusting the inertia weight ω, as shown in Figure 5. The equation for the inertia weight is expressed as:

As shown in Figure 5, in contrast to the linearly decreasing inertia weight strategy based on the number of iterations, the inertia weight ω proposed in this paper varies from large to small, then from small to large, and again from large to small, showing a periodic sawtooth fluctuation. As a result, the particles periodically alternate between focusing on the global search and focusing on the local search. This balances the global search and the local search, avoids being trapped in a local optimum, and improves the convergence of the PSO algorithm. At the same time, the fraction of each cycle that precedes the peak is adjusted according to the ratio of the current iteration number to the total number of iterations, so that the inertia weight ω increases rapidly and decreases slowly in early iterations, increases and decreases at similar rates in the middle iterations, and increases slowly and decreases rapidly in late iterations.
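Since the inertia weight equation itself is not reproduced here, the following Python sketch shows only one hypothetical schedule with the qualitative behavior described above: a periodic sawtooth whose peak moves from the start of the cycle toward its end as the iterations progress. The function name `sawtooth_inertia`, the cycle length `period`, the bounds `w_min` and `w_max`, and the clipping of the peak position to [0.05, 0.95] are all assumptions and should not be read as the formula used in this paper.

```python
def sawtooth_inertia(t, t_max, period=20, w_min=0.4, w_max=0.9):
    """Hypothetical sawtooth inertia weight for iteration t out of t_max.

    Within each cycle the weight rises linearly from w_min to w_max and then
    falls back toward w_min. The fraction of the cycle spent rising equals the
    progress ratio at the start of the cycle, clipped to [0.05, 0.95].
    """
    cycle_start = t - t % period
    rise = min(max(cycle_start / t_max, 0.05), 0.95)
    phase = (t % period) / period
    if phase < rise:
        return w_min + (w_max - w_min) * phase / rise
    return w_max - (w_max - w_min) * (phase - rise) / (1.0 - rise)

# Inertia weights over a 200-iteration run.
schedule = [sawtooth_inertia(t, 200) for t in range(200)]
```

Early cycles therefore rise quickly and decay slowly, favoring a long local refinement after each global jump, whereas late cycles rise slowly and drop quickly.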
Through ωDPSO, the penalty parameter C and the kernel function parameter σ² of the LSSVM SSMI method are numerically optimized.

Simulation and Analysis
A CSTR is used as the study object to test the predictive performance of the SSMI method proposed in this paper for nonlinear dynamic time-varying chemical processes. The CSTR data come from a computer simulation. In addition, to evaluate the performance of the proposed method, the LSSVM SSM, PSO-LSSVM SSM, and ωDPSO-LSSVM SSM of the CSTR object are established on both dynamic and static data.
Given that the mean absolute error (MAE), root mean square error (RMSE), and running time (RT) are common and widely used evaluation indices in regression analyses, this paper uses the RMSE in Equation (23) and the RT to evaluate the training performance of the SSM and the MAE in Equation (22) and the RMSE in Equation (23) to evaluate the prediction accuracy of the SSM:

$$MAE = \frac{1}{N}\sum_{i=1}^{N} \left|\hat{y}_i - y_i\right| \tag{22}$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^{2}} \tag{23}$$

where N is the total number of samples, $\hat{y}_i$ is the predicted output value, and $y_i$ is the actual value.
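The two accuracy indices can be computed as in the following minimal Python sketch, which uses the standard definitions of MAE and RMSE; the function names and the sample values are illustrative.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Example: a single error of 2 among three samples.
print(mae([1, 2, 3], [1, 2, 5]), rmse([1, 2, 3], [1, 2, 5]))
```

Because the RMSE squares the errors before averaging, it penalizes a few large deviations more strongly than the MAE does, which is why the two indices are reported together.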

CSTR Simulation Experiment and Result Analysis
The continuous stirred tank reactor is the most important piece of equipment in many chemical and biochemical industries and has second-order nonlinear dynamic characteristics [38]. Therefore, it can be used to test the ability of the SSMI method to solve nonlinear and time-varying problems.
The principle of the CSTR [39] is shown in Figure 6, with a description of each variable and the values of the steady-state operating points shown in Table 1 [40]. The concentration CA of the raw material A in the reactor is considered to be the dominant variable of the SSM. The feed flow rate Fi, the cooling water flow rate Fc, and the reactor internal temperature Tr are treated as auxiliary variables of the SSM.

Cpc: cooling water specific heat capacity, 1 cal/(g·K)

In the CSTR simulation process, the sampling periods of the auxiliary and dominant variables are set to 1 h and 12 h, respectively. The simulation time is set to 265 h, and white Gaussian noise is added to each auxiliary variable. A total of 265 groups of usable data are obtained, 23 of which are labeled and the rest are dynamically unlabeled. The first 168 groups of data are used as the training sample set, and the last 96 groups of data are used as the test sample set. A total of 22 groups of dynamic fusion data and static data with the same output are used as the SSMI samples. To simulate the reduction and recovery of the catalyst activity in the reactor, the catalyst activity k0 is set for the data simulation based on the variation pattern in Figure 7, and the simulated data set is then normalized.

The following model structure is used for static data-based modeling:

With expert knowledge and the dynamic characteristics of the CSTR, the following model structure is adopted for dynamic fusion data-based modeling:

From Table 2, it can be seen that the models trained by PSO-LSSVM and ωDPSO-LSSVM are closer to the actual data. Compared with that of the LSSVM method, the RMSE index increases by 1.35% and 1.57%, respectively, and the RT index falls by factors of 30.16 and 25.73, respectively.
Compared to those of the PSO-LSSVM method, the RMSE and RT indices of the ωDPSO-LSSVM method rise by 0.23% and 14.24%, respectively.
The training performances of the different SSMI methods are shown in Table 2.

Figure 9.

Compared to those of the PSO-LSSVM method, the MAE and RMSE values of the ωDPSO-LSSVM method increase by 1.6% and 1.9%, respectively, indicating a higher prediction accuracy.
The prediction performances of the different SSMI methods are shown in Table 3.

As shown in Figure 10, the different SSMI methods have different training effects on the dynamic fusion data. Combined with the RMSE values of the models built by the different soft sensor methods, as listed in Table 4, it is found that, compared with those of the LSSVM method, the RMSE indices of the PSO-LSSVM and ωDPSO-LSSVM methods rise by 3.81% and 4.11%, respectively, and their RT indices decrease by factors of 35.49 and 26.92, respectively.
Compared with those of the PSO-LSSVM method, the RMSE and RT indices of the ωDPSO-LSSVM method decrease by 0.31% and 6.8%, respectively.
The training performances of the different SSMI methods are shown in Table 4.

Figure 11 shows that the prediction curves of the PSO-LSSVM and ωDPSO-LSSVM methods are closer to the actual values. As seen in Table 5, compared with those of the LSSVM method, the MAE values of the PSO-LSSVM and ωDPSO-LSSVM methods grow by 1.51% and 3.4%, respectively, and their RMSE values rise by 4.83% and 7.47%, respectively. Compared with those of the PSO-LSSVM method, the MAE and RMSE values of the ωDPSO-LSSVM method increase by 1.92% and 2.77%, respectively.
The prediction performances of the different SSMI methods are shown in Table 5.

… 22.53%, and 23.21%, respectively. The models established with dynamic fusion data and the corresponding data prediction accuracy are better than those obtained with static data. The main reason is that the chemical process is a continuous time series production process, and changes in the values of the auxiliary variables affect the values of the subsequent dominant variables. Modeling based only on the current static data is unable to reflect the process variation of the auxiliary variables, resulting in poor training of the model and low precision of the data prediction. Given the influence of the inertia weight coefficient ω on the convergence performance of the PSO method, the ωDPSO-LSSVM method achieves better prediction performance than the PSO-LSSVM method.

Simulation Experiment and Result Analysis
In Section 5.1, the simulation data of the CSTR are used to experimentally verify the proposed DSSMI-AMWPDD method based on ωDPSO. A comparison of the modeling using dynamic fusion data and static data, together with the experimental results of data prediction, shows that the SSM established with dynamic fusion data is superior to that established with static data in terms of model accuracy and data prediction precision. Additionally, this paper trains the PSO-LSSVM method and the ωDPSO-LSSVM method 10 times on the CSTR data and selects the run with the best training effect as the experimental result. The results show that the ωDPSO-LSSVM method achieves better prediction performance, shorter training time, and stronger convergence.

Conclusions
In this work, with chemical processes as the research setting, modeling on CSTR simulation data shows that the AMWPDD proposed in this paper can reflect the dynamic changes of the chemical process and improve the accuracy of the SSM data prediction. Furthermore, the simulation results show that, compared with the standard PSO method, the ωDPSO method better balances the local and global search capabilities, with faster convergence speed and higher convergence accuracy.