Numerical Aspects of Data Reconciliation in Industrial Applications

: Data reconciliation is a model-based technique that reduces measurement errors by making use of redundancies in process data. It is largely applied in modern process industries, being commercially available in software tools. Based on industrial applications reported in the literature, we have identiﬁed and tested different conﬁguration settings providing a numerical assessment on the performance of several important aspects involved in the solution of nonlinear steady-state data reconciliation that are generally overlooked. The discussed items are comprised of problem formulation, regarding the presence of estimated parameters in the objective function; solution approach when applying nonlinear programming solvers; methods for estimating objective function gradients; initial guess; and optimization algorithm. The study is based on simulations of a rigorous and validated model of a real offshore oil production system. The assessment includes evaluations of solution robustness, constraint violation at convergence, and computational cost. In addition, we propose the use of a global test to detect inconsistencies in the formulation and in the solution of the problem. Results show that different settings have a great impact on the performance of reconciliation procedures, often leading to local solutions. The question of how to satisfactorily solve the data reconciliation problem is discussed so as to obtain improved estimates.


Introduction
Reliable data is essential to any industrial process.From process operating to planning and scheduling, it is used to support decisions that affect product quality, plant profitability, and plant safety.It has implications in environmental and legal issues, such as pollution monitoring by state agencies.In production accounting, it is involved in the quantification of significant issues, such as trade and custody transfer and process inventories.However, especially in industrial environments, obtaining reliable information during daily operation is not a trivial task.
Measurements of process variables, both on-and off-line, are subject to random errors (noise), such as natural disturbances, sampling errors, unreliable instrument readouts and laboratory analyses inaccuracies, and nonrandom errors, such as sensor bias or sensor failure (gross errors) [1].Random noise may also be the outcome of other effects, such as high-frequency pick-up, low resolution, and errors in transmission and conversion (including A/D and D/A conversion) [2], usually attributed to the irreproducibility of the measurement devices [3].Consequently, the collected data generally do not even satisfy basic process constraints, such as the mass and energy balances.Inaccurate process data could result in misleading or inaccurate conclusions, thus leading to poor decisions that can adversely affect many plant functions.Thus, to reduce the impacts of measurement errors in plant measurements and to increase the value of data accessible through implemented data management systems, data rectification should be employed [4].
Data rectification is a collection of techniques for correcting data and is comprised of different steps, such as variable classification, gross error detection, parameter estimation, and data reconciliation [5].At its core, data reconciliation (DR) is a model-based technique that reduces measurement errors by making use of redundancies in process data.It aims at estimating the true state of the plant based on the adjustment of process measurements in order to satisfy a set of constraints, which is accomplished by minimizing some sort of deviation between corrected and observed plant values.Unlike other filtering techniques, DR makes explicit use of the process model so that reconciled estimates are consistent with the known relationships between process variables as defined by the constraints, as well as being expected to be more accurate than the measurements [6,7].
Data reconciliation may also be used as an effective monitoring tool, producing estimates for non-measured variables and process parameters.In many systems and applications, parameter estimation is the step after data reconciliation, where reconciled values of the measured variables are used to estimate/update model parameters [8][9][10].Nonetheless, procedures for data reconciliation with simultaneous parameter estimation (DRPE) are equally valid and efficient [11][12][13][14][15].It is worth noting that measurements of independent variables are typically assumed to be free of error.In this context, DR may also be applied when both dependent and independent variables contain errors, which some authors label as an error-in-variables method (EVM) [16,17].EVM provides both parameter estimates and reconciled data estimates that are consistent with respect to the model.Besides, techniques for joint data reconciliation and parameter estimation with simultaneous gross error detection have emerged based on robust statistics, which are especially important for on-line industrial applications [18].Many robust estimators have been proposed to reduce the effect of gross errors and yield less biased estimates [19][20][21][22][23]. Alternative approaches to gross error detection involve the use of principal component analysis [24,25], cluster analysis [26], artificial neural networks [27], and robust [28] and correntropy [29] estimators.More recently, a practical DRPE methodology for process systems with multi-operating conditions along with a correntropy estimator [30] and a unified framework for applying principal component analysis (PCA) and DR [25] have been presented.
The literature regarding data reconciliation is comprised of a vast number of works and good compilations of the techniques available and can be found in the open literature [28,31,32] as well as the comprehensive books [5,[33][34][35][36].Beginning with steady-state problems with linear constraints for the case when all variables are measured [37], advances in optimization techniques have allowed more complex formulations addressing steady-state nonlinear [19,[38][39][40][41], dynamic linear [42][43][44] and dynamic nonlinear [15,26,[45][46][47] data reconciliation problems.In each of these formulations, the process model, represented by the (differential) algebraic equation system, is a constraint of the mathematical programming model representing the estimation problem, where the solution of the system's equations is required to solve the optimization problem.
Largely applied in most modern process industries, DR is commercially available in software tools such as Aspen Advisor (Aspen Technology, Inc., Bedford, MA, USA), VALI (Belsim, Awans, Belgium), VisualMesa (Soteica, Houston, TX, USA), SimSci DATACON (Schneider Electric Software, Lake Forest, CA, USA, LLC), and Sigmafine (OSIsoft, San Leandro, CA, USA), among others [48].In addition to in-house systems [49][50][51], these tools can be used to reconcile flow, temperature and composition measurements to satisfy material and energy balances around each unit in a process plant, as well as to estimate parameters [52][53][54].Furthermore, the increasing availability and decreasing cost of high performance computers have allowed the implementation of more robust and complex applications, where reconciled measurements are used in on-line applications such as performance monitoring, process control, and real-time optimization [55][56][57].
Given the importance of DR techniques, its expressive body of knowledge and its availability in commercial tools, detailed descriptions of data reconciliation procedures and its use in actual industrial applications are relatively scarce in the open literature.This apparent discrepancy may be partially attributed to proprietary reasons and confidential constraints.Works related to industrial DR implementations are mainly concerned with the challenges in obtaining a precise mathematical representation of the process given the information available.They typically concentrate on developing procedures and computation strategies to new systems, which most often involve formulating the DR problem based on a rigorous process model and solving it using a typical optimization algorithm suitable for high-dimensional mathematical programming problems, as well as more robust gross error detection schemes.Although they are of undeniable technical value, the implementation and advance of on-line DR applications require the choice of the most suitable implementation setting.
Indeed, the availability of a precise process model and a suitable set of measured variables are not enough for adequately solving the DR problem.Few studies present more detailed analyses for settings regarding distinct areas, such as problem formulation [4,28], solution method [12,58,59] and approach [60,61], optimization algorithm [54,[62][63][64], and solving strategies [23].Distinct settings, even those represented by minor differences, may present different performances but commonly are not discussed.Nevertheless, they should be analyzed during the design of a DR application trying to answer whether the optimization problem is being satisfactory solved.Although DR generates benefits, it can have strong negative effects when some hypothesis is violated or the procedure does not find a feasible solution, and using reconciled data could yield worse results than using the measurements directly [65][66][67].
In this work, we aim at providing a numerical assessment on the performance of different settings for solving nonlinear steady-state data reconciliation.We show that different settings have a great impact on the performance of DR procedures and that an adequate setting must be found.Problem formulation, regarding the presence of estimated parameters in the objective function, and thereby solution approach, either sequential or simultaneous, are considered when applying nonlinear programming (NLP) solvers as the solution method.We devote attention to investigate the impact of methods for estimating objective function gradients required by deterministic optimization algorithms, which is generally overlooked in DR studies and applications.In this work, the interior-point (IP) algorithm is applied.The performance of a hybrid algorithm combining the metaheuristic algorithm particle swarm optimization (PSO) and IP is also evaluated and compared.As a known variable affecting the performance of optimization algorithms, we analyze the influence of the initial guess in the solution of the DR problem.The assessment includes evaluations of solution robustness, constraint violation at convergence, and computational speed.In order to better control the results, only simulation data is used in the study and gross error detection is not considered in the scope of this work.The study is based on simulations of a rigorous and validated model of a real offshore oil production system.The process model is available from a reliable proprietary simulator largely used at Petrobras.It is important to emphasize that very few works focus on the use of data reconciliation procedures on oil production systems [2].
The remainder of this article is organized as follows.Section 2 presents a brief overview of data reconciliation in industrial applications.Section 3 describes the offshore oil and water production systems for which the study is based on, and also details the simulation experiments conducted for assessing different settings for solving nonlinear DR.The results are presented and discussed in Section 4, and finally this work is concluded in Section 5.

Problem Formulation
The essence of data reconciliation is that, given the measurements z ms from the plant, we want to estimate the process state, which satisfies the known relationships between process variables z as defined by the (first principles) process model.The actual estimated condition ẑ of the process is obtained as the solution of ẑ := arg min where J DR (•) is the objective function and g(•) is the function vector representing the known inequality process constraints, where knowledge of the parameters θ ∈ R n θ is assumed .The vector of measured process variables z m ∈ R n zm consists of output process variables (y ∈ R n y ), but may also include process inputs (u ∈ R n u ), and is defined by the measurement mapping matrix A m , where A m (j, i) = 1 if variable j is measured and the measured value is located in Depending on the nature of the process model equations, the DR problem may be either linear or nonlinear, and either static or dynamic.As shown in Table 1, most of the applications found are related to steady-state nonlinear data reconciliation.In this context, considering nonlinear steady-state DR problems, process model equations are conveniently represented by An important part of DR problem formulation is the definition of the objective function J DR .It might be based on a deterministic approach following classical regression techniques [28], being defined, for example, as a weighted least-squares estimator (WLS) [110]: where Q is the weighting matrix.
Another approach to formulate the problem and define the objective function is based on a probabilistic framework utilizing the maximum likelihood method, for which the measurements of z m are assumed to contain random errors so that where ε is the vector of measurement errors.The advantage of applying the maximum likelihood method is that it gives efficient (or minimum variance bound) and unbiased estimates, which means that estimates have the lowest theoretically attainable variance, and an expected value equal to the true value of the variable, respectively [111].In this method, the likelihood function (J L ) corresponds to the joint probability distribution of measured variables as a function of the estimated variables/parameters. Thus, assuming measured variables are statistically independent, the objective function is the product of the individual probability density functions p of the measured variables, i.e., where σ 2 i is the variance of variable i.The maximum likelihood estimate is the value for which the likelihood function attains its maximum while satisfying equality and inequality constraints.Nonetheless, both regression and maximum likelihood approaches are closely related when the measurement error is uncorrelated among different measurements and is normally distributed with zero mean and positive definite covariance matrix V z , ε ∼ N(0, V z ).For such a case, the likelihood function in Equation ( 4) may be simplified to a quadratic function, so that the maximization of this simplified function is equivalent to minimizing Equation ( 2) with Q = V z [112,113].Formulations of the DR problem typically follow Equation (1) with a WLS estimator supported by the (implicit) assumption of independent and normally distributed measurement errors, resulting in the following constrained nonlinear program Equation ( 5) is easily extended to a DRPE problem by the inclusion of process parameters θ (or a subset of it) in the set of decision variables.
To deal with deviations from this conventional normal assumption, as is the case when data sets contain gross errors and outliers, gross error detection (GED) schemes have been proposed, such as the simultaneous approach for data reconciliation and gross error detection.This technique is based on robust statistics and consists on finding estimators (i.e., an objective function) whose mathematical nature, especially its influence function, is insensitive to deviations from the ideal assumptions on errors, giving less weight to the contribution of large errors in the estimation problem [4,21].The expected result is a reduced effect of the eventual gross errors present, yielding less biased estimates.However, the use of robust estimators do not guarantee efficient and unbiased estimates since they will still be dependent on the distribution structure of the measurement errors according to maximum likelihood principles.In this context, a generalized robust data reconciliation approach has been proposed [114].
Even with advanced alternatives, the predominant estimator among industrial applications for plant-wise use is the WLS, where weights either follow the maximum likelihood method or are user defined according to some (heuristic) rule following the regression approach.In the former, the statistical framework for DR still requires the knowledge of the error covariance matrix V z , which depends on several aspects including sensor characteristics, amplifier, cable links, and digitizing circuits.Usually the variances σ 2 i are unknown but can be estimated by historical data, assuming V z = diag σ 2 1 , σ 2 2 , . . ., σ 2 n zm .Few cases do not follow the maximum likelihood method when defining the estimator.In the reconciliation phase of the On-line Reconciliation and Optimization (ORO) package described by Bussani et al. [71], DR is achieved by minimizing the weighted sum of squared residuals between the measured and calculated variable values normalized by measured values.In addition, some applications formulate the DRPE problem considering process parameters along with measured variables in the objective function.In an application for the gas separation system of a petroleum refinery, Piccolo et al. [13] included in the objective function estimated efficiencies of both de-ethanizer and debutanizer columns and debutanizer reboiler duty, which were determined in the same DRPE problem as estimated parameters.Similarly, Câmara et al. [67] reported the presence of a load characterization parameter in the objective function for the DRPE of a crude oil distillation unit.

Solution Methods
Different solution methods have been employed in industrial DR applications.The methods are problem-dependent and often exploit some features of the problem in order to achieve fast and efficient calculations.
When the resulting mathematical programming problem is a constrained least squares estimation with either linear or bilinear model constraints, analytical solutions and specific iterative procedures have been used [56,69,74,85].Nonetheless, as more relations apart from overall material balance equations are required for rigorous process models, as well as variable bounds and feasibility constraints, the use of such methodologies is limited.In addition, the development of analytical solutions for linear and bilinear problems in the context of real applications could also be motivated by computational and programming limitations, which has changed remarkably with the increased availability and reduced cost of high performance computers.
In the absence of inequality constraints g(•), the solution of nonlinear DR problems have been achieved by the use of the Lagrange multipliers method [59,70,86], where the partial derivatives for the Lagrangian of the DR problem are set to zero and the resulting system of equations is solved by any simultaneous equation solver, or by the method of successive linearization [12,58,59,76].In the later, linear approximations to the nonlinear constraints are obtained by a Taylor series expansion about the previous iterate and a series of linear data reconciliation problems is solved until a solution satisfying the nonlinear constraints is obtained.Each intermediate solution is the optimal point for the linear constraints.Successive linearization provides relative simplicity and fast calculation, although it cannot handle variable bounds and it may fail to converge [36,58].
The most widely used method for solving nonlinear DR problems, such as that of Equation ( 5), consists of the application of well established, general purpose NLP algorithms.Indeed, several works have demonstrated that using NLP instead of successive linearization remarkably improved reconciliation results [115,116], so that nonlinear programming techniques have emerged as the most efficient.This method allows a general nonlinear objective function, such as robust estimators, and can explicitly handle inequality constraints.Inequality constraints, such as variable bounds, are generally required in the DR model to guarantee feasible estimates, e.g., nonnegative estimates for flows or compositions.Furthermore, in addition to steady-state DR, methods that apply NLP solvers are also used in the solution of dynamic optimization problems, as those resulted from nonlinear dynamic DR [15,83].Thus, the NLP method is able to tackle a complete formulation of the data reconciliation.
Islam et al. [58] tested the three methods in a data reconciliation package for an industrial pyrolysis reactor.However, the discussion is limited to the influence of the initial guess in each one.It was observed that initial starting values far from the steady state values caused the successive linearization method to fail, converging to incorrect steady-state values.The NLP method employing successive quadratic programming (SQP), in turn, required a large computational time to get a comparable result but worked well in all cases with upper and lower bounds for all the reconciled variables.The use of SQP in the absence of variable bounds was not reported.Weiss et al. [12] have also evaluated these three methods of solving data reconciliation on an industrial pyrolysis reactor.The authors have observed similar results among the methods, which provided encouragement to the use of successive linearization since the large computational time required by the nonlinear method could not be justified.However, similar results among these methods is not observed in other works.In the report of Christiansen et al. [76] in applying DR to industrial synthesis gas preparation, successive linearization was applied only when the basic material balances were considered.For the use of a more rigorous model including the energy balance, as well as if it is desired to investigate the plant over a period of time, the NLP method was used employing a SQP algorithm with bounds on the independent variables.
Both the Lagrange multiplier and NLP methods were evaluated by Eksteen et al. [86] in the DR for an open arc smelting furnace.Systematic biases have been observed in the results, being larger when the Lagrange multiplier method was used.However, this comparison is not fair enough since the methods were not evaluated for the same DR formulation, where the Lagrange multiplier method solved the minimization of the variance weighted closure residuals of the mass balance equations, and the NLP method solved the minimization of the weighted sum of squared errors.In other work, the Lagrange multipliers and the successive linearization methods were employed for solving nonlinear DR on a coking plant based on a components balance and total flow rates balance [59].It is argued that successive linearization has the advantage of relative simplicity and fast calculation and that, in general, the Lagrange multipliers method tends more frequently to shift the reconciled values in a biased direction.
Even with the increased availability of high performance computers, dealing with the large number of decision variables remains a challenge when using NLP in industrial applications, as well as the nonlinear and non-convex nature of the objective function, process model and other constraints.It has been observed that some NLP algorithms become inefficient when the number of degrees of freedom are large [117,118], and the dimension of the DR problem for large-scale process models can preclude regular optimization techniques from correctly solving the problem [23].As a result, many authors have proposed either decomposition approaches as a solution strategy to overcome this difficulty when employing standard NLP solvers or optimization algorithms tailored for process applications with many degrees of freedom.

Solution Strategies
The most straightforward strategy for solving the nonlinear DR problem is to use nonlinear programming to estimate parameters and reconciled variables simultaneously.However, when the dimension of the problem significantly impairs the use of standard NLP algorithms, decomposition approaches may be used, which divide the original problem in multiple estimation problems of reduced dimensions.
Rod and Hančil [119] proposed an iterative algorithm that separately evaluates parameters and independent variables.Reilly and Patino-Leal [120] used a method in which the data reconciliation is nested within the parameter estimation step.Doví and Paladino [121] presented a constrained variation algorithm in which the dependent variables are eliminated by solving the model equations and thus eliminating the necessity of simultaneously iterating both parameters and true values of the measured variables but requires the second derivatives of the model.Kim et al. [16] compared three strategies for the nonlinear EVM problem, namely, simultaneous parameter estimation and data reconciliation, two-stage EVM, and nested EVM, in terms of computer time and converged parameters and reconciled variable values for test problems from the literature.When evaluating these strategies, a generalized reduced gradient (GRG) and a SQP algorithm were used.As observed in their results, the two-stage nonlinear EVM outperformed the other alternatives and was reported as the only recommended solution strategy.
Regarding industrial DR applications, the predominant strategy is the direct solution of only one nonlinear DR problem, where parameters are estimated along with reconciled variables (DRPE/EVM) when present.Quelhas et al. [66], however, brought attention to the fact that the estimation problem must be carefully designed.Dealing with a two-step RTO applied to an industrial ethylene production plant, it is shown that the typical trajectories of the estimated heat transfer coefficients indicate that some of the model parameters are not estimable when the parameter estimation is performed independently and simultaneously with the data reconciliation procedure, which might be due to correlation with other parameters and process data or because they do not affect the objection function significantly.Other strategies are briefly discussed below.
By establishing that the complexity of large dimension optimization problems can be reduced by performing a partition of the criterion to be optimized, Hodouin and Everell [1] developed a strategy in which the original problem is partitioned into subsystems that are not functions of the same variables.The decoupling is performed such that the subsystems are easy to solve.
In the on-line optimization tool described by Zhang et al. [10], combined gross error detection and data reconciliation, and parameter estimation were achieved separately in a dedicated NLP for each function, however, no reasoning was provided to support this choice.The tool was built using a flowsheeting design program (Aspen Plus), with optimization and user-defined FORTRAN subroutine capabilities for process optimization and parameter estimation.Data reconciliation and gross error detection followed the contaminated-Gaussian estimator [19] implemented in GAMS/MINOS.
Weiss et al. [12] applied a two-stage non-linear EVM, separating the parameter estimation in an outer loop and data reconciliation in an inner loop using the method described by [122].Besides, successive linearization was used to solve the inner loop for data reconciliation.
Lee et al. [81] proposed an on-line data reconciliation and optimization for an industrial utility plant based on a hierarchical decomposition approach, similar to the strategy applied by [1].The strategy decomposes a large and complex system into several subsystems, where it is argued that handling small size subsystems instead of an original large one provides a reduction of computation time, a more robust solution, simplicity of problem formulation and easy maintenance.At an upper level, variables related to nonlinear constraints or shared between two or more subsystems are determined by NLP algorithms.When the values of these variables are determined at the upper level, they become pre-specified parameters for the subsystems at the lower level.Among other results, Lee et al. show that the decomposition strategy achieved an objective function value 37% lower than SQP spending only 5.5% of SQP's computation time when considering averaged results of data reconciliation for 20 datasets of the steam distribution subsystem.
Faber et al. [123] proposed a nested three-stage computation framework, where the upper stage is a NLP for parameter estimation, the middle stage consists of multiple sub-NLPs in which the independent variables of each individual data set are estimated, and in the lower stage the dependent variables are evaluated through a simulation step.It is argued that only the gradients of the dependent variables to the parameters are required.By using the optimality condition in the middle stage, it is argued that second order derivatives are not required in the upper stage, thus significantly reducing the computational cost.This framework was later validated with data from a pilot plant established for both absorption and desorption studies on the industrial coke-oven-gas purification process [23,95,124].

Optimization Algorithms for Nonlinear Programming
Various NLP algorithms have been used to solve the optimization problem defined by Equation ( 5) either in real industrial applications or in DR studies involving real plant data.These include deterministic, gradient-based methods such as Gauss-Newton or Gauss-Marquardt (e.g., [61,89,100,101]), reduced-gradient methods [10,13,28,51,72,86], and successive quadratic programming, which is the most used among industrial DR works, as well as derivative-free, metaheuristic random search algorithms, such as particle swarm optimization (PSO) [18,64].Applicable deterministic solvers for large-scale nonlinear optimization problems are available as commercial and noncommercial software [48,[125][126][127], and generally incorporate second-derivative information from the optimization model, exploit the sparsity of the KKT matrix and problem structure, deal efficiently with large sets of active constraints, and handle dependent constraints and negative curvature.
Augmented Lagrangian methods have been adapted to large-scale problems including, for instance, the use of second-order information and a bound-constrained trust region method to deal with negative curvature, as is the case of LANCELOT [128].Reduced-gradient methods, such as MINOS [129] and CONOPT [130], approximate second-order information using dense quasi-Newton updates in the reduced space.However, the computational effort grows polynomially with problem size and active set selection is a combinatorial process that may be expensive, especially on degenerate problems [63].The technique of successively solving a series of quadratic programming subproblems (i.e., SQP) has proven itself to be suitable for solving large-scale optimization problems.It involves defining a quadratic program (QP) at each iteration based on a quadratic approximation of the Lagrange function and a linear approximation of the constraints of the original nonlinear program.This leads to a search direction and a line-search step size, which determines the next iterate [131].SQP has been widely used in the process industry [132], but the identification of the active set of inequality constraints may impose serious difficulties during the solution of the QPs in the presence of many inequality constraints, since it can increase exponentially with increasing problem size.Besides, the large number of degrees of freedom in the DR problem can make the SQP decomposition strategy less suited.
Extensions of this technique to deal with many degrees of freedom have been developed [117,118,[133][134][135], exploiting the natural problem structure and the decomposition of the full-space to dense reduced-space quadratic programming problems.Examples of full space barrier (or interior point) techniques are SQPIP [53] and IPOPT [63].SQPIP is a sparse full-space interior point SQP that involves three major tasks: the QP subproblem, a line search procedure, and an approximation method of the Hessian of the Lagrangian, where an important advantage is the guarantee concerning the feasibility of inequality constraints.IPOPT, in turn, is a primal-dual interior point algorithm that incorporates second-order information and a filter line-search strategy that ensures the convergence of the barrier problem.Complete descriptions and a comprehensive analysis of interior point methods can be found in the literature, e.g., [136].
While these methods have been used in industrial applications and commercial optimization softwares do exist, highly nonlinear systems are not easy to solve and often lead to convergence problems.In addition, these are all local methods that offer no assurance that the global minimum in the optimization problem has been found, since the nonlinear optimization problem is generally non-convex.Deterministic global optimization algorithms, which can provide a guarantee that the global optimum within some specified search domain is found, have also been proposed to deal with the possible existence of multiple local minima [137,138].Nonetheless, its use in industrial applications has not been reported to the best of our knowledge.
Optimization algorithms must be efficient and robust, being able to avoid local minima and to handle poor starting points, badly scaled problems and (some) inaccuracies in the gradients, while maintaining a viable computation time [81,139].Indeed, real applications of DR show that factors such as scaling, starting points, sparsity patterns, and thermodynamic approximations are important when solving NLP problems [99,140].As pointed out by Kelly [62], since the ways to determine the next iteration's adjustments to the decision variables are different among NLP algorithms, there is significant value in including several independent solvers when solving industrial problems, so that if one method fails to converge or finds an unreasonable solution then another method can be used in the hopes of finding a better solution.Considering that each algorithm has its own set of advantages and disadvantages, and that there is no optimization algorithm able to perform better than any other for any class of problems [141,142], very few works report comparisons among NLP algorithms in industrial applications.Actually, few works address in detail the numerical issues that arise when designing and applying data reconciliation in real systems.
An exception is the work of Pierucci et al. [61], which presents a test between a SQP, a modification of Levenberg-Marquardt (LM), and a Newton based algorithm (NB).The authors report that SQP and LM had very similar behavior, whereas NB was both slower and less efficient.Another exception is due to Kyriakopoulou and Kalitventzeff [53] based on the frame of the VALI (Belsim) general purpose data validation software.The algorithms tested were the SQPIP, a Lagrangian formulation solved by a dogleg method (SOLDOG), and a dense matrix SQP (SQPHP).Fewer Jacobian and residuals evaluations are demanded by the SQPIP and SQPHP with respect to SOLDOG, whereas SOLDOG is faster than both of them since it requires less total factorizations than in SQPIP and the system of the optimality conditions is more sparse than in SQPHP.Sarabia et al. [54], in turn, applied a SQP (NAG library) and an interior point method (IPOPT-GAMS) but aimed at comparing the sequential and simultaneous solution approaches.The authors only reported that the CPU time to solve the reconciliation problem was up to 10 min in the sequential approach with SQP and 1 min in the simultaneous approach with IPOPT.
For many real applications, the process model is provided by a black-box simulation package, thus not allowing the application of techniques aimed at numerical improvements, such as the one proposed by [143] and applied by Lid and Skogestad [99].Industrial DR applications are typically based on simulation platforms to which optimization solvers have been interfaced and where the models themselves are constructed and calculated through numerical procedures instead of through an open declarative language.In this context, most of these platforms do not provide exact derivatives to the optimization solver, and gradients are often approximated through finite difference [144].Calculation results will thus consequently depend on the technique used for gradient estimation, where the ideal size of the finite-difference step may depend on the application.Nevertheless, such an effect is rarely mentioned so that it is not possible to state unequivocally that data reconciliation procedures are being used successfully in actual industrial environments to improve the operation of large plants [89,100].
When applying the finite-difference derivative approximation is not appropriate or costly or when automatic differentiation techniques cannot be applied, conventional NLP algorithms may not be suitable.In such cases, derivative-free methods that do not rely on the derivative information of the objective function or constraints should be considered [145].In this context, metaheuristic optimization methods, such as the simulated annealing and the genetic algorithm, can be used to overcome the difficulties for obtaining the derivatives.These methods also present some additional advantages, including the ability to avoid local minima, thus featuring a global search character, and simplicity of implementation.Even though these stochastic methods still provide no guarantee that the global optimum has been found, they are a step towards its discovery [138].Besides, the computed values of the objective function can be used for rigorous statistical analyses of the confidence regions of the estimates, which can also constitute an important benefit of these algorithms [146].The PSO algorithm, specifically, can be used effectively for optimization of complex optimization problems due to its capability to handle multimodal, nonlinear and discontinuous objective functions along with large-scale systems, as well as its much smaller computational effort and simpler implementation, when compared to other metaheuristic procedures such as simulated annealing and the genetic algorithm.
Prata et al. [64] presented a procedure for nonlinear dynamic DRPE and its application to an industrial polypropylene reactor, where the technique was developed and validated in real time.The performance of the derivative-free, metaheuristic PSO algorithm and a standard Gauss-Newton (GN) were compared when solving the nonlinear dynamic DRPE on-line.Although the results show that the PSO method is slower than GN, which did not preclude the use of PSO on-line, it was observed that GN leads to worse final objective function values for the cases when large differences are obtained between the algorithms.This was attributed to the convergence to local minima.Besides, it was shown that non-deterministic optimization algorithms might be used successfully in real time applications, as they can provide reliable information without any significant rate of failure.Based on the previous work, Prata et al. [18] presented a robust procedure to perform the joint nonlinear dynamic DRPE problem on-line with simultaneous detection of gross errors.Using robust estimators, the work is the first to apply the PSO technique for the simultaneous solution of the DRPE problem and detection of gross errors in a real industrial system.

Solution Approach
Methods that apply NLP solvers can be separated into two groups: sequential and simultaneous strategies.The difference between them mainly depends on the way the process model constraints are considered.The sequential approach is comprised of an iterative two-stage approach, which solves the system of equations as an inner loop to evaluate the objective function, while an optimization algorithm is the outer loop.As the model convergence problem is solved at each optimization step, it belongs to the class of "feasible path" methods.The simultaneous approach includes the convergence problem inside the optimization problem, belonging to the "infeasible path" approaches.
These classes are a major concern when it comes to solving dynamic optimization problems.For example, the approach for solving dynamic optimization problems applying collocation on the finite elements method, which convert the system of equations into algebraic ones and solve them simultaneously with the optimization problem, follows the simultaneous approach [147].
Liebman et al. [45] also used the successive quadratic programming method with collocation on finite elements and a moving horizon approach to solve the resulting optimization problem.Even though applying this NLP approach in real applications is somewhat limited due to the modeling of the system with explicit differential and/or algebraic equations, it has been applied to industrial systems [15,83].Safdarnejad et al. [148] shows a comparison between various initialization strategies for optimization of dynamic systems using collocation on finite elements.These strategies include a warm start from a prior solution achieved by structural decomposition, a simultaneous, and a sequential approach, which were demonstrated for an industrial CO 2 removing system integrated with grid-scale power generation units including coal, gas, and wind power units, resulting in large system with 768 decision variables.A comparison of the computational time between IPOPT, active-set APOPT [149], and SNOPT [131] algorithms were also made.The APOPT algorithm was also used in the study of Safdarnejad et al. [150], solving dynamic data reconciliation and optimization of laboratory-scale batch distillation columns in which collocation on finite elements was used to convert the system of equations into algebraic ones.More details on solution approaches for dynamic systems can be found elsewhere [151][152][153].
The choice between the two classes of approaches may depend both on the size of the problem and on the difficulty to solve the convergence problem.An approximated rule suggested by Pierucci et al. [61] indicates when a feasible path is preferable, which is associated with the number of iterations required to solve the static process model, including the iterations required to solve the model during the perturbation phase for gradients evaluations.Bussani et al. [71] argue that the choice between feasible and infeasible path methods depends upon both the ratio between the decision variable number and the iteration variable number, and the overall convergence speed.
In the work of Eksteen et al. [86], both the simultaneous (using the Lagrange multiplier method) and sequential (using the generalized reduced gradient -GRG) approaches were evaluated.Systematic biases were identified through traditional statistical methods and, in general, appeared to be larger when the simultaneous approach was used.Faber et al. [123] suggested solving the estimation of parameters and the data reconciliation problem for large-scale nonlinear models with a sequential approach, which has been applied to the DR of a pilot-scale coke-oven-gas purification process [23,95,124].Martínez-Maradiaga et al. [154] applied a similar method, although considering only the data reconciliation problem for a test bench single-effect ammonia-water absorption chiller.
In a work involving hydrogen networks in an oil refinery, Sarabia et al. [54] implemented both simultaneous and sequential approaches.The former was based on the interior point algorithm IPOPT-GAMS, where some enhancements were made in order to avoid feasibility problems, while the latter consisted of a model simulated in the EcosimPro environment linked to an SQP algorithm from the NAG library.As formerly discussed, it was reported that the CPU time necessary to solve the reconciliation problem was 1 min in the simultaneous approach in GAMS and up to 10 min in the sequential approach with the NAG libraries.Unfortunately, no additional comments were provided.
As can be seen from previous studies, there is neither a detailed assessment of solution approaches nor a consensus regarding which is better under what conditions.

Process Description
The underlying system of this study represents an oil production facility, which extracts oil and gas from the reservoir to the refinery.It is based on an oil process unit from Petrobras.The primary treatment is comprised of two production trains and one test train, as depicted in Figure 1.Each train is equipped with a manifold, a heat exchanger and a three-phase gravitational separator.The output streams of oil leaving each three-phase separator are joined together and feed an atmospheric separator.During actual operating, manifolds A and B receive the mixture arising from the production wells, while the test manifold, which is responsible for acquiring a set of process measurements, runs with a single well at a time.In the three-phase separator, most of the gas is separated from the oil and water, whereas in the atmospheric separator the residual gas, oil and water are all separated to the greatest extent possible.
The process plant was modeled using a process simulator software developed by Petrobras and validated along years of production operation.In addition, the model incorporates the heat exchanger model from the HTRI Xchanger software (Heat Transfer Research, Inc., Navasota, TX, USA).The system's model has a total of 24 process variables, all of them measured, along with five parameters, as shown in Table 2, where types 1 and 2 represent reconciled input variables and parameters, respectively, and type 3 indicates measured output variables.
A simulated steady-state point representing the daily average of typical process operation in the real system have provided the set of true values (z * ) and was used for generating the set of measured values (z ms ).Simulation was performed on a daily basis so as to better resemble the implementation in a real system, in which it is desirable to eliminate the effect of variations in operating wells and minimize the effect of residual inventories.In this context, data reconciliation is carried out for a complete day of operation regardless of how frequently measured data is available.The measurement samples were obtained with a normal Gaussian distribution random number generator assuming a variable specific standard deviation and a mean equal to the true value.The coefficient of variation (CV, also known as relative standard deviation) of the variables is selected to simulate the estimated accuracy of the instrument taking the reading of each variable and are presented in Table 2. Five measurement sets for each variable were created, representing different samples of the same steady-state.The cumulative distribution function of the variability around the true values for each sample set is presented in Figure 2.

Case Studies
Based on the works reporting industrial application of DR, we have set up a series of simulation experiments on steady-state data reconciliation adopting different configurations.Each configuration setting is comprised of a solution approach of the DR problem, objective function (estimator), optimization algorithm, technique for gradient approximation, and initial guess, as indicated in Table 3. Configuration test procedures consisted of solving the data reconciliation problem for each of the five sample sets and analyzing the performance metrics.The data reconciliation procedure has been implemented in MATLAB [155], where the script manages the function calls to the optimization algorithm and the process model.The model is available as a dynamic-link library (dll) and, as a result, it is not possible to access the model source-code, inspect the equations implemented, edit and/or implement any sort of strategy to improve its numerical properties, such as scaling variables and equations.Calculations were performed on a 64 bit Windows 7 system with Intel Core i7 CPU 2.0 GHz and 6 GB RAM.At first, the data reconciliation was solved with the MATLAB fmincon routine using the IP algorithm.The fmincon routine is intended for solving constrained nonlinear multivariable functions and is available in the MATLAB Optimization Toolbox.Regarding solution approach, the mathematical programming problem was solved by both sequential and simultaneous approaches.In the former, process simulation and minimization of the objective function follows a two stage approach, in which the first and decoupled simulation stage calculates the state variables ensuring that model constraints are always satisfied.By doing this, only the objective function and the inequality constraints are by the optimization algorithm.In the latter, the optimization algorithm handles the objective function and the constraints, including the process model equality constraints.In practice, the simultaneous approach was achieved by passing the process model equality constraints as the argument defined by the nonlcon parameter of the fmincon routine.
In general, measurement values are used as initial estimations for the measured process variables [36].In spite of this, given the influence of the initial guess in the performance of optimization techniques, four different initial guesses were used to test the performance of the overall DR procedure.
In addition, fmincon admits an option to set the type of finite difference used to approximate gradients, which are either forward (the default), or centered.Both types were tested when running DR experiments.
It must be noted that for all simulation experiments the exit condition of fmincon was the state where changes in decision variables and maximum constraint violation were less than the defined tolerances, unless otherwise stated.DR simulations used the default values for optimization options from fmincon except for those listed in Table 4. Previous simulations using default values for all options presented significant differences regarding convergence and motivated these changes.
A hybrid optimization algorithm was then used in the refined settings, combining the deterministic IP technique with the metaheuristic PSO algorithm.They were combined such that the procedure started with the PSO for a few iterations is finished with the IP technique, where the solution given by the PSO is transmitted as the initial guess of the IP.By doing this, we intend to improve solution robustness based on the global search character of the PSO, and subsequently improve convergence speed with the IP.Parameters for the PSO algorithm are listed in Table 5.The problem of data rectification is posed as a maximum likelihood problem, where the probability of the estimated plant state is maximized given the measurement set.In this context, the appropriate estimator considering the zero mean, independent and normally distributed measurement errors is the WLS.The problem of data reconciliation with simultaneous parameter estimation (DRPE) is then stated following Equation ( 5): {ẑ, θ} := arg min z m ,θ where (z L m , z U m ) and (θ L , θ U ) are the lower and upper bounds of the reconciled and estimated variables, and the parameters, respectively.Notwithstanding this formulation, we have modified the WLS estimator from Equation ( 6) by the introduction of parameters θ (i.e., non-measured variables) in the objective function, thus resulting in a modified WLS, non-maximum-likelihood estimator as follows: where V θ is the weighting matrix for parameters and θ ms are reference values for estimated parameters.Therefore, both WLS and modified WLS estimators were analyzed.The quality of the results obtained by the DRPE procedure using different configuration settings was quantified by analyzing the solution robustness, constraint violation at convergence, and computational speed.The main performance metric regarding solution robustness is based on the statistical theory of the maximum likelihood method.Since the measurements are normally distributed about their true values and no gross errors are present, it is imperative that the objective function follows a Chi-square distribution χ 2 at the minimum.Based on this, we propose the use of the global (or collective) test [120] not to detect gross errors, but rather to indicate the satisfactory solution of the minimization problem.In this test, the magnitude of the objective function is compared to the probability limit of the Chi-square distribution for the degrees of freedom (DF) corresponding Looking at both charts of Figure 3 (especially chart A), the objective function is always greater than the Chi-square value, which suggests the existence of a gross error in the data according to the global test.This is not the case when the problem is initialized with true values, however.In this context, we propose the use of the global test to detect inconsistencies in the formulation and in the solution of the problem.Since the data do not contain gross errors, the fact that three of the initialization cases present higher final objective function values than the Chi-square value is related to a local solution of the optimization problem, which depends on the optimization algorithm.Besides, two additional factors arise, which are the initial guess and finite difference technique applied for gradient estimation.
As expected, Figure 3A shows that for forward finite differences different initial guesses resulted in different numerical performances of the optimization algorithm, where using true values as the initial guess provided the best results.Besides, the use of true values + 10% has given the worst results, reflected by the high values of J DRPE .
As pointed out by Schladt and Hu [97], the initial values of the measurable and unmeasurable variables are significant for the performance of the optimization problem, where the nearer the variables are to their optimal values, the faster and more reliable the optimization run.Nonetheless, depending on the optimization algorithm, the authors also state that a very close starting point of the optimization to the optimum might not converge to the optimum, due to the small gradients.Such cases might be treated by superimposing the initial values by random noise.An alternative method is proposed in [62], where the measurement values are used as initial values for the measurable variables, whereas the starting values for the unmeasurable variables are approximated iteratively.Although the use of measured values yielded reasonable results relative to the alternatives tested, which is the most common practice among real applications, they still reflect a local solution of the optimization problem.
It is interesting to note in Figure 3B, however, that simply changing the type of finite difference from forward to central dramatically improved the results for most of the cases.Detailed results on final objective function values are presented in Table 7.This is surprising, as it is generally assumed that the availability of good initial guesses is sufficient to guarantee good performance of optimization techniques [81], although results show that this is not always true.Indeed, it seems that the work of Prata et al. [101] is the only one to mention something about the type of finite difference, in which the central finite difference is used.
Dave et al. [156] reported that the SQP-NPSOL was very sensitive to the value of the perturbation given to the variables for gradient calculation, where accurate gradient determination required the use of a small perturbation.It was reported that the existence of a trade-off, in which too small values led to the simulator becoming insensitive to the changes and for large values of the perturbation, on the other hand, the gradients become inaccurate.The value of 5.00 × 10 −3 times the value of the variables was found to give reasonably accurate and stable gradients.However, no additional comments regarding the type of finite differences were provided.
One could expect that such an improvement would not be possible without an increase in the computational cost.As it can be seen in Table 8, it was not the case for half of the initial guesses tested.Similarly, Table 9 also shows improvement in the computational cost when considering the averages of different samples.Even for those cases that presented higher solving times, the computational cost is far from being prohibitive.
Recalling that results are due to the same stationary operating point, it is clear that the numerical precision of gradient estimation may not allow an adequate performance of the IP algorithm in solving the DRPE problem (in addition to initial guess, presence of gross errors and model mismatch), which will provide results out of acceptable statistical limits.In these cases, results might be surprisingly dependent on the discretization scheme used for gradient estimation, especially when the derivatives are computed numerically, which constitutes a real and common problem when the process model is not accessible by the user, as in the present case.This is evident from the results for both objective function values and decision variables of the optimization problem (data not shown).Regarding the later, acceptable results for all decision variables were only obtained for settings 2b, 2c and 2f.
Efficient and robust optimization algorithms must be able to handle not only poor starting points, but also badly scaled problems and some inaccuracies in the gradients [139].Using Aspenplus with an SQP, Piccolo et al. [13] reported that the Aspenplus default values were found to be adequate for simple problems, but in complex cases where the system is very sensitive or highly nonlinear, certain problem specific adjustments may have to be made.Unfortunately, no additional comments on such adjustments were provided.From this perspective, the numerical precision of gradient estimation could be a potential factor to be investigated.In the case of an unsatisfactory DRPE solution (or a detected global gross error), the result of the optimization problem must be at least marked as not trustworthy.In the study of Schladt and Hu [97], for example, the approach applied is to discard the result in case a global gross error is detected.Another less severe approach would be repeating the DR(PE) until it reaches an error free condition.

Problem Formulation and Optimization Algorithm
Given the more realistic use of measured values as an initial guess, we continue on testing settings in order to improve the solution of the DRPE problem for this case.In this section the tests regarding two different estimators are presented, since the problem formulation is one of the possible sources of biased results.
Results shown in Table 10 were obtained for settings 2a and 3, as described in Table 3. Regarding the final objective function value, results for problem formulation indicate a greater difficulty of the optimization algorithm to find an adequate solution.In spite of having smaller computational cost, samples 1, 2, and 5 presented higher objective function values.Interestingly, however, the performance of DRPE for sample 3 has been further improved.Figure 2 shows that sample 3 has the smaller variability among all samples, which could be helping the achievement of good results.
When the values of the final objective function are analyzed in conjunction with the metric regarding the confidence region of reconciled values, results get even more interesting.While there has been an increase in the final objective function value for sample 1, Table 11 shows that fewer variables were out of the confidence region of 99.7%.A similar result is observed for sample 2, where an increase of 116% in the final objective function value did not affected the quality of reconciled values.In this context, samples 3, 4, and 5 showed regular behavior, except for the decrease of 11% in the objective function value of sample 4, being responsible for such a great improvement in the quality of reconciled values.
Based on the above discussion, it is evident the effect of estimators on DRPE results.The inclusion of non-measured variables among objective function variables has induced the occurrence of bias in the estimated variables/parameters. What is not clear from the presented results is why the maximum likelihood estimator is still presenting variables outside their confidence limits.To answer this question, we devote attention to the optimization algorithm.Table 12 presents a comparison of settings 3 and 4, which only differ on the optimization algorithm employed.In setting 4, the minimization procedure started with the PSO algorithm in order to explore the search space, taking advantage of its global search character.After a few iterations, the solution given by the PSO is transmitted as the initial guess of the IP, which is expected to further refine the solution and improve convergence speed.It can be seen that the use of the hybrid algorithm presented improved results, where final objective function values are within the confidence region for all samples.As also indicated by the metric regarding the confidence region of reconciled values, all variables are within the confidence region of 99.7%, showing the DRPE solution is efficient and robust.Figure 4 provide detailed results concerning the cumulative distribution function of the variability around the true values for each sample set, where it can be seen that setting 4 significantly improved the accuracy of the reconciled variables.Even though these results are achieved at a higher total computational cost, the PSO contribution is constant and predictable, which helps controlling this variable.The IP contribution to computational cost, in turn, is very similar to other DRPE experiments shown before.

Conclusions
Data reconciliation is already an essential tool for many industries and many industrial applications are described in the literature, where the majority of works are related to steady-state nonlinear data reconciliation.In spite of this, very few of them evaluate important numerical aspects of this problem.Distinct settings, even those represented by minor differences, may present different performances and should be analyzed during the design of a DR application trying to answer whether the optimization problem is being satisfactory solved.Although the parameter estimation is intended mainly for off-line applications, as generally performed during empirical process modeling, data reconciliation and model updating is an important step in real-time optimization.
In this work, we aimed at providing a numerical assessment of the performance of different settings for solving nonlinear steady-state data reconciliation.The proposed assessment is based on the operation of a real offshore oil production process, which presents a highly complex and nonlinear behavior.We have discussed some components that take part in an industrial application of data reconciliation and showed how they may interfere in the appropriate solution of the resulting mathematical programming problem.
It was shown that the numerical precision on gradient estimation involved in derivative-based optimization algorithms affects the DRPE solution, which does not satisfy the statistical assumptions.It was also shown that the solution approach may greatly affect the quality of the DRPE solution as evidenced by infeasible solutions.The solution of the DRPE problem using an interior point algorithm by means of the simultaneous approach (i.e., Lagrangian relaxation method) provided results that violate the problem's constraints, which is due to the activation of some process constraints and the finite precision of numerical methods.Greater quality and more reliable results have been obtained when stochastic implementations were combined with a deterministic algorithm in hybrid procedures.The hybrid algorithm applied in this work consisted of initiating the minimization with the random search algorithm PSO, taking advantage of its global search character so as to explore the search space, and subsequently apply the deterministic algorithm IP, which is expected to further refine the solution and improve convergence speed.Even though such a combination of algorithms have been already proposed [160][161][162], the quantitative assessment in data reconciliation applications is almost absent.The hybrid procedure combined with the sequential approach was able to provide robust and feasible solutions for all samples analyzed, where both final objective function values and reconciled values were within the confidence region.In other words, it is shown that the optimization problem was satisfactorily solved.
Further extensions of this current application include using the reconciliation procedure with real industrial data in conjunction with a fault/gross error detection strategy and the possible reformulation of the objective function.

FFigure 1 .
Figure 1.Simplified diagram of the oil processing plant analyzed.

Figure 2 .
Figure 2. Cumulative distribution function of the variability around the true values for each sample set.

Figure 3 .
Figure 3. Final objective function values after solving the DR problem as a function of the initial guess.(A) Forward finite difference; (B) Central finite difference.
of the variability around the true values for each sample set for the DRPE res

Figure 4 .Figure 5 .
Figure 4. Cumulative distribution function of the variability around the true values for each sample set for the DRPE results of settings 3 and 4.

Table 1 .
Real industrial applications of data reconciliation techniques or data reconciliation studies involving real plant data.

Table 2 .
Information regarding process model variables and parameters.

Table 3 .
Configuration settings of simulation experiments.

Table 4 .
Non-default values of optimization options used in fmincon routine in DR runs.

Table 5 .
Parameter values of PSO algorithm.

Table 7 .
Final objective function values after solving the DR problem as a function of the initial guess and type of finite difference.

Table 8 .
Elapsed time (min) for solving the DR problem as a function of the initial guess and type of finite difference.

Table 9 .
Averaged results of objective function values and computational cost for different samples according to the type of finite difference.

Table 10 .
Objective function values and computational cost as a function of problem formulation.

Table 11 .
Number of variables whose reconciled value is out of the confidence region of 99.7%.