1. Introduction
The increasing amount of CO2 and other greenhouse gas (GHG) emissions caused by anthropogenic activities, mostly the combustion of fossil fuels, has been shown to be responsible for global warming [1,2]. Net-zero emission plans anticipate that by 2050 oil, coal, natural gas, and other fossil fuels will still account for around 20% of energy production [3]. Fossil fuel power plants are now integrating carbon capture and storage technologies to address the intermittency of weather-dependent renewable energy sources. This initiative aims to enhance the reliability of the energy system by providing alternative power sources during periods of instability in renewable energy supply, thereby facilitating a smooth energy transition [4]. Monoethanolamine (MEA)-based post-combustion carbon capture (PCC) technology has evolved into a mature and widely implemented solution for mitigating greenhouse gas emissions from fossil fuel power plants and various industrial processes [5,6]. This method aims to efficiently absorb CO2 from exhaust gases, preventing its release into the atmosphere and facilitating its secure storage downstream [7].
Steady-state and dynamic models for PCC absorption columns are well established [8,9], but most works emphasize temperature profiles, and estimation of CO2 concentration profiles remains limited [10]. The availability of accurate CO2 profiles enables optimal MEA circulation-rate selection, improving capture efficiency and reducing energy use, and provides key states for controller design and safe operation [11].
Gas absorption processes frequently face missing CO2 measurements because gas analyzers sample locations sequentially and are subject to delays; gaps in the data degrade training quality and model generalization. Conversely, high-dimensional plant data introduce redundancy that obscures the most informative variables. Robust estimation therefore benefits from (i) principled handling of missing data and (ii) dimensionality reduction and noise filtering. Soft sensing is a mature approach for estimating unmeasured or asynchronously measured states [12] using mechanistic or data-driven models and has strong potential for use in PCC processes.
We consider the development of a soft sensor for PCC absorber columns using a hybrid framework. First, we develop a simplified mechanistic model of the absorber. Second, we design a moving-horizon estimator (MHE) that uses plant measurements to reconstruct continuous, physically consistent concentration profiles despite missing measurements. Third, these MHE estimates provide labels for supervised training of an end-to-end SDAE–GRU soft sensor that learns compact features and temporal dynamics. Finally, we combine mechanistic and data-driven predictions within a unified hybrid predictor to forecast profiles across the column.
1.1. Related Work
Accurate real-time monitoring starts with a dynamic model that reflects the process physics. Mechanistic models are attractive because they encode conservation laws, thermodynamics, and kinetics. They can predict reliably across operating regimes, but their complexity can hinder control and optimization. On the mechanistic side, Morgan et al. developed a thermodynamic framework for VLE, enthalpy, and solution chemistry; they regressed e–NRTL parameters in Aspen Plus® to VLE, heat-capacity, and heat-of-absorption data and reduced complexity using information-theoretic criteria [13]. Putta et al. used a two-dimensional, rate-based absorber model to study how choices in thermodynamics (VLE, Henry’s law constant), reaction-kinetic correlations, and CO2 diffusivity affect predicted capture performance [14].
Data-driven models, especially neural networks, are faster to develop and cheaper to evaluate. Their main limitation is distribution shift: performance degrades when conditions depart from the training set. A hybrid strategy mitigates these trade-offs. It combines physics (for extrapolation and interpretability) with learning (for flexibility and speed). For example, Xing et al. coupled extended adaptive hybrid functions with a multiphysics model for data-driven, multi-objective optimization and techno-economics of the GDE-based
process [
15]. Caprio et al. benchmarked hybrid regressors (ridge, decision-tree, SVM, and feed-forward ANN) on CO2 capture spray columns [16]. Tian et al. constructed performance prediction models for a carbon capture experimental system embedded with physical constraints, using three machine learning algorithms: Random Forest (RF), Back Propagation Neural Network (BPNN), and Convolutional Neural Network (CNN) [17]. Jiang et al. proposed a model named ICEEMDAN-Inception-Transformer to explore the relationship between power data and carbon emissions, providing precise hourly carbon emission estimates for power enterprises [18]. Although these AI-based methods demonstrate certain advantages in prediction accuracy compared with alternative approaches, their computational cost is relatively high due to model complexity. Moreover, these models rely strongly on high-frequency and high-accuracy measurements, which are difficult to obtain in practical industrial settings. In addition, legacy carbon capture systems are often not readily compatible with purely AI-based models.
Both pure and hybrid models can drift during deployment because of disturbances, noise, and modeling error. State estimation closes this gap: methods such as Kalman filtering and MHE fuse models with measurements to deliver consistent state trajectories [19]. MHE formulates a constrained optimization over a sliding window to minimize the mismatch between predicted and measured outputs while enforcing the process dynamics [20]. It updates the estimate at each sampling time, handles nonlinear models and constraints, and has been applied to many noisy process settings [19,21]. Its ability to incorporate nonlinear state and measurement equations makes it well suited to absorption columns [22]. Wang et al. illustrated this by pairing an LSTM surrogate with an MHE layer to robustly monitor key CO2 capture variables under unknown disturbances and sensor faults [11]. However, their methodology is relatively complex and computationally expensive: it involves calculating and tuning the MHE covariance matrices, frequently updating the machine learning models, and relying strongly on high-density, high-accuracy raw measurement data. In our study, the entire system can be simulated and its states estimated using fewer and coarser measurements, while the update procedure for the model covariance matrices is simplified by adding covariance localization, which reduces the overall computational burden.
High-dimensional plant data introduce additional challenges: noise, redundancy, and asynchronous sampling. Feature learning and dimensionality reduction address these issues. Autoencoders learn compact latent variables that preserve salient structure [23,24]. Denoising autoencoders (DAE) improve robustness to corrupted inputs [25]; stacked DAEs (SDAE) add depth, enabling hierarchical feature extraction and better generalization in high-noise, high-dimensional settings [26]. However, features optimized only for reconstruction are not always ideal for forecasting [27]. End-to-end training, which jointly optimizes an (S)DAE front-end with a time-series predictor, aligns representation learning with the prediction loss and often improves accuracy and simplicity [28]. Hybrid encoder–decoder models with GRU decoders have also advanced sequence forecasting in related energy applications [29].
Covariance localization further stabilizes estimation in the presence of limited data. The Gaspari–Cohn taper is a compactly supported, fifth-order correlation function used to attenuate spurious long-range correlations via a Schur (Hadamard) product with the background covariance. It preserves positive semidefiniteness and is standard in EnKF and variational assimilation [30]. Stanley et al. extended localization to multivariate settings while maintaining positive semidefiniteness, enabling consistent updates across strongly coupled variables [31].
1.2. Contributions of the Paper
This paper extends Zhuang’s work [32]. Building on the background outlined in the previous subsection, it makes the following key contributions:
1. Development of a simplified mechanistic model: A more compact mechanistic model is developed for the absorption column in the carbon capture pilot plant. The compact model accurately predicts the concentration and temperature at different locations of the absorption column, and its low complexity makes it suitable for real-time applications.
2. Development of an MHE framework: A moving horizon estimator is designed based on the mechanistic model. This framework integrates real-time measurements from the pilot plant, providing a more robust and accurate estimation of the concentration profile compared to open-loop mechanistic model predictions. Compared with prior work, the use of Moving Horizon Estimation (MHE) in this study is essential due to the discontinuous nature of CO2 concentration measurements in absorber systems caused by limitations of gas analysis equipment. While the training of downstream machine learning models requires complete concentration trajectories, previous work relied on simple linear interpolation to reconstruct missing measurements. However, linear interpolation has no physical basis and does not reflect the actual dynamic evolution of CO2 concentration in the absorber, and is therefore inconsistent with practical operating conditions.
In contrast, the proposed MHE framework is built upon a mechanistic model and incorporates additional external measurements to reconstruct physically consistent CO2 concentration profiles. The resulting concentration trajectories provide a realistic representation of the underlying process dynamics and serve as reliable training data for subsequent machine learning models.
3. Training of SDAE-GRU models: Considering the high computational cost associated with running the MHE in real-time, we used the offline results from MHE runs as labels to train a robust machine learning model representing a more practical and efficient alternative.
4. Hybrid modeling framework: The predictions from the SDAE-GRU model are integrated with the predictions generated by the standalone mechanistic model. This fusion is performed based on the respective error covariance matrices of their predictions, while the Gaspari-Cohn method is simultaneously applied for covariance localization. Error covariance matrices are updated in batches by running the MHE on windows of selected past data.
1.3. Organization of the Paper
The rest of the paper is structured as follows:
Section 2 provides a description of the carbon capture system and the data sets under study.
Section 3 outlines the methodology, including the mechanistic model, SDAE-GRU framework, the MHE formulation, and the data fusion method.
Section 4 presents the results obtained from the implementation of the proposed solution using pilot plant data mimicking online operation of the state estimator, and finally
Section 6 presents the conclusions and future research directions.
2. System Under Study
Figure 1 shows a simplified flowsheet of the pilot-scale carbon capture plant utilized in this study. The input gas stream is generated by mixing heated CO2 and N2 in varying ratios to simulate flue gas produced by upstream natural gas or coal-fired power plants. To simplify operation and minimize potential corrosion of the equipment, sulfur oxides, nitrogen oxides, and particulate matter are excluded from the gas mixture. The prepared gas stream is first cooled and then introduced at the bottom of the absorption column. The flue gas rises through the packed sections of the column, where dissolved CO2 is removed via a reversible reaction between CO2 and monoethanolamine (MEA) in a lean aqueous MEA solution supplied from the top of the column. This process removes more than 95% of the CO2. The N2-rich gas exiting the absorption column is collected and reused for flue gas preparation. The CO2-rich MEA solution is directed through a heat recovery system to a stripper column, where the reaction is reversed and CO2 is released by applying heat through a reboiler. The regenerated lean MEA solution is recycled back to the absorption column, while the recovered CO2 is collected and, along with the N2, reused for preparing the input gas stream.
Figure 2 illustrates the overall configuration of the carbon capture pilot plant, as well as the structural details of a single stage within the absorption column. Due to limitations in the shooting angle and the internal structure of the building, it is not possible to capture all five stages in a single image. The left image therefore shows only the third to fifth floors. Gas flows upward while liquid flows downward between stages, forming a countercurrent pattern through the packing structure. The right image provides a detailed view of an individual stage. The left section corresponds to the absorber column, while the central part represents the condenser located at the top of the stripper tower, where MEA solvent vapors are condensed and recirculated back into the tower via dedicated piping.
Figure 3 illustrates the carbon dioxide detection system. Gas from within the column stage is directed into a dedicated measurement pipeline and subsequently delivered to the gas analyzer (AT400) located at the top of the tower. Before entering the analyzer, the gas passes through a separator designed to prevent any entrained liquid from reaching the instrument. The control panel on the analyzer allows the operator to select specific sampling points for gas composition analysis. Additionally, due to the large size and high cost of the gas analyzer, simultaneous installation of multiple units at different locations is impractical. As a result, carbon dioxide concentration measurements are discrete rather than continuous, and it is not possible to obtain real-time data from all locations simultaneously.
The developed model is validated with experimental data collected from the PCC pilot plant. The measurements used for validation consist of temperature, pressure, liquid and gas flow rates, liquid level, pH, and the previously mentioned CO2 concentrations. The seven datasets used in this study contain around 1500 samples (data points) in total. After removing outliers, the data are divided into seven sets corresponding to different operating conditions, determined by the average gas–liquid volumetric ratio of inlet flue gas and the input CO2 volumetric concentration (%), as shown in Figure 4.
As noted earlier, only a single gas analyzer is available for CO2 concentration measurement. The resulting measurements are presented as scatter plots in Figure 5 and Figure 6. Data are collected sequentially from sampling point 5 to point 1, as illustrated in Figure 1, with two to four samples acquired at each location. Because a single analyzer is shared across multiple sampling points, measurements are not obtained simultaneously, and the CO2 readings at each location are therefore temporally discontinuous. However, effective control and optimization of the overall system require complete, real-time information on the CO2 concentration profile. To address this limitation, the proposed method described in the next section was applied to the carbon capture system to provide reliable real-time estimates of the CO2 concentration profile.
3. Methodology
In an earlier work [32], the estimation objective described in the previous section was tackled using a combination of mechanistic and data-driven models, but without any direct feedback from measurements in online operation. In addition, the raw data used to train the LSTM were obtained by linear interpolation of scattered data points, so the training data were inherently distorted. In this work, we incorporate a state estimator into this approach, which can be seen as an advancement of the mechanistic model component that provides optimal data for training the machine learning model.
The proposed framework integrates a first-principles mechanistic model, a machine learning model, and a Moving Horizon Estimation (MHE) scheme to achieve accurate and robust state estimation under realistic operating conditions. The MHE framework utilizes real measurement data collected from the pilot plant to generate optimal continuous estimates of concentrations, effectively addressing the challenges posed by unknown disturbances, process noise, and missing or unreliable sensor measurements. These high-fidelity state estimates are then used as supervisory target values for training a robust hybrid data-driven model, implemented as a Stacked denoising Autoencoder–Gated Recurrent Unit (SDAE-GRU) network. The choice of employing a hybrid model rather than directly using MHE with the mechanistic model for real-time prediction stems from computational limitations. Since the MHE must solve a large number of nonlinear ODEs, its execution speed is only marginally faster than the operation of the absorption column. Moreover, the time required for system sampling further hinders the feasibility of running the MHE in synchrony with the system. Therefore, in this study, the simultaneous application of MHE may be marginally feasible; however, to enhance the generalization capability of the proposed approach, we adopt a periodic use of MHE to generate optimal estimates, which are then employed to periodically update the error matrix of the hybrid model. By learning from the optimal estimates provided by the MHE, the SDAE-GRU model is able to capture both the underlying process dynamics and the temporal correlations present in the measurement data.
To further improve prediction accuracy and reliability, the open-loop predictions obtained from the trained data-driven model and the standalone mechanistic model are fused through a covariance-weighted blending strategy. This fusion combines the strengths of both modeling approaches while mitigating their individual weaknesses, with the blending weights determined based on the respective deviation covariance matrices of each model’s predictions. Regarding the update of the covariance matrices for both the mechanistic model and the machine learning model, the MHE cannot operate in real time with the hybrid model due to the large number of nonlinear ODEs involved. Therefore, in our test dataset, we enlarged the data window and updated the covariance matrices once every dataset interval. This provided the MHE with sufficient computation time. In practical operation, each update introduces a delay of approximately 20–30 min. However, since the system itself is highly periodic, such a delay does not lead to any significant distortion. The resulting hybrid estimation framework provides a computationally efficient and scalable solution for real-time concentration monitoring in industrial carbon capture processes, offering enhanced adaptability to changing operating conditions and improved robustness against measurement uncertainties.
3.1. Mechanistic Model
The proposed mechanistic model is a simplified well-mixed approximation of the absorption process derived for real-time calculations. The absorption process is assumed to happen only within compartments including the five packing stages of the absorption column. As shown in Figure 7, the whole column is modeled as five continuous stirred tank reactors (CSTRs). The model takes eight measured inputs: four related to the MEA solvent (mass flow rate, MEA molar concentration, MEA_CO2 concentration, and solvent temperature), which enter at the top, and four related to the flue gas (mass flow rate, CO2 concentration, N2 concentration, and gas temperature), which enter at the bottom. The model is based on the mass and energy balances given by Equations (1)–(10).
All five stages share the same equations, with i ∈ {1, 2, 3, 4, 5} indexing the stages of the absorption column. The quantities appearing in these equations are the reaction rate coefficient, the reaction rate, the molar concentrations of the different components, the heat capacity within the stage, and the liquid and gas volumes of a single stage; subscripts distinguish the liquid phase, the gas phase, the inlet, and the outlet.
The reaction mechanism is described by Equations (1) and (2). The values of the parameters A and b (123,147 and 41,236.7, respectively) are obtained with fmincon, the MATLAB solver for constrained nonlinear minimization, by minimizing the prediction error for the column temperature. R is the gas constant. Under these conditions, the impact of the reverse reaction is very small compared with the forward reaction. The reverse reaction is therefore not written as an independent reaction rate equation; it is instead reflected in a smaller forward reaction rate constant.
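The roles of A and b can be illustrated with a short sketch. It assumes an Arrhenius-type rate coefficient, k = A exp(−b/(RT)), with b treated as an activation energy in J/mol and T in kelvin; this functional form, the units, and the function name are assumptions made for illustration and are not confirmed by the text above.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol*K)


def rate_coefficient(temp_k, A=123147.0, b=41236.7):
    """Arrhenius-type rate coefficient k = A * exp(-b / (R * T)).

    Assumption: b acts as an activation energy in J/mol and temp_k is in kelvin.
    """
    if temp_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return A * math.exp(-b / (R_GAS * temp_k))


k_313 = rate_coefficient(313.15)  # 40 degC
k_333 = rate_coefficient(333.15)  # 60 degC
```

Under these assumptions the coefficient increases monotonically with temperature and is bounded above by A, which is the qualitative behavior a fitted forward-rate constant is expected to show.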
The ordinary differential equations that describe the energy balance are given as Equations (3) and (4). In these equations, T denotes the temperature within the stage, and ΔH denotes the CO2 absorption heat. The liquid and gas phases are considered as a single combined system; therefore, they are assumed to share the same temperature change within the stage.
The ordinary differential equations that describe the mass balance are given as Equations (6)–(10). The additional quantities appearing in these equations are the flow rate, the holdup of liquid and gas at each stage, and the molar mass of carbon dioxide.
3.2. Moving Horizon Estimation
In this subsection, the aforementioned mechanistic model is cast as a discrete-time nonlinear process model, consisting of a state equation driven by a process disturbance, a measurement equation subject to measurement noise, and a state constraint, where: (1) The index i denotes the discrete time instant. (2) The state, input, and measurement vectors each have a fixed dimension. (3) The function F corresponds to the mechanistic model introduced in Section 3.1, and the observation model projects the system state onto the observable space. Real-time temperature measurements obtained from the column are incorporated into the MHE framework to refine the internal state estimates, with particular emphasis on the CO2 concentration profiles. (4) The process disturbance is modeled as a zero-mean multivariate Gaussian variable with covariance Q. Likewise, the measurement noise follows a zero-mean Gaussian distribution with covariance R. The two noise sources are assumed to be statistically independent, and both Q and R are taken as diagonal matrices. (5) The admissible state set X encodes the physical feasibility constraints, including upper and lower bounds on the states.
In the MHE configuration designed for the absorber model described in Section 3.1, the process is assumed to operate under nominal conditions. To avoid unrealistic variations arising solely from the optimization routine, the process disturbance is restricted to lie within a bounded interval.
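In generic notation, a process model with the properties listed above is commonly written as follows. Only F, Q, R, X, and the index i are taken from the text; the symbols x, u, y, w, v, H, and the dimensions n_x, n_u, n_y are assumed here for illustration, as is the additive form of the disturbance and the noise.

```latex
\begin{aligned}
x_{i+1} &= F(x_i, u_i) + w_i, & w_i &\sim \mathcal{N}(0, Q),\\
y_i &= H(x_i) + v_i, & v_i &\sim \mathcal{N}(0, R),\\
x_i &\in X \subseteq \mathbb{R}^{n_x}, & u_i &\in \mathbb{R}^{n_u},\quad y_i \in \mathbb{R}^{n_y}.
\end{aligned}
```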
For the nonlinear system specified by Equations (11)–(15), the standard MHE formulation [19,20] is expressed as the optimization problem in (16), where the state trajectory and the associated disturbances over the estimation horizon are determined by minimizing an objective function composed of a measurement-mismatch term, a process-disturbance term, and an arrival cost. In this objective: (1) the decision variables are the estimated system states at each time i within the horizon; (2) the measurement-mismatch term compares each measurement obtained from the plant with the estimated output generated through the observation model; (3) N denotes the length of the estimation horizon, which is set to 20 following common practice; (4) the arrival cost term in Equation (17) incorporates information from system behavior prior to the current MHE window [27], so that the estimator accounts for dynamics that occurred before the present horizon; (5) the arrival cost penalizes the difference between the estimated initial state at the beginning of the horizon and the a priori prediction obtained by propagating the mechanistic model from the previously estimated state. The matrix P denotes the covariance of the approximated posterior distribution of the states at that time [33].
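For illustration, a standard MHE objective consistent with this description can be sketched as follows. The current time index k, the hat and bar notation, the observation function H, and the weighted-norm shorthand (the squared norm of a weighted by M equals aᵀMa) are assumptions introduced here; the term order follows the text, with the measurement mismatch first, the process disturbance second, and the arrival cost last.

```latex
\min_{\hat{x}_{k-N},\,\{\hat{w}_i\}_{i=k-N}^{k-1}}
\;\sum_{i=k-N}^{k} \lVert y_i - H(\hat{x}_i) \rVert_{R^{-1}}^{2}
+ \sum_{i=k-N}^{k-1} \lVert \hat{w}_i \rVert_{Q^{-1}}^{2}
+ \lVert \hat{x}_{k-N} - \bar{x}_{k-N} \rVert_{P^{-1}}^{2}
\quad \text{s.t.} \quad
\hat{x}_{i+1} = F(\hat{x}_i, u_i) + \hat{w}_i,\;\; \hat{x}_i \in X .
```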
The last term of the objective function penalizes deviations in the initial state at the beginning of the estimation window. In our implementation, the covariance is selected to be sufficiently small so that the reconstructed state trajectory remains smooth. The process disturbance covariance Q contributes to the second component of the cost function, whereas the measurement noise covariance R enters the observation-mismatch term. All three covariance matrices P, R, and Q are assumed to be diagonal.
In this work, the distributions—or suitable approximations of the distributions—for the process disturbances, measurement noise, and the arrival-cost uncertainty are presumed to be known in advance. This assumption is consistent with applications in which the system operates close to its nominal conditions, as is the case for the plant considered here. The explicit numerical values adopted for these covariance matrices within the state estimation framework are provided in
Section 4.
With respect to the optimization variables in the MHE formulation, the initial state at the beginning of each estimation window appears in the arrival-cost term, where it represents the prior information entering the horizon. For the remaining terms of the objective function, the decision variables correspond to the sequence of states within the window, which evolve according to the system dynamics. In the measurement-mismatch term, the observation model is applied to these same in-window states. Thus, across all components of the cost function, the optimization variables consist of the state trajectory over the estimation horizon.
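The way the in-window state trajectory enters every term of the cost can be made concrete with a small sketch. It evaluates (but does not minimize) a scalar MHE objective for a candidate trajectory; the toy first-order system, the function names, and the numerical variances are hypothetical and unrelated to the absorber model.

```python
def mhe_cost(x_hat, u, y, x_prior, f, h, P, Q, R):
    """Evaluate a scalar MHE objective for a candidate state trajectory.

    x_hat   : candidate states over the window (length N + 1)
    u, y    : inputs (length N) and measurements (length N + 1)
    x_prior : a priori prediction of the first state in the window
    f, h    : state-transition and measurement functions (assumed known)
    P, Q, R : scalar variances for arrival cost, disturbance, and noise
    """
    if not (len(x_hat) == len(y) == len(u) + 1):
        raise ValueError("expected len(x_hat) == len(y) == len(u) + 1")
    # Measurement-mismatch term, weighted by the noise variance R.
    meas = sum((yi - h(xi)) ** 2 / R for xi, yi in zip(x_hat, y))
    # Disturbances implied by the trajectory: w_i = x_{i+1} - f(x_i, u_i).
    dist = sum((x_hat[i + 1] - f(x_hat[i], u[i])) ** 2 / Q for i in range(len(u)))
    # Arrival cost on the first state of the window.
    arrival = (x_hat[0] - x_prior) ** 2 / P
    return meas + dist + arrival


def f(x, u):
    """Hypothetical first-order dynamics (not the absorber model)."""
    return 0.9 * x + 0.1 * u


def h(x):
    """Hypothetical measurement model: the state is observed directly."""
    return x


u = [1.0, 1.0, 1.0]
x_true = [0.0]
for ui in u:
    x_true.append(f(x_true[-1], ui))
y = list(x_true)  # noise-free measurements

cost_exact = mhe_cost(x_true, u, y, 0.0, f, h, P=1.0, Q=0.01, R=0.1)
cost_shifted = mhe_cost([x + 0.05 for x in x_true], u, y, 0.0, f, h, P=1.0, Q=0.01, R=0.1)
```

A trajectory that satisfies the dynamics exactly and matches both the measurements and the prior has zero cost, while any perturbed trajectory is penalized through all three terms; an actual estimator would pass such a cost to a constrained nonlinear solver.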
3.3. Machine Learning Framework
The raw data collected from the plant come from 53 temperature transmitters, 11 pressure transmitters, 19 flow transmitters, 4 level transmitters, 2 pH analyzers, and the previously discussed gas concentration analyzer. To some extent, these variables are all related to the concentration profile in the absorption column. However, to prevent excessive input dimensionality and to reduce redundancy and noise while retaining the dominant process information, an SDAE is first employed to compress the data. Furthermore, since the concentration profile is a time series, a GRU network is connected downstream of the SDAE to perform sequential prediction. The SDAE and GRU are trained end to end to ensure model consistency and predictive accuracy.
3.3.1. Stacked Denoising Autoencoder
An autoencoder (AE) is an unsupervised neural architecture designed to learn a compact representation of the input data by reconstructing the original signal from a lower-dimensional embedding. The network consists of two parts: an encoder that transforms the input into a latent feature space, and a decoder that maps this latent representation back to the input domain. When used for dimensionality reduction, the latent layer typically contains fewer neurons than the input layer, forcing the model to extract the most informative structure of the data. In its general form, the AE maps an n-dimensional input vector to an m-dimensional latent representation and back. The encoder applies its weight and bias parameters, followed by an activation function, to the noise-corrupted input; the decoder applies its own weight and bias parameters and activation function to the latent representation to produce the reconstructed output. Typical activation functions include tanh, ReLU, and linear mappings.
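A common way to write a single denoising autoencoder layer is sketched below. The symbols (x for the input, x̃ for its corrupted copy, h for the latent vector, x̂ for the reconstruction, W, b, W', b' for the encoder and decoder parameters, f and g for the activations, and σ for the noise level) and the squared-error reconstruction loss are assumptions chosen for illustration.

```latex
\begin{aligned}
\tilde{x} &= x + \varepsilon, & \varepsilon &\sim \mathcal{N}(0, \sigma^{2} I),\\
h &= f\!\left(W \tilde{x} + b\right), & h &\in \mathbb{R}^{m},\\
\hat{x} &= g\!\left(W' h + b'\right), & \hat{x} &\in \mathbb{R}^{n},
\end{aligned}
\qquad
\min_{W,\,b,\,W',\,b'} \; \lVert x - \hat{x} \rVert_{2}^{2} .
```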
Denoising autoencoders extend the conventional autoencoder architecture and can be interpreted as a nonlinear generalization of principal component analysis [34]. However, the feature extraction capacity of a single-layer DAE is inherently limited. To address this, the hidden representation from one DAE is used as the input to the next, forming a multi-layer architecture known as the stacked denoising autoencoder [35]. Two implementation strategies are possible, depending on the availability of clean and noisy data. If both are available, the SDAE can be structured as a standard neural network comprising separate encoding and decoding stages. When only clean data are available, a noise injection layer can be employed to artificially corrupt the input using additive white Gaussian noise (AWGN) with zero mean and unit variance, as shown in Figure 8. This approach is motivated by the assumption that measurement noise in plant sensors follows a white, zero-mean Gaussian distribution, which, after normalization, has unit variance [36]. The SDAE and GRU are trained end to end; the GRU parameters are specified in Section 3.3.2, and the SDAE parameters are listed in Table 1.
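The noise injection step can be sketched in a few lines. The example below standardizes one sensor channel and then adds zero-mean Gaussian noise, as would be done during training only; the function names and the sample readings are made up for illustration.

```python
import random


def standardize(column):
    """Z-score a list of readings so it has zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    var = sum((v - mean) ** 2 for v in column) / n
    std = var ** 0.5 or 1.0  # guard against constant signals
    return [(v - mean) / std for v in column]


def inject_awgn(x, sigma=1.0, rng=None):
    """Return a corrupted copy of x with additive zero-mean Gaussian noise."""
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, sigma) for v in x]


readings = [40.1, 40.4, 41.0, 41.3, 40.8]  # made-up temperature samples
clean = standardize(readings)
noisy = inject_awgn(clean, sigma=1.0, rng=random.Random(0))
```

The autoencoder is then trained to reconstruct `clean` from `noisy`; at inference time the injection step is skipped and the measured inputs are passed to the encoder directly.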
3.3.2. Gated Recurrent Unit
Given that the measurements obtained from the pilot plant form time-series data, a Gated Recurrent Unit (GRU) network is adopted as the learning model in this study. The GRU is a streamlined variant of the Long Short-Term Memory (LSTM) architecture [37], offering comparable predictive capability while requiring considerably fewer computational resources [38]. This characteristic makes the GRU particularly suitable for real-time MHE applications, where lightweight models are preferred due to strict computational constraints.
Compared with more complex deep learning architectures, GRUs provide an effective compromise between representational power and computational efficiency, allowing them to capture temporal dependencies without excessive model size. This balance also motivates their use in our ongoing development of control-oriented frameworks, including model predictive control designs where real-time feasibility is essential. In a GRU cell, the hidden state is used to carry temporal information across time steps, and its internal dynamics are governed by two gating mechanisms: the reset gate and the update gate. The structure of a GRU unit is illustrated in
Figure 9.
At each time step, the GRU cell computes four quantities in sequence: the reset gate (sigmoid layer), the update gate (sigmoid layer), the candidate hidden state (tanh layer), and the hidden state. These computations involve six weight matrices and three bias vectors, and the gates act on the hidden state through the Hadamard (elementwise) product ⊙. The GRU model is first trained in Python using Keras/TensorFlow [39], with its hyperparameters tuned via Optuna [40]. After training, the resulting network is exported to MATLAB (2023a) using the Deep Learning Toolbox Converter [41].
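For reference, one widely used form of the GRU cell computations is given below. The symbol names (x_t for the input, h_t for the hidden state, W and U for the input and recurrent weight matrices, b for the biases) are assumptions, and the sign convention of the update gate differs between references and software libraries.

```latex
\begin{aligned}
r_t &= \sigma\!\left(W_r x_t + U_r h_{t-1} + b_r\right),\\
z_t &= \sigma\!\left(W_z x_t + U_z h_{t-1} + b_z\right),\\
\tilde{h}_t &= \tanh\!\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right),\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t .
\end{aligned}
```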
The GRU architecture preserves temporal information by carrying past state information through its hidden states. Based on sequential measurements and previous observations, the model predicts the CO2 concentration at each time step. The network architecture consists of an input layer followed by two stacked GRU layers. Dropout layers are inserted between these recurrent layers to mitigate overfitting by randomly deactivating a subset of neurons during training. The detailed configuration of the GRU model is summarized in Table 2.
3.4. Data Fusion Using the Hybrid Model
Data assimilation (DA) provides a framework for combining model-based predictions with process measurements to obtain an improved estimate of the system state [
42]. In this study, we employ an approach similar to the update stage of a Kalman-type estimator, in which information from both the model outputs and the measured values is merged. The fusion is achieved by weighting the contributions of the mechanistic model and the machine learning model according to their respective error covariance matrices. These two covariance matrices, one associated with the mechanistic model and one with the SDAE–GRU model, quantify the uncertainty in each model's predicted concentration vector. The subscripts "mec" and "GRU" indicate the outputs of the mechanistic and data-driven components. In dynamic systems, the predictive accuracy of these models may fluctuate over time, implying that their error covariances are inherently time-varying. Because the available process data in this study are limited, both covariance matrices are treated as constant rather than time-dependent quantities. Their values are determined by comparing the model-generated predictions with the optimal MHE-based estimates obtained from historical data.
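A covariance-weighted blend of two predictions can be sketched as below. The formula is an assumption: it is the standard Kalman-type update with an identity observation operator, in which the GRU output plays the role of the observation, and it may differ in detail from the fusion used in this work. The names `fuse`, `P_mec`, and `P_gru` and the numerical values are illustrative only.

```python
import numpy as np

def fuse(x_mec, x_gru, P_mec, P_gru):
    """Blend two estimates: x = x_mec + K (x_gru - x_mec), K = P_mec (P_mec + P_gru)^-1."""
    S = P_mec + P_gru
    # Solve S^T K^T = P_mec^T so that no explicit matrix inverse is formed.
    K = np.linalg.solve(S.T, P_mec.T).T
    return x_mec + K @ (x_gru - x_mec), K

# Illustrative case: the mechanistic error variance is four times the GRU one,
# so the fused value moves 80% of the way towards the GRU prediction.
x_mec = np.full(5, 0.10)
x_gru = np.full(5, 0.12)
P_mec = 4.0 * np.eye(5)
P_gru = 1.0 * np.eye(5)
x_fused, K = fuse(x_mec, x_gru, P_mec, P_gru)
```

The model with the smaller error covariance receives the larger weight, and when both covariances are equal the result is the simple average of the two predictions.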
The Gaspari–Cohn function is adopted as the covariance localization operator to suppress artificial long-range correlations in the error statistics. In practice, measurements at one spatial location typically exhibit meaningful correlation only with nearby points, while correlations with points farther upstream or downstream are negligible. The characteristic correlation length $L$ specifies the radius over which adjacent measurements influence each other. The localized correlation coefficient, expressed as a function of the normalized distance $r$, is given by

$$\rho(r) = \begin{cases} 1 - \frac{5}{3} r^2 + \frac{5}{8} r^3 + \frac{1}{2} r^4 - \frac{1}{4} r^5, & 0 \le r \le 1, \\ 4 - 5r + \frac{5}{3} r^2 + \frac{5}{8} r^3 - \frac{1}{2} r^4 + \frac{1}{12} r^5 - \frac{2}{3r}, & 1 < r \le 2, \\ 0, & r > 2, \end{cases}$$

with

$$r = \frac{|i - j|}{L},$$

where $i$ and $j$ indicate the indices of the two entries being compared, and $L$ denotes the correlation length, which is set to 2 in this work. Although the two covariance matrices are not updated in real time, incorporating such time-varying behavior into the framework would be straightforward. The covariance matrices are computed from a fixed number of data records, using the outputs of the mechanistic and GRU-based models. A summary of the SDAE-based dimensionality reduction procedure is given in Algorithm 1. Because direct linear interpolation of CO2 readings between the present measurement and future measurements is not feasible, the most recent estimates of the two covariance matrices used in Algorithm 1 are obtained from earlier cycles of process data.
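The Gaspari–Cohn localization can be sketched with the standard fifth-order piecewise rational function. The code assumes that the distance between two entries is the difference of their stage indices divided by the correlation length, and that localization is applied as an elementwise (Schur) product with the covariance matrix. The helper names and the random test covariance are hypothetical.

```python
import numpy as np

def gaspari_cohn(r):
    """Standard Gaspari-Cohn taper: 1 at r = 0, decays smoothly to 0 at r = 2."""
    r = np.abs(np.asarray(r, dtype=float))
    rho = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    rho[inner] = 1.0 - (5.0 / 3.0) * ri**2 + (5.0 / 8.0) * ri**3 + 0.5 * ri**4 - 0.25 * ri**5
    ro = r[outer]
    rho[outer] = (4.0 - 5.0 * ro + (5.0 / 3.0) * ro**2 + (5.0 / 8.0) * ro**3
                  - 0.5 * ro**4 + (1.0 / 12.0) * ro**5 - 2.0 / (3.0 * ro))
    return rho

def localization_matrix(n, corr_len):
    """Taper matrix whose (i, j) entry depends on |i - j| / corr_len (assumed metric)."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return gaspari_cohn(dist / corr_len)

# Five stages and a correlation length of 2, as stated in the text.
C = localization_matrix(5, 2.0)

# Apply the taper to an illustrative symmetric positive semidefinite matrix.
A = np.random.default_rng(0).standard_normal((5, 5))
P_raw = A @ A.T
P_localized = C * P_raw  # Schur (elementwise) product
```

With these settings, entries two stages apart keep roughly one fifth of their raw covariance, and the bottom and top stages are fully decoupled.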
| Algorithm 1 Pseudocode of the whole control strategy |
1: Inputs: initial records sequence
2: Encode the original records sequence
3: Initialize the two covariance matrices
4: Initialize the current time step
5: Initialize the update window size
6: Initialize the last update time step
7: Record the mechanistic model prediction
8: Record the data-driven model prediction
9: Record the MHE estimation
10: if the update condition is met then
11:   Compute the mechanistic model deviation
12:   Compute the GRU model deviation
13:   Update the mechanistic model covariance matrix
14:   Update the GRU model covariance matrix
15: end if
16: Output: corrected prediction
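The covariance bookkeeping in Algorithm 1 can be sketched as follows. Several details are assumptions rather than the procedure used here: the update is triggered once the number of steps since the last update reaches the window size, the deviations are taken relative to the MHE estimates, and the covariance is computed as an uncentered second moment of those deviations. The class name `WindowedCovariance` and its attributes are hypothetical.

```python
import numpy as np

def error_covariance(pred, ref):
    """Second-moment matrix of the deviations pred - ref; rows are time steps."""
    dev = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    # Deviations are not mean-centered, so a systematic bias counts as error.
    return dev.T @ dev / (dev.shape[0] - 1)

class WindowedCovariance:
    """Buffers predictions and refreshes both covariance matrices per window."""

    def __init__(self, n_states, window):
        self.window = window
        self.P_mec = np.eye(n_states)   # placeholder initial values
        self.P_gru = np.eye(n_states)
        self.t_last = 0
        self.buf = []

    def record(self, t, x_mec, x_gru, x_mhe):
        self.buf.append((x_mec, x_gru, x_mhe))
        # Assumed update condition: a full window has elapsed since the last update.
        if t - self.t_last >= self.window and len(self.buf) >= 2:
            mec, gru, mhe = (np.array(col) for col in zip(*self.buf))
            self.P_mec = error_covariance(mec, mhe)
            self.P_gru = error_covariance(gru, mhe)
            self.t_last = t
            self.buf.clear()

# Small deterministic example with two states and a window of three steps.
wc = WindowedCovariance(n_states=2, window=3)
for t, e in enumerate([1.0, -1.0, 1.0], start=1):
    wc.record(t, np.array([e, 0.0]), np.array([0.0, 2.0 * e]), np.zeros(2))
```

After the third record the buffer is consumed and both matrices reflect only the deviations collected in that window.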
4. Implementation and Results
Datasets 1–5 were used for model development, training, and parameter tuning of the mechanistic model, MHE, SDAE-GRU, and the hybrid model, while datasets 6–7 were used for validation. The overall framework is implemented as shown in
Figure 10.
4.1. Mechanistic Model Validation
For the validation of the mechanistic model, the historical measurements from dataset 5 are used. First, the column temperature profiles at sampling points 1 to 5, which provide continuous and complete measurements, were used for the model validation. As illustrated in
Figure 11, the blue line denotes the real measurement, and the red line denotes the predictions from the standalone mechanistic model. The overall prediction is reasonably accurate; however, due to the lack of an accurate initial estimate for the internal system states, the initial states were optimized within a reasonable range. As a result, the model exhibits a certain degree of deviation at the beginning of the prediction period.
To correct the deviation caused by the model inaccuracies and measurement noise, methods such as SDAE-GRU, MHE, and the hybrid model are employed.
4.2. State Estimation Results
The real plant provides measurements at 43-s intervals, and therefore the mechanistic model and the MHE estimator are both operated using this same sampling period. The estimation horizon is selected as 20 time steps to remain consistent with the sequence length used in the GRU model (Table 2). The nonlinear optimization subproblems that arise in the two moving-horizon estimators are solved using IPOPT through the CasADi interface [
43], which offers computation times suitable for online implementation of the state estimation framework.
Due to physical constraints of the system, several key states in the carbon capture process are restricted within predefined bounds. This section demonstrates the performance of the proposed estimation approach under these assumptions. The covariance matrices Q, R, and P are all chosen to be diagonal. The process disturbance covariance, measurement noise covariance, and state error covariance for the MHE are specified as Q = diag([1 0.5 0.0001 … 0.5 0.0001 1 0.5 0.0001]), R = diag([0.1 0.1 0.1 0.001 0.1]) and P = diag([1 1 1 1 1 … 1 1 1 1]) ×.
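The role of the diagonal weights Q, R, and P can be seen from the objective that a moving horizon estimator minimizes. The sketch below only evaluates a generic MHE cost for a candidate state trajectory and checks the state bounds; it does not solve the optimization, which in this work is handled by IPOPT through CasADi. The scalar toy model, the function names, and all numerical values are assumptions for illustration and are unrelated to the absorber model.

```python
import numpy as np

def mhe_cost(x_traj, y_meas, x_prior, f, h, q_diag, r_diag, p_diag):
    """Arrival cost + process-noise cost + measurement-residual cost (diagonal weights)."""
    x_traj = np.asarray(x_traj, dtype=float)
    y_meas = np.asarray(y_meas, dtype=float)
    arrival = np.sum((x_traj[0] - x_prior) ** 2 / p_diag)
    # Process noise: mismatch between consecutive states and the model prediction.
    w = x_traj[1:] - np.array([f(x) for x in x_traj[:-1]])
    # Measurement residual: measured outputs minus model outputs.
    v = y_meas - np.array([h(x) for x in x_traj])
    return float(arrival + np.sum(w ** 2 / q_diag) + np.sum(v ** 2 / r_diag))

def within_bounds(x_traj, lower, upper):
    """True if every state in the horizon respects its bounds."""
    x_traj = np.asarray(x_traj, dtype=float)
    return bool(np.all((x_traj >= lower) & (x_traj <= upper)))

def f(x):
    return 0.9 * x        # hypothetical scalar decay dynamics

def h(x):
    return x              # hypothetical direct measurement of the state

y = np.array([[1.0], [0.9], [0.81]])
x_prior = np.array([1.0])
q, r, p = np.array([0.5]), np.array([0.1]), np.array([1.0])

cost_exact = mhe_cost(y, y, x_prior, f, h, q, r, p)      # trajectory matching model and data
x_guess = np.array([[1.0], [1.0], [0.9]])
cost_guess = mhe_cost(x_guess, y, x_prior, f, h, q, r, p)
feasible = within_bounds(x_guess, lower=0.0, upper=1.0)
```

A smaller entry in R penalizes the corresponding measurement residual more heavily, so the estimate follows that sensor more closely; a smaller entry in Q expresses more trust in the model for that state.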
Based on the mechanistic model presented in Section 3.1, a Moving Horizon Estimation (MHE) framework was developed to refine the estimation of carbon dioxide concentration using temperature measurement data. The performance of the proposed MHE approach was validated using Dataset 6. At the initial stage of operation, a large amount of CO2 enters from the bottom of the column, resulting in an initial increase in CO2 concentration. Subsequently, the concentration decreases and eventually stabilizes, exhibiting a relatively steady trend over time. The estimated CO2 concentration in the bottom stage (stage 1) is illustrated in Figure 12. After incorporating the measurement corrections, the model's predictions of carbon dioxide concentration align more closely with the actual measured data points. The MHE approach provides more accurate estimation results than the mechanistic model alone. Therefore, in the subsequent SDAE-GRU framework, the optimal estimations generated by the MHE are used as target values for training.
Figure 13 presents the CO2 concentration profiles of datasets 1–5. During plant operation, the amount of MEA solvent is significantly greater than the amount of CO2 present; consequently, more than 95% of the CO2 is absorbed in stage 1. As a result, the CO2 volumetric concentrations in the upper stages (stages 2–5) are nearly zero, and their corresponding curves are essentially flat lines close to the x-axis, which do not contribute to validating the predictive capability of the model.
4.3. Prediction Result from the SDAE-GRU
As illustrated in Figure 14, the data from dataset 7 were used to evaluate the predictive performance of the GRU model. In the figure, the blue line represents the optimal estimation obtained by MHE, while the red line shows the prediction results of the GRU model. The GRU model predicts the CO2 concentration with a MAPE of 7.2% on this dataset (Table 3).
4.4. Hybrid-Model Prediction Results for CO2 Concentration
As illustrated in Figure 15, the covariance-weighted blending and the Gaspari–Cohn localization described in Section 3.4 were employed to fuse the data. After fusion, the hybrid model provides more accurate estimates of CO2 concentration. Both the GRU and the mechanistic model, when used independently, are susceptible to over- or underestimation and excessively rapid fluctuations, due to input data noise and inherent limitations such as parameter inaccuracies in the mechanistic equations. The hybrid approach integrates the strengths of both models, mitigating their individual weaknesses and enhancing overall prediction performance. The Mean Absolute Percentage Error (MAPE) was used to quantify the model performance. The detailed results of the different models on the testing datasets are provided in Table 3. Over all test data, the hybrid model reaches a MAPE of 3.79%, compared with 6.79% for the mechanistic model and 7.86% for the GRU model. The error distributions remain relatively consistent across different datasets, as the overall operating conditions of the absorber are stable. When the flue gas initially enters the absorber, the CO2 concentration rises rapidly to a peak and then gradually decreases and stabilizes as the absorbent is introduced.

In addition to point-wise error metrics, statistical analyses were conducted to assess the robustness of the results. The Wilcoxon signed-rank test is a non-parametric statistical method that evaluates whether the paired differences between two sets of errors are symmetrically distributed around zero, thereby assessing whether one model consistently outperforms another without assuming any specific error distribution [44]. The test (p < 0.05) indicates that the proposed method significantly outperforms both the standalone mechanistic model and the GRU-based model. Furthermore, bootstrap analysis of the prediction errors shows that the obtained MAE and RMSE values remain stable across different datasets, confirming the robustness of the proposed framework.
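The error metrics reported in Table 3 and a simple bootstrap check can be computed as sketched below. The three-point data set is made up purely to illustrate the formulas and does not reproduce any result from this study; the percentile bootstrap shown is one common variant, and the function names are hypothetical.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2)))

def bootstrap_ci(errors, stat=np.mean, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a statistic of the absolute errors."""
    rng = np.random.default_rng(seed)
    errors = np.asarray(errors, dtype=float)
    # Resample the errors with replacement n_boot times.
    idx = rng.integers(0, errors.size, size=(n_boot, errors.size))
    stats = np.array([stat(errors[row]) for row in idx])
    return np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])

# Illustrative values only (volumetric concentrations).
y_true = np.array([0.10, 0.05, 0.08])
y_pred = np.array([0.11, 0.05, 0.07])

m_mape = mape(y_true, y_pred)
m_mae = mae(y_true, y_pred)
m_rmse = rmse(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(np.abs(y_true - y_pred))
```

For the paired significance test, `scipy.stats.wilcoxon` applied to the absolute errors of two models on the same samples is a readily available implementation, provided SciPy is installed.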
5. Discussion
The proposed hybrid mechanistic–MHE–machine learning framework was evaluated using multiple datasets collected from the carbon capture pilot plant.
Figure 12, Figure 13 and Figure 14 present representative comparisons between the measured CO2 concentration profiles and the predictions generated by the proposed model. Quantitative performance metrics, including RMSE, MAE, and MAPE, are summarized in Table 3.
Across all datasets, the proposed framework achieves consistently lower prediction errors compared with the baseline interpolation-based approach. In particular, the reconstructed concentration trajectories exhibit smoother temporal evolution and improved agreement with measured trends, indicating that the estimator effectively mitigates noise and discontinuities in the raw analyzer data.
The observed improvement in prediction accuracy is especially pronounced during transient operating conditions, where concentration measurements are sparse and intermittently available. Under such conditions, linear interpolation fails to capture the true dynamic behavior of the absorber, resulting in physically inconsistent concentration trajectories.
In contrast, the incorporation of Moving Horizon Estimation (MHE) enables the reconstruction of concentration profiles that remain consistent with the underlying process dynamics and physical constraints. By explicitly accounting for system dynamics and measurement uncertainty, the proposed framework provides more realistic state trajectories, which serve as higher-quality training data for the downstream machine learning models. This explains the enhanced robustness and stability observed in the prediction results.
Compared with prior hybrid soft sensing approaches that rely on simple linear interpolation to fill missing concentration measurements, the proposed framework introduces a fundamentally different strategy for data reconstruction. Interpolation-based methods lack physical justification and are unable to represent the actual evolution of concentration under varying operating conditions, particularly during rapid transients.
By contrast, the proposed MHE-based reconstruction explicitly enforces physical consistency through a mechanistic model and external measurements. This difference is reflected in the improved prediction accuracy and reduced sensitivity to measurement sparsity observed in the results. The comparison highlights that the performance gain is not merely due to a more complex learning model, but rather to the improved physical fidelity of the training data.
The results also underline the critical role of MHE within the proposed framework. Without the MHE-based reconstruction step, the machine learning model would be trained directly on interpolated concentration trajectories that are physically unjustified. Previous studies have reported that such distorted training data can lead to degraded prediction accuracy and reduced generalization capability.
The present results suggest that MHE acts as an essential intermediate layer that bridges sparse measurements and data-driven learning, ensuring that the learning model operates on physically meaningful inputs. This role cannot be readily replaced by purely data-driven architectures, particularly in industrial settings where dense and high-precision measurements are difficult to obtain.
Despite the demonstrated advantages, several limitations of the proposed framework should be acknowledged. First, the accuracy of the reconstructed concentration profiles depends on the fidelity of the mechanistic model and the assumption that the system operates near nominal conditions. Significant model mismatch may reduce estimation accuracy. Second, the computational cost of the MHE increases with the length of the estimation horizon, which may limit real-time deployment in large-scale systems. While the covariance update procedure has been simplified to reduce computational burden, further optimization and parallelization strategies may be required for industrial-scale implementation.
6. Conclusions
This paper presents a state estimation framework for predicting the CO2 concentration in the absorption column of a carbon capture plant by integrating an SDAE, a GRU, a mechanistic model, and moving horizon estimation. The main conclusions are given below:
1. A compact dynamic model of the PCC absorber was developed and validated using pilot-plant data across multiple operating conditions. The datasets were split into training and testing groups. The mechanistic model provides full concentration profiles with an average prediction error (MAPE) of 6.79%.
2. The MHE solution is developed based on the dynamic model. Physical constraints and a disturbance matrix are implemented for both input and output data to further improve the robustness of the system under complex operating conditions. By utilizing measured temperature for external correction, the MHE yields concentration profiles with improved accuracy and robustness compared to using the mechanistic model alone. These optimal estimations subsequently serve as the basis for constructing the SDAE–GRU model.
3. A Stacked Denoising Autoencoder-Gated Recurrent Unit framework was developed using measured data from the plant as input and the concentrations at five monitoring points as target outputs. The SDAE and GRU components were trained in an end-to-end pattern. The SDAE first denoises and compresses the input data, and the GRU then performs the temporal prediction. The prediction error (MAPE) relative to the optimal estimates obtained from Moving Horizon Estimation (MHE) was approximately 7.86%.
4. After developing the two models, their prediction results were fused using a covariance-weighted blending approach, incorporating the covariance matrices of the model prediction errors. Additionally, the Gaspari-Cohn weighting scheme was applied to the error covariance matrices based on spatial distances, in order to mitigate unnecessary interference from distant locations. The fused model achieved a prediction error (MAPE) of approximately 3.79%.
The proposed method establishes a state estimation framework suitable for complex systems with missing data. Since Moving Horizon Estimation (MHE) is applied only during the training phase to provide supervisory signals, it does not affect the real-time performance of the deployed model. Therefore, the resulting state estimation framework is well-suited for subsequent control and further operational optimization of the system.
Author Contributions
S.C.: Writing—review & editing, Writing—original draft, Visualization, Validation, Supervision, Software, Methodology, Formal analysis. S.G.: Methodology, Data curation, Software. M.M.: Writing—review & editing, Supervision, Project administration, Methodology, Formal analysis. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The data that has been used is confidential.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Nomenclature
| MHE | Moving horizon estimation |
| CO2 | Carbon dioxide |
| PCC | Post-combustion carbon capture |
| DAE | Denoising autoencoder |
| SDAE | Stacked denoising autoencoder |
| GRU | Gated Recurrent Unit |
| GHGs | Greenhouse gases |
| MEA | Monoethanolamine |
| VLE | Vapor-liquid equilibrium |
| E-AHF | Extended adaptive hybrid functions |
| AI | Artificial intelligence |
| VC | Volumetric concentration |
| DTr | Decision tree regressor |
| SVMr | Support vector machine regressor |
| ANN | Artificial neural network |
| LSTM | Long Short-Term Memory |
| AE | Autoencoder |
| ENKF | Ensemble Kalman filtering |
| KF | Kalman filtering |
| CSTR | Continuous stirred tank reactors |
| AWGN | Additive white Gaussian noise |
| DA | Data assimilation |
| MAPE | Mean absolute percentage error |
| a,b,c | Reaction rate parameters |
| Reaction rate coefficient |
| Reaction rate |
| MEA concentration |
| Carbon dioxide concentration |
| Temperature change within a stage |
| Hold up of liquid phase |
| Hold up of gas phase |
| absorption heat |
| Heat capacity of solvent (kJ/kg·°C) |
| Heat capacity of gas (kJ/kg·°C) |
| Temperature of liquid phase (°C) |
| Temperature of gas phase (°C) |
| Liquid volume |
| Gas volume |
| i | Number of stages i = {1,2,3,4,5} |
References
- Koyama, K. 2019 Global Energy Situation Indicated by BP Statistics. 2020. Available online: https://scholar.google.com/scholar?hl=zh-CN&as_sdt=0%2C5&q=K.+Koyama%2C+2019+global+energy+situation+indicated+by+bp+statistics+%282020%29.&btnG= (accessed on 30 January 2026).
- Fu, S.; Zou, J.; Zhang, X.; Qi, Y. Review on the latest conclusions of working group III contribution to the fifth assessment report of the intergovernmental panel on climate change. Chin. J. Urban Environ. Stud. 2015, 3, 1550005.
- Newell, R.; Raimi, D.; Villanueva, S.; Prest, B. Global energy outlook 2021: Pathways from Paris. Resour. Future 2021, 8, 39.
- Heuberger, C.F.; Staffell, I.; Shah, N.; Mac Dowell, N. Quantifying the value of CCS for the future electricity system. Energy Environ. Sci. 2016, 9, 2497–2510.
- Luis, P. Use of monoethanolamine (MEA) for CO2 capture in a global scenario: Consequences and alternatives. Desalination 2016, 380, 93–99.
- Kittel, J.; Idem, R.; Gelowitz, D.; Tontiwachwuthikul, P.; Parrain, G.; Bonneau, A. Corrosion in MEA units for CO2 capture: Pilot plant studies. Energy Procedia 2009, 1, 791–797.
- Liu, T.; Tian, Z.; Chen, S.; Wang, K.; Harris, C.J. Deep Cascade Gradient RBF Networks With Output-Relevant Feature Extraction and Adaptation for Nonlinear and Nonstationary Processes. IEEE Trans. Cybern. 2022, 53, 4908–4922.
- Zhang, Y.; Chen, H.; Chen, C.C.; Plaza, J.M.; Dugas, R.; Rochelle, G.T. Rate-based process modeling study of CO2 capture with aqueous monoethanolamine solution. Ind. Eng. Chem. Res. 2009, 48, 9233–9246.
- Mahapatra, P.; Ma, J.; Ng, B.; Bhattacharyya, D.; Zitney, S.E.; Miller, D.C. Integrated dynamic modeling and advanced process control of carbon capture systems. Energy Procedia 2014, 63, 1354–1367.
- Zhang, Y.; Chen, C.C. Modeling CO2 absorption and desorption by aqueous monoethanolamine solution with Aspen rate-based model. Energy Procedia 2013, 37, 1584–1596.
- Wang, Q.; Zheng, C.; Wu, X.; Wang, M. Robust monitoring of solvent based carbon capture process using deep learning network based moving horizon estimation. Fuel 2022, 321, 124071.
- Hanzelik, P.P.; Kummer, A.; Ipkovich, Á.; Abonyi, J. Fusion and integrated correction of chemometrics and machine learning models based on data reconciliation. Comput. Aided Chem. Eng. 2023, 52, 1379–1384.
- Morgan, J.C.; Chinen, A.S.; Omell, B.; Bhattacharyya, D.; Tong, C.; Miller, D.C. Thermodynamic modeling and uncertainty quantification of CO2-loaded aqueous MEA solutions. Chem. Eng. Sci. 2017, 168, 309–324.
- Putta, K.R.; Svendsen, H.F.; Knuutila, H.K. CO2 absorption into loaded aqueous MEA solutions: Impact of different model parameter correlations and thermodynamic models on the absorption rate model predictions. Chem. Eng. J. 2017, 327, 868–880.
- Xing, L.; Jiang, H.; Tian, X.; Yin, H.; Shi, W.; Yu, E.; Pinfield, V.J.; Xuan, J. Combining machine learning with multi-physics modelling for multi-objective optimisation and techno-economic analysis of electrochemical CO2 reduction process. Carbon Capture Sci. Technol. 2023, 9, 100138.
- Di Caprio, U.; Wu, M.; Vermeire, F.; Van Gerven, T.; Hellinckx, P.; Waldherr, S.; Kayahan, E.; Leblebici, M.E. Predicting overall mass transfer coefficients of CO2 capture into monoethanolamine in spray columns with hybrid machine learning. J. CO2 Util. 2023, 70, 102452.
- Tian, Z.; Gu, Y.; Bolat, P.; Zhang, Y.; Gao, W. Prediction and multi-objective optimization of pilot-scale carbon capture system based on multi-source monitoring information and novel data-driven model. Energy Convers. Manag. 2026, 350, 120937.
- Jiang, Y.; Mao, Z. A novel carbon emission monitoring method for power generation enterprises based on hybrid transformer model. Sci. Rep. 2025, 15, 2598.
- Nikoofard, A.; Johansen, T.A.; Molaei, A. Reservoir characterization in under-balanced drilling with nonlinear moving horizon estimation with manual and automatic control conditions. J. Pet. Sci. Eng. 2020, 192, 107248.
- Liu, S.; Yin, X.; Liu, J. State estimation of a carbon capture process through POD model reduction and neural network approximation. arXiv 2023, arXiv:2304.05514.
- Zhang, W.; Wang, Z.; Zou, C.; Drugge, L.; Nybacka, M. Advanced vehicle state monitoring: Evaluating moving horizon estimators and unscented Kalman filter. IEEE Trans. Veh. Technol. 2019, 68, 5430–5442.
- Andersson, L.E.; Scibilia, F.; Imsland, L. An estimation-forecast set-up for iceberg drift prediction. Cold Reg. Sci. Technol. 2016, 131, 88–107.
- Wang, C.; Zhao, W.; Luan, Z.; Gao, Q.; Deng, K. Decoupling control of vehicle chassis system based on neural network inverse system. Mech. Syst. Signal Process. 2018, 106, 176–197.
- Yuan, X.; Huang, B.; Wang, Y.; Yang, C.; Gui, W. Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE. IEEE Trans. Ind. Inform. 2018, 14, 3235–3243.
- Vincent, P.; Larochelle, H.; Bengio, Y.; Manzagol, P.A. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 1096–1103.
- Chen, S.; Jiang, Q. Distributed Robust Process Monitoring Based on Optimized Denoising Autoencoder With Reinforcement Learning. IEEE Trans. Instrum. Meas. 2022, 71, 3503411.
- Erhan, D.; Courville, A.; Bengio, Y.; Vincent, P. Why does unsupervised pre-training help deep learning? In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; pp. 201–208.
- Liu, P.; Zheng, P.; Chen, Z. Deep learning with stacked denoising auto-encoder for short-term electric load forecasting. Energies 2019, 12, 2445.
- Rai, A.; Shrivastava, A.; Jana, K.C. A robust auto encoder-gated recurrent unit (AE-GRU) based deep learning approach for short term solar power forecasting. Optik 2022, 252, 168515.
- Roh, S.; Jun, M.; Szunyogh, I.; Genton, M.G. Multivariate localization methods for ensemble Kalman filtering. Nonlinear Process. Geophys. 2015, 22, 723–735.
- Stanley, Z.; Grooms, I.; Kleiber, W. Multivariate localization functions for strongly coupled data assimilation in the bivariate Lorenz’96 system. Nonlinear Process. Geophys. Discuss. 2021, 28, 565–583.
- Zhuang, Y.; Liu, Y.; Ahmed, A.; Zhong, Z.; del Rio Chanona, E.A.; Hale, C.P.; Mercangöz, M. A hybrid data-driven and mechanistic model soft sensor for estimating CO2 concentrations for a carbon capture pilot plant. Comput. Ind. 2022, 143, 103747.
- Liu, J. Moving horizon state estimation for nonlinear systems with bounded uncertainties. Chem. Eng. Sci. 2013, 93, 376–386.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Volume 1.
- Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.A.; Bottou, L. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 2010, 11, 3371–3408.
- Alex, J.; Benedetti, L.; Copp, J.; Gernaey, K.; Jeppsson, U.; Nopens, I.; Pons, M.N.; Rieger, L.; Rosen, C.; Steyer, J.; et al. Benchmark Simulation Model No. 1 (BSM1); IWA Publishing: London, UK, 2008.
- Cho, K.; Van Merriënboer, B.; Bahdanau, D.; Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. arXiv 2014, arXiv:1409.1259.
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: https://www.tensorflow.org/ (accessed on 30 January 2026).
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. arXiv 2019, arXiv:1907.10902.
- Paluszek, M.; Thomas, S. MATLAB machine learning toolboxes. In Practical MATLAB Deep Learning: A Project-Based Approach; Apress: New York, NY, USA, 2020; pp. 25–41.
- Cheng, S.; Lucor, D.; Argaud, J.P. Observation data compression for variational assimilation of dynamical systems. J. Comput. Sci. 2021, 53, 101405.
- Andersson, J.A.E.; Gillis, J.; Horn, G.; Rawlings, J.B.; Diehl, M. CasADi—A software framework for nonlinear optimization and optimal control. Math. Program. Comput. 2019, 11, 1–36.
- Divine, G.; Norton, H.J.; Hunt, R.; Dienemann, J. A review of analysis and sample size calculation considerations for Wilcoxon tests. Anesth. Analg. 2013, 117, 699–710.
Figure 1.
Simplified flowsheet of the post-combustion carbon capture (PCC) pilot plant, including the absorption–regeneration loop and the gas sampling points along the absorption column.
Figure 2.
The overall structure (left) and the top stage of the PCC pilot plant (right).
Figure 3.
Images of the CO2 analyzer sampling system, highlighting the sampling locations and the measurement setup used for CO2 concentration acquisition.
Figure 4.
Illustration of the datasets covering different operating regions represented over gas-liquid volumetric ratio and inlet concentration.
Figure 5.
A snapshot of the CO2 concentration measurement record, illustrating the intermittent and discontinuous nature of the analyzer data and the selected zoomed-in segment.
Figure 6.
A plot of two CO2 analyzer measurement cycles, illustrating the discrete and intermittent nature of the concentration measurements over time.
Figure 7.
The mechanistic model is divided into five stages, which stand for the five packing structures in the absorber column; flue gas flows from the bottom (right side) and MEA solution flows from the top (left side).
Figure 8.
The architecture of the SDAE-GRU model.
Figure 9.
The model structure of a GRU cell.
Figure 10.
Overview of the proposed hybrid methodology for CO2 concentration prediction, illustrating the offline training stage and the online deployment framework.
Figure 11.
Comparison of the mechanistic model prediction and the real measurement of the temperature profiles at sampling points 1–5 of dataset 6.
Figure 12.
Comparison of CO2 concentration profiles at the bottom sampling layer of Dataset 6, including mechanistic model prediction, MHE estimation, and real measurement points.
Figure 13.
Comparison of CO2 concentration profiles at all five packing layers of Dataset 6, including mechanistic model prediction, MHE estimation, and real measurements.
Figure 14.
Comparison of CO2 concentration profiles at the bottom sampling layer of Dataset 7 between the GRU prediction and the MHE estimation.
Figure 15.
Comparison of CO2 concentration profiles of Dataset 7, including mechanistic model prediction, fused estimation results, GRU prediction, and real measurements.
Table 1.
The detailed setup of the SDAE model.
| Parameters | Values |
|---|---|
| Input shape | 90 |
| Neurons in the first layer | 32 |
| Neurons in the second layer | 16 |
| Latent space dimension | 8 |
| Dropout rate | 0.1075 |
| MaxEpochs | 150 |
| Mini batch size | 32 |
| Learning Rate | 0.0006785 |
| Optimization algorithm | Adam |
| Training dataset | Dataset 1–5 |
| Testing dataset | Dataset 6–7 |
| Input features | The measurement data from all sensors |
Table 2.
The detailed setup of the GRU model.
| Parameters | Values |
|---|---|
| Input shape | 10 |
| Sequence length | 20 |
| Number of neurons | 44 |
| Dropout rate | 0.1075 |
| Epochs | 150 |
| Mini batch size | 32 |
| Learning rate | 0.0006785 |
| Optimization algorithm | Adam |
| Training dataset | Dataset 1–5 |
| Testing dataset | Dataset 6–7 |
| Input features | The compressed features from SDAE |
| Output features | CO2 concentration at sampling points 1–5 |
Table 3.
The model’s performance.
| Model | Dataset | MAPE | MAE | RMSE |
|---|---|---|---|---|
| Mechanistic model | all | 6.79% | 0.0041 | 0.0043 |
| GRU model | all | 7.86% | 0.005 | 0.0054 |
| Hybrid model | all | 3.79% | 0.00224 | 0.003 |
| Mechanistic model | 6 | 6.64% | 0.0039 | 0.0043 |
| GRU model | 6 | 8.52% | 0.0057 | 0.0062 |
| Hybrid model | 6 | 4.3% | 0.00267 | 0.0032 |
| Mechanistic model | 7 | 6.94% | 0.0046 | 0.0052 |
| GRU model | 7 | 7.2% | 0.0047 | 0.0052 |
| Hybrid model | 7 | 3.28% | 0.00181 | 0.0021 |