Variable Structure Learning-Based Spatio-Temporal Graph Convolutional Networks for Chemical Process Quality Prediction with SHAP-Enhanced Interpretability
Abstract
1. Introduction
2. Preliminaries
2.1. Graph Convolutional Neural Networks (GCNs)
2.2. Graph Attention Mechanism
3. Methodology
3.1. Spatio-Temporal GCN with Variable Structure Learning (VSL)
3.2. Variable Importance Based on SHAP Algorithm
4. Soft Sensing of the Quality Variable Based on the VSL-STGCN
4.1. VSL-STGCN Model Structure
- (1) Process Variable Selection Module
- (2) Embedding Mapping Layer
- (3) Variable Graph Structure Learning Module
- (4) Graph Attention Network
- (5) Residual Connection
- (6) STGCN Module (a minimal PyTorch sketch composing these modules follows this list)
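Since the modules above are only listed here, a minimal PyTorch sketch of how they could compose is given below. All names (`VSLSTGCN`, `emb_dim`, the use of `nn.MultiheadAttention` as a stand-in for the paper's graph attention layer, and an `einsum` propagation as a stand-in for the STGCN's spatial graph convolution) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VSLSTGCN(nn.Module):
    """Hedged sketch of the VSL-STGCN forward pass; names are assumptions."""

    def __init__(self, n_nodes, win_len, emb_dim=16, node_dim=8, heads=8):
        super().__init__()
        # (3) Variable graph structure learning: two learnable node-embedding
        # matrices whose product defines a data-driven adjacency matrix.
        self.E1 = nn.Parameter(torch.randn(n_nodes, node_dim))
        self.E2 = nn.Parameter(torch.randn(n_nodes, node_dim))
        # (2) Embedding mapping layer: lift each variable's time window.
        self.embed = nn.Linear(win_len, emb_dim)
        # (4) Multi-head attention as a stand-in for the graph attention layer.
        self.gat = nn.MultiheadAttention(emb_dim, heads, batch_first=True)
        # (6) STGCN module, reduced here to a temporal 1-D convolution + pool.
        self.temporal = nn.Sequential(
            nn.Conv1d(n_nodes, n_nodes, kernel_size=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.head = nn.Linear(n_nodes, 1)

    def learned_adjacency(self):
        # A = softmax(ReLU(E1 @ E2^T)): a common adaptive-adjacency recipe;
        # the paper's exact formula may differ.
        return F.softmax(F.relu(self.E1 @ self.E2.T), dim=-1)

    def forward(self, x):                        # x: (batch, n_nodes, win_len)
        a = self.learned_adjacency()             # (3) learned graph structure
        h = self.embed(x)                        # (2) (batch, n_nodes, emb_dim)
        g, _ = self.gat(h, h, h)                 # (4) attention across variables
        h = h + g                                # (5) residual connection
        h = torch.einsum("ij,bjd->bid", a, h)    # (6) spatial graph convolution
        z = self.temporal(h).squeeze(-1)         # (6) temporal conv + pooling
        return self.head(z).squeeze(-1)          # scalar quality prediction
```

The learned adjacency follows the widely used adaptive-adjacency recipe $A = \operatorname{softmax}(\operatorname{ReLU}(E_1 E_2^{\top}))$, consistent with the two node-embedding matrices that Algorithm 1 initializes.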
4.2. End-to-End Gradient-Descent Training Procedure
- (1) Initialization
- (2) Forward Propagation
- (3) Loss Calculation
- (4) Backpropagation
- (5) Parameter Update
- (6) Iteration (a runnable sketch of this loop is given after Algorithm 1 below)
**Algorithm 1: Training Process of the VSL-STGCN Model**

Input: Process data $X = \{x_i\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^{m}$; quality variable data $y \in \mathbb{R}^{N}$; number of epochs $E$; learning rate $\eta$.
Output: A trained VSL-STGCN model with optimized parameters $\Theta$.

/* 1. Variable Selection (Offline) */
1: Calculate feature importance for all variables in $X$ using the SHAP algorithm.
2: Select the top-$k$ most important variables to form the input data $X_s$.
/* 2. Model Initialization */
3: Initialize model parameters $\Theta$, including node-embedding matrices $E_1$ and $E_2$ and the weights of the GNN modules.
/* 3. End-to-End Training */
4: for epoch = 1 to $E$ do
/* Forward Propagation */
5: // Variable Graph Structure Learning
6: $M \leftarrow \operatorname{ReLU}(E_1 E_2^{\top})$
7: $A \leftarrow \operatorname{softmax}(M)$
8: // Spatio-Temporal Feature Extraction
9: $H \leftarrow \operatorname{Embed}(X_s)$
10: $H_{\mathrm{att}} \leftarrow \operatorname{GAT}(H, A)$
11: $H_{\mathrm{res}} \leftarrow H + H_{\mathrm{att}}$
12: $\hat{y} \leftarrow \operatorname{STGCN}(H_{\mathrm{res}}, A)$
/* Loss Calculation */
13: $\mathcal{L} \leftarrow \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$
/* Backpropagation and Parameter Update */
14: Compute gradients $\nabla_{\Theta}\mathcal{L}$.
15: Update all trainable parameters (including $E_1$, $E_2$) using gradient descent:
16: $\Theta \leftarrow \Theta - \eta \nabla_{\Theta}\mathcal{L}$
17: end for
18: return the trained model with parameters $\Theta$.
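To make Algorithm 1 concrete, the sketch below implements the same loop in PyTorch against the `VSLSTGCN` sketch above. The offline SHAP ranking (lines 1-2) is shown with a gradient-boosting surrogate and `shap.TreeExplainer`; the paper only says "SHAP algorithm", so the choice of surrogate and explainer here is an assumption.

```python
import numpy as np
import shap
import torch
from sklearn.ensemble import GradientBoostingRegressor

def select_top_k(X, y, k):
    """Algorithm 1, lines 1-2: rank variables by mean |SHAP value|, keep top-k.
    The tree surrogate + TreeExplainer combination is an assumption."""
    surrogate = GradientBoostingRegressor().fit(X, y)
    shap_values = shap.TreeExplainer(surrogate).shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)   # per-variable importance
    return np.argsort(importance)[::-1][:k]         # indices of top-k variables

def train(model, loader, epochs=200, lr=1e-3):
    """Algorithm 1, lines 3-17: end-to-end gradient descent
    (Adam, 200 epochs, lr = 0.001, per the case-study tables)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for epoch in range(epochs):
        for xb, yb in loader:              # xb: (batch, n_nodes, win_len)
            pred = model(xb)               # forward pass: lines 5-12
            loss = loss_fn(pred, yb)       # line 13: MSE loss
            opt.zero_grad()
            loss.backward()                # line 14: gradients incl. E1, E2
            opt.step()                     # lines 15-16: parameter update
    return model                           # line 18
```

A typical call would be `keep = select_top_k(X, y, k)` followed by `train(VSLSTGCN(n_nodes=len(keep), win_len=W), loader)`, mirroring the offline/online split in Algorithm 1 (`W` and `loader` are assumed to be the window length and a standard `DataLoader`).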
5. Case Studies
5.1. The High-Low Transformer Unit Case
5.2. The Pre-Decarburization Unit Case
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Table: Process variables of the high-low transformer unit case (Section 5.1).

| Tags | Descriptions | Tags | Descriptions |
|---|---|---|---|
| U1 | Flowrate to HTT | U14 | Temp. of HTT down level |
| U2 | Content of Ar to HTT | U15 | Pressure at the exit of LTT |
| U3 | Content of CO to HTT | U16 | Exit process gas temp. of HTT |
| U4 | Content of CH4 to HTT | U17 | Temp. of BFW at E2 |
| U5 | Content of H2 to HTT | U18 | Exit process gas temp. of E2 |
| U6 | Flowrate to LTT | U19 | Temp. of LTT up level |
| U7 | Content of Ar to LTT | U20 | Temp. of LTT middle level |
| U8 | Content of CO2 to LTT | U21 | Temp. of LTT down level |
| U9 | Content of CH4 to LTT | U22 | Level of E3 |
| U10 | Content of H2 to LTT | U23 | Pressure of process gas of exit |
| U11 | Content of N2 to LTT | U24 | Exit process gas temp. of LTT |
| U12 | Temp. of HTT up level | U25 | Temp. of recycled N2 at condenser |
| U13 | Temp. of HTT middle level | U26 | Entrance process gas temp. of LTT |
Table: Hyper-parameter settings for the high-low transformer case.

| Hyper-Parameters | Value | Hyper-Parameters | Value |
|---|---|---|---|
| Graph-Embedding Dim | 16 | Channels | 3, 3 |
| Node-Embedding Dim | 8 | Convolution Kernels | 4, 4 |
| Heads | 8 | Convolution Stride | 1, 1 |
| Nodes | 26 | Pooling Kernels | 2, 2 |
| Learning Rate | 0.001 | Epoch Num. | 200 |
| Batch Size | 256 | Optimizer | Adam |
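For reproducibility, the settings in the table above map directly onto a configuration object; a hypothetical example is sketched below (the key names are illustrative, and the pre-decarburization case in Section 5.2 would substitute the values from its own table).

```python
# Hyper-parameters for the high-low transformer case, copied from the table
# above; the dictionary keys themselves are illustrative assumptions.
HTT_CONFIG = {
    "graph_embedding_dim": 16,
    "node_embedding_dim": 8,
    "attention_heads": 8,
    "n_nodes": 26,
    "learning_rate": 1e-3,
    "batch_size": 256,
    "epochs": 200,
    "optimizer": "Adam",
    "conv_channels": (3, 3),
    "conv_kernels": (4, 4),
    "conv_stride": (1, 1),
    "pooling_kernels": (2, 2),
}
```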
Table: Prediction performance on the high-low transformer unit case (± values are standard deviations).

| Index \ Model | VSL-STGCNv1 | VSL-STGCNv2 | VSL-STGCNv3 | STGCN | GCN | LSTM |
|---|---|---|---|---|---|---|
| Num. of Params. | 15,638 | 12,445 | 9,114 | 7,091 | 1,253 | 745 |
| RMSE | 0.00381 (±0.0006) | 0.00345 (±0.0004) | 0.00317 (±0.0002) | 0.00392 (±0.0012) | 0.00425 (±0.0016) | 0.00484 (±0.0010) |
| MAE | 0.00317 (±0.0011) | 0.00298 (±0.0007) | 0.00279 (±0.0004) | 0.00326 (±0.0014) | 0.00375 (±0.0013) | 0.00413 (±0.0013) |
| R² | 0.6453 (±0.0210) | 0.6972 (±0.0171) | 0.7740 (±0.0123) | 0.6277 (±0.0211) | 0.5982 (±0.0118) | 0.5250 (±0.0125) |
| Training Time (s) | 646 (±12) | 559 (±10) | 475 (±10) | 373 (±6) | 216 (±5) | 94 (±5) |
Table: Process variables of the pre-decarburization unit case (Section 5.2).

| Tags | Descriptions | Tags | Descriptions |
|---|---|---|---|
| U1 | Flow-rate of Feed Natural Gas | U11 | Temperature of Process Gas at Absorption Column |
| U2 | Level of Feed Gas Separator | U12 | Level #1 of Absorption Column |
| U3 | Pressure Difference of Feed Gas Separator | U13 | Pressure of Process Gas to Absorption Column |
| U4 | Pressure of Feed NG | U14 | Level #2 of Absorption Column |
| U5 | Temperature of Feed NG | U15 | Temperature in the Middle of Absorption Column |
| U6 | Level of Process Gas Separator | U16 | Level #3 of Absorption Column |
| U7 | Pressure Difference of Absorption Column | U17 | Pressure of Process Gas at the Top of Absorption Column |
| U8 | Pressure of Feed Gas Separator | U18 | Temperature of Amine Liquor to Absorption Column |
| U9 | Temperature of Process Gas Separator | U19 | Temperature of Process Gas at the Top of Absorption Column |
| U10 | Pressure of Process Gas Separator | U20 | Level of Regeneration Column |
| Y | Content of Residual CO2 in the Process Gas (quality variable) | | |
Table: Hyper-parameter settings for the pre-decarburization case.

| Hyper-Parameters | Value | Hyper-Parameters | Value |
|---|---|---|---|
| Graph-Embedding Dim | 10 | Channels | 3, 3 |
| Node-Embedding Dim | 6 | Convolution Kernels | 3, 3 |
| Heads | 4 | Convolution Stride | 1, 1 |
| Nodes | 20 | Pooling Kernels | 2, 2 |
| Learning Rate | 0.001 | Epoch Num. | 200 |
| Batch Size | 256 | Optimizer | Adam |
Table: Prediction performance on the pre-decarburization unit case (± values are standard deviations).

| Index \ Model | VSL-STGCNv1 | VSL-STGCNv2 | VSL-STGCNv3 | STGCN | GCN | LSTM |
|---|---|---|---|---|---|---|
| Num. of Params. | 14,442 | 11,507 | 8,423 | 5,898 | 1,135 | 697 |
| RMSE | 0.0261 (±0.0012) | 0.0195 (±0.0010) | 0.0163 (±0.0009) | 0.0305 (±0.0021) | 0.0384 (±0.0026) | 0.0391 (±0.0016) |
| MAE | 0.0202 (±0.0007) | 0.0149 (±0.0006) | 0.0127 (±0.0005) | 0.0256 (±0.0017) | 0.0291 (±0.0022) | 0.0312 (±0.0011) |
| R² | 0.9567 (±0.0025) | 0.9757 (±0.0020) | 0.9831 (±0.0018) | 0.9489 (±0.0028) | 0.9312 (±0.0032) | 0.9265 (±0.0019) |
| Training Time (s) | 636 (±8) | 594 (±6) | 429 (±5) | 336 (±5) | 248 (±3) | 82 (±3) |