Article

Suppressing High-Frequency Action Noise in DRL-Based Process Control: A Dual Strategy for Thermal Regeneration Column

State Key Laboratory of Materials-Oriented Chemical Engineering, College of Chemical Engineering, Jiangsu National Synergetic Innovation Center for Advanced Materials, Jiangsu Collaborative Innovation Center for Advanced Inorganic Function Composites, Nanjing Tech University, Nanjing 211816, China
Processes 2026, 14(10), 1598; https://doi.org/10.3390/pr14101598
Submission received: 17 April 2026 / Revised: 8 May 2026 / Accepted: 12 May 2026 / Published: 14 May 2026

Abstract

Stochastic-policy reinforcement learning (RL) algorithms are widely used in industrial control because of their strong exploration ability and high sample efficiency. However, they often produce large action fluctuations and high-frequency noise, which makes them unsuitable for steady-state chemical processes. To address this problem, this study takes a thermal regeneration column (TRC) as the research object and adopts the Soft Actor-Critic (SAC) algorithm as the baseline. Three strategies are introduced to improve SAC: an action-amplitude-constrained reward function, a low-pass filter, and a Kalman filter. Experimental results show that combining the action-amplitude-constrained reward function with the Kalman filter achieves the best performance. Compared with the standard SAC algorithm, the fluctuation amplitudes of steam consumption, cooling-water consumption, sulfur concentration, and methanol makeup rate are reduced by 85.50%, 82.81%, 90.84%, and 85.49%, respectively, and the fluctuation amplitude of the reward function decreases by 90.68%. The method not only optimizes operating costs but also ensures stable operation of the TRC.
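The two components reported to work best together, an action-amplitude penalty in the reward and Kalman filtering of the policy output, can be sketched as follows. The abstract does not give the authors' exact formulation, so the penalty weight, the noise variances, and the constant-state Kalman model below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def shaped_reward(base_reward, action, prev_action, penalty_weight=0.1):
    """Action-amplitude-constrained reward: subtract a penalty proportional
    to the step-to-step change in the action vector (weight is assumed)."""
    return base_reward - penalty_weight * float(np.sum(np.abs(action - prev_action)))

class ScalarKalmanFilter:
    """1-D Kalman filter that smooths a noisy action signal, assuming a
    constant-state process model (x_k = x_{k-1} + process noise)."""
    def __init__(self, process_var=1e-4, measurement_var=1e-2):
        self.q = process_var       # process noise variance (assumed)
        self.r = measurement_var   # measurement noise variance (assumed)
        self.x = None              # current state estimate
        self.p = 1.0               # current estimate variance

    def step(self, z):
        if self.x is None:         # initialize on the first observation
            self.x = z
            return self.x
        self.p += self.q                  # predict: uncertainty grows
        k = self.p / (self.p + self.r)    # Kalman gain
        self.x += k * (z - self.x)        # update toward the measurement z
        self.p *= (1.0 - k)               # shrink the estimate variance
        return self.x
```

In a control loop, the raw SAC action would pass through `ScalarKalmanFilter.step` (one filter per manipulated variable) before being applied to the plant, while `shaped_reward` discourages the policy from producing large action jumps in the first place.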
Keywords: deep reinforcement learning; Soft Actor-Critic; TRC; Kalman filter; action constraint; stable control

Share and Cite

MDPI and ACS Style

Si, S.; Pan, J.; Wan, H.; Guan, G. Suppressing High-Frequency Action Noise in DRL-Based Process Control: A Dual Strategy for Thermal Regeneration Column. Processes 2026, 14, 1598. https://doi.org/10.3390/pr14101598


