Article
Modeling for Data Efficiency: System Identification as a Precursor to Reinforcement Learning for Nonlinear Systems
Nusrat Farheen, Golam Gause Jaman and Marco P. Schoen
Safe and sample-conscious controller synthesis for nonlinear dynamics benefits from reinforcement learning that exploits an explicit plant model. A nonlinear mass–spring–damper with hardening effects and hard stops is studied, and off-plant Q-learning is enabled using two data-driven surrogates: (i) a piecewise linear model assembled from operating-region transfer-function estimates and blended by triangular membership functions, and (ii) a global nonlinear autoregressive model with exogenous input (NLARX) constructed from past inputs and outputs. In unit-step reference tracking on the true plant, the piecewise linear route yields lower error and reduced steady-state bias (MAE = ; SSE = ) compared with the NLARX route (MAE = ; SSE = ) in the reported configuration. The improved regulation comes at a higher identification cost (60,000 samples versus 12,000 samples), reflecting a fidelity–knowledge–data trade-off between localized linearization and global nonlinear regression. All reported performance metrics correspond to deterministic validation runs with fixed surrogate models and trained policies and are intended to support methodological comparison rather than statistical performance characterization. These results indicate that model-based Q-learning with identified surrogates enables off-plant policy training while containing experimental risk, and that performance depends on modeling choices, state discretization, and reward shaping.
30 January 2026
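
As a hedged illustration of the workflow summarized above (local linear models blended by triangular memberships, then tabular Q-learning trained entirely on the identified surrogate), the Python sketch below uses invented second-order coefficients for a "soft" and a "hardened" operating region, a two-region blend keyed to position, and a coarse (error, error-increment) discretization. None of the matrices, bin edges, reward, or hyperparameters are taken from the paper; they are placeholders chosen only to make the loop runnable.

```python
import numpy as np

# Two illustrative local discrete-time models (soft and hardened spring
# regions); the matrices are placeholders, not the paper's identified values.
A_SOFT = np.array([[1.0, 0.01], [-0.5, 0.95]])
A_STIFF = np.array([[1.0, 0.01], [-1.5, 0.90]])
B_IN = np.array([0.0, 0.5])          # shared input vector for both regions


def surrogate_step(x, u, y_max=1.0):
    """Blend the two local models with triangular memberships of position."""
    m_stiff = min(abs(x[0]) / y_max, 1.0)   # weight of the hardened region
    m_soft = 1.0 - m_stiff                  # weights sum to one
    A = m_soft * A_SOFT + m_stiff * A_STIFF
    return A @ x + B_IN * u


def discretize(e, de, bins_e, bins_de):
    """Map tracking error and its increment onto a single Q-table row index."""
    return np.digitize(e, bins_e) * (len(bins_de) + 1) + np.digitize(de, bins_de)


def train_q(episodes=300, steps=400, alpha=0.1, gamma=0.98, eps=0.1, ref=1.0):
    actions = np.linspace(-4.0, 4.0, 9)      # candidate input levels
    bins_e = np.linspace(-1.0, 1.0, 11)
    bins_de = np.linspace(-0.1, 0.1, 11)
    Q = np.zeros(((len(bins_e) + 1) * (len(bins_de) + 1), len(actions)))
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        x, e_prev = np.zeros(2), ref
        s = discretize(ref - x[0], 0.0, bins_e, bins_de)
        for _ in range(steps):
            a = rng.integers(len(actions)) if rng.random() < eps else int(Q[s].argmax())
            x = surrogate_step(x, actions[a])    # roll the surrogate, not the plant
            e = ref - x[0]
            s_next = discretize(e, e - e_prev, bins_e, bins_de)
            r = -e * e                           # quadratic tracking penalty
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s, e_prev = s_next, e
    return Q, actions, bins_e, bins_de


if __name__ == "__main__":
    Q, actions, bins_e, bins_de = train_q()
    # Greedy rollout of the learned policy, again on the surrogate only.
    x, e_prev, errs = np.zeros(2), 1.0, []
    s = discretize(1.0, 0.0, bins_e, bins_de)
    for _ in range(400):
        x = surrogate_step(x, actions[int(Q[s].argmax())])
        e = 1.0 - x[0]
        s, e_prev = discretize(e, e - e_prev, bins_e, bins_de), e
        errs.append(abs(e))
    print(f"MAE over rollout: {np.mean(errs):.3f}")
```

The same training loop would accept the NLARX surrogate by replacing surrogate_step with a regression over lagged inputs and outputs; in either case the policy never touches the true plant during training, which is the experimental-risk containment referred to in the abstract.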








