This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

A Physics-Informed Reinforcement Learning Framework for HVAC Optimization: Thermodynamically-Constrained Deep Deterministic Policy Gradients with Simulation-Based Validation

1 Faculty of Civil and Industrial Engineering, Sapienza University of Rome, 00184 Rome, Italy
2 Nuclear Department, ENEA, 40121 Bologna, Italy
* Authors to whom correspondence should be addressed.
Energies 2025, 18(23), 6310; https://doi.org/10.3390/en18236310 (registering DOI)
Submission received: 31 October 2025 / Revised: 16 November 2025 / Accepted: 25 November 2025 / Published: 30 November 2025
(This article belongs to the Special Issue New Insights into Hybrid Renewable Energy Systems in Buildings)

Abstract

This paper presents a physics-informed reinforcement learning framework that embeds thermodynamic constraints directly into the policy network of a continuous-control agent for HVAC optimization. We introduce a Thermodynamically-Constrained Deep Deterministic Policy Gradient (TC-DDPG) algorithm that operates on continuous actions and enforces physical feasibility through a differentiable constraint layer coupled with physics-regularized loss functions. In a simulation-based evaluation using a custom Python multi-zone resistance-capacitance (RC) thermal model, the proposed method reduces annual HVAC electricity consumption by 34.7% relative to a rule-based baseline (95% CI: 31.2–38.1%, n = 50 runs) and outperforms standard DDPG by 16.1 percentage points. During occupied hours, the predicted mean vote (PMV) stays within [−0.5, 0.5] for 98.3% of operational time, peak demand decreases by 35.8%, and the simulated coefficient of performance (COP) improves from 2.87 ± 0.08 to 4.12 ± 0.10. Physics constraint violations are reduced by approximately 98.6% compared to unconstrained DDPG, which demonstrates the effectiveness of architectural enforcement mechanisms within the simulation environment. We present a reference prototype and commit to a future public release of the code, configurations, and hyperparameters sufficient to reproduce the reported results. The paper explicitly addresses the limitations of simulation-based studies and presents a staged roadmap toward hardware-in-the-loop testing and pilot deployments in real buildings.
Keywords: physics-informed reinforcement learning; TC-DDPG; continuous control; HVAC optimization; thermodynamic constraints; building energy management; simulation validation
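To make the ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a single-zone 1R1C model in place of the paper's multi-zone RC model, and uses a tanh squashing function as a simple stand-in for the differentiable constraint layer, whose actual form is not given in the abstract. All function names and parameter values (`rc_step`, `constrain_action`, `electric_power`, the resistance, capacitance, and power limits) are hypothetical.

```python
import math


def rc_step(t_zone, t_out, q_hvac, r=0.005, c=2.0e7, dt=900.0):
    """Advance a 1R1C zone model by one forward-Euler step.

    Energy balance: C * dT/dt = (T_out - T_zone) / R + Q_hvac
    Temperatures in degC, q_hvac in W (negative means cooling),
    r in K/W, c in J/K, dt in s. Default values are hypothetical.
    """
    return t_zone + dt / c * ((t_out - t_zone) / r + q_hvac)


def constrain_action(raw, q_min=-5000.0, q_max=5000.0):
    """Map an unbounded policy output to a feasible thermal power.

    tanh is smooth, so gradients can flow through this mapping; it is
    an assumed stand-in for the paper's constraint layer.
    """
    mid = 0.5 * (q_max + q_min)
    half = 0.5 * (q_max - q_min)
    return mid + half * math.tanh(raw)


def electric_power(q_hvac, cop):
    """Electrical power (W) needed to deliver thermal power q_hvac at a given COP."""
    return abs(q_hvac) / cop


if __name__ == "__main__":
    t_zone = 26.0
    for _ in range(4):  # one simulated hour in 15-minute steps
        q = constrain_action(-0.8)  # fixed raw action, for illustration only
        t_zone = rc_step(t_zone, t_out=34.0, q_hvac=q)
        print(f"T_zone={t_zone:.2f} degC, P_el={electric_power(q, 4.12):.0f} W")
```

Because the squashing function bounds the action by construction, no raw policy output can request a thermal power outside the assumed equipment limits, which is the general idea behind enforcing feasibility in the architecture rather than only through a penalty term.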

Share and Cite

MDPI and ACS Style

Hedayat, S.; Ziarati, T.; Manganelli, M. A Physics-Informed Reinforcement Learning Framework for HVAC Optimization: Thermodynamically-Constrained Deep Deterministic Policy Gradients with Simulation-Based Validation. Energies 2025, 18, 6310. https://doi.org/10.3390/en18236310

