1. Introduction
Large facilities, such as offices, schools, hospitals, and university campuses, serve as critical testbeds for the deployment of digitized, flexible, and low-carbon energy systems. These environments are characterized by complex system configurations, intricate spatial layouts, and nonlinear thermo-energy dynamics. Moreover, they exhibit diverse operational patterns and stochastic interactions between occupants and building systems.
The paradigm shift toward intelligent interconnected management reveals the inadequacy of conventional methods, highlighting the urgent need for predictive tools that are scalable to district and urban operations [1,2]. At such scales, ensuring data quality through the coordinated management of heterogeneous sources is paramount. The recent literature [3,4] addresses these challenges via automated methodologies leveraging building archetypes, temporal clustering, and physics-guided structures with data-driven parameter estimation (grey-box modeling). These approaches effectively abstract building heterogeneity into highly representative synthetic models. The integration of such models with urban digital twin (DT) platforms facilitates neighborhood- and urban-level simulation and dynamic management through continuous training, validation, and real-time data assimilation. This evolution marks a departure from building-centric modeling toward a systemic approach capable of supporting complex control and optimization strategies at scale.
The methodological soundness of the resistance–capacitance (RC) approach is evidenced by its incorporation into pivotal reference standards, most notably ISO 13790 [5] and ISO 52016-1 [6]. Both standards explicitly employ RC-based frameworks for the assessment of energy needs and thermal loads. Furthermore, the ASHRAE guidelines [7] acknowledge the efficacy of RC models for dynamic building performance analysis. Interest in this modeling class has been revitalized by its synergy with automated identification methods, advanced predictive control strategies, and artificial intelligence (AI) algorithms [8,9].
Figure 1 illustrates a classical 2R2C schema, detailing both the continuous formulation and the discrete form employed for identification and predictive control. This compact interpretable formalism serves as the foundation for the multi-zone extensions and DT integrations addressed later.
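For concreteness, one common variant of the 2R2C balance couples the indoor air node $T_i$ (capacitance $C_i$) to a lumped thermal-mass node $T_m$ (capacitance $C_m$) through a resistance $R_{im}$, with a second resistance $R_{io}$ to the outdoor temperature $T_o$; node placement and naming differ across studies, so these symbols are illustrative rather than those of Figure 1:

$$C_i \frac{dT_i}{dt} = \frac{T_m - T_i}{R_{im}} + \frac{T_o - T_i}{R_{io}} + \dot{Q}, \qquad C_m \frac{dT_m}{dt} = \frac{T_i - T_m}{R_{im}},$$

where $\dot{Q}$ collects HVAC and internal/solar gains. Writing the state as $x = [T_i, T_m]^\top$ and the input as $u = [T_o, \dot{Q}]^\top$ gives $\dot{x} = Ax + Bu$; zero-order-hold discretization with timestep $\Delta t$ then yields the form used for identification and predictive control, $x_{k+1} = A_d x_k + B_d u_k$, with $A_d = e^{A\Delta t}$ and $B_d = A^{-1}(A_d - I)B$.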
Despite their consolidation in ISO/ASHRAE workflows, open research questions remain: (i) The accuracy–interpretability trade-off: While RC models offer transparency and computational efficiency, deep data-driven models may yield superior accuracy when high-quality data is abundant. (ii) Scalability and generalizability: The validity of archetype- and clustering-based RC models across heterogeneous climates, typologies, and operational profiles is not yet fully proven. (iii) Technology readiness: The operational maturity of DT and edge implementations varies significantly, with the majority of studies limited to the prototyping stage. (iv) Comparative analysis: Benchmarking against data-driven baselines is frequently hindered by disparate data requirements and heterogeneous KPIs, preventing like-for-like comparisons.
Large-scale analysis of building facilities often faces a lack of high-quality data, which traditionally must be collected through on-site surveys, an operation that is not always feasible and often insufficient for the required analysis. In recent years, the use of datasets from Internet of Things (IoT) technologies and digital infrastructures has overcome some of the historical barriers related to data collection, enabling large-scale identification and calibration procedures for RC models [8]. While the literature has extensively demonstrated the strengths of RC models for large-scale energy simulation [3,11], a new research frontier is rapidly emerging: applying these models to advanced control scenarios, energy flexibility, demand response (DR) automation, and DT integration, often using real-world datasets from facilities and districts characterized by high heterogeneity in properties and operating conditions [12,13,14].
High-fidelity physical models, known as "white-box" models (e.g., building energy modeling, BEM), represent the state of the art in simulation reliability. However, especially for large and complex buildings, the difficulty of acquiring detailed data and the presence of unpredictable variables severely limit their large-scale applicability [1,11]. On the opposite side of the spectrum, "black-box" or purely data-driven models often lack interpretability and transparency, creating a barrier for administrators or policymakers who require clear insights into the drivers of energy consumption [5,12]. Therefore, recent applications underscore the main challenge of combining the computational efficiency of lumped-parameter models with the accuracy required for predictive management and control, particularly in real-world contexts where data quality and availability are inconsistent [15].
Previous reviews, such as the study by Serasinghe et al. [16], provide a critical assessment of parameter-identification methodologies for low-order models such as RC models. Their research offers an in-depth evaluation of current methods, explaining both the advantages and limitations of parameter estimation and serving as a fundamental reference in the field. Yang et al. [3] currently represent the main reference for grey-box RC modeling at the urban and district scales, examining different model structures, identification techniques, and urban applications. While this review provides useful information on the pros, cons, and scalability of grey-box RC models, it does not explicitly address how the proposed framework can be integrated into current DT platforms or how IoT data and smart thermostats can support it. Finally, the valuable work by Ma et al. [14] provides an updated overview of the integration between physical models (such as RC) and new machine learning (ML) techniques. Although the paper thoroughly analyzes hybrid approach opportunities, including physics-informed neural networks (PINNs), it focuses more on algorithmic potential than on operational and automated large-scale implementations.
Overall, these valuable works tend to leave crucial aspects partially uncovered, such as standardization, genuine interoperability between DT ecosystems, large-scale automation in urban districts, and the standardization of KPIs based on real operational data. The integration of urban DT platforms with real-time feedback for continuous parameter updates represents a primary research and innovation direction for energy management in large facilities and districts. Combining grey-box modeling with ML techniques enables advanced adaptive control and anomaly detection systems capable of operating across extensive infrastructures with reduced human intervention, improved resilience, and decentralized management. Furthermore, the application of grey-box models at the intersection of building physics and digital markets for energy flexibility warrants further research, as it holds great promise for energy communities, aggregators, and smart districts. These models enable active participation in DR, collective optimization, and energy flexibility services at an urban scale owing to their interpretability and computational compactness.
To clearly distinguish the novel contribution of this work, Table 1 provides a detailed comparison with existing reviews, highlighting the specific gaps addressed regarding interoperability and operational deployment.
This review therefore provides a critical and operational synthesis of recent RC model applications for energy management in large buildings, districts, and urban clusters. The originality of this review lies in four main points:
Comparative analysis and operational benchmarking;
Open-source automation and workflow focus;
Real integration with DT and smart city platforms;
Critical summary and actionable recommendations.
Ultimately, this study offers a clear and practical framework for developers, operators, and users managing building energy infrastructure, providing them with actionable insights for potential implementations and system optimization.
4. Discussion and Future Research Perspectives
This review extends beyond traditional methodological identification to examine the end-to-end operation of RC grey-box models in real-world applications, ranging from data processing to decision-making. The operational pathway depicted in Figure 6 illustrates this entire process, tracing the flow from heterogeneous data streams to KPI feedback that closes the loop for continuous improvement. Specifically, the pipeline consists of six key stages: (1) data collection and integration (BMS/IoT, weather/occupancy, and BIM/IFC/CityGML/Brick); (2) automatic/semi-automatic RC model generation (lumped, distributed, multi-zone, and archetypes) mapped to semantic IDs; (3) automatic calibration and parameter optimization, providing accuracy and uncertainty estimates; (4) predictive control (MPC/DR/AI) deployed on edge or cloud infrastructure; (5) DT integration for real-time simulation and supervision; and (6) aggregated KPIs (energy savings, RMSE/MAE, flexibility, peak shaving, and CO₂), which close the loop by triggering data-quality flags, re-calibration, and control retuning.
This visual anchor serves as a reference point to connect surveyed contributions with specific pipeline stages, effectively demonstrating both mature practices (such as MPC/DR based on RC models) and remaining gaps (such as DT feedback updates, standardized KPI sets, and building-to-district interoperability).
The discussion follows a structured approach to examine three main themes: (i) scalable automated model generation and calibration via archetypes, clustering, and batch/online identification; (ii) the operational robustness of predictive control under uncertainty, addressing data quality, forecasting, and constraint handling; and (iii) DT-based, KPI-driven governance to enable model–data–control co-evolution. This structure aims to clarify which solutions are ready for widespread adoption and which require standardized protocols, tools, and validation across multiple building types and climate zones.
4.1. Critical Analysis: Failure Modes and the Accuracy–Interpretability Trade-Off
While RC models represent a robust compromise between computational efficiency and physical consistency, the literature delineates specific operational boundaries where their efficacy is surpassed by purely data-driven (black-box) or high-fidelity (white-box) paradigms. This performance trade-off is fundamentally dictated by the interplay between data availability and system complexity. The primary limitation of RC models stems from their structural constraints; in data-rich environments, the lumped-parameter simplification may lack the expressiveness required to capture complex nonlinear dynamics, such as anisotropic solar gains or stochastic zonal interactions. For instance, Cui et al. [62] demonstrate that, while RC models offer stability, they produce a generalized response with a total RMSE of 7.44 °C. In contrast, recurrent neural networks (RNNs) capitalize on the extensive dataset to achieve a significantly superior RMSE of 5.59 °C. Similarly, Di Natale et al. [25] quantify this gap, revealing that physically consistent neural networks (PCNNs) reduced MAE by approximately 35% compared to linear grey-box models (1.17 °C vs. 1.79 °C), thereby exposing the inherent inability of standard RC structures to resolve high-order nonlinearities. Conversely, the advantage of black-box models collapses in data-sparse scenarios or when extrapolation beyond the training domain is necessary. Kong et al. [22] identify a critical empirical threshold: with limited training samples (e.g., 7 days), physics-based RC models significantly outperformed data-driven algorithms (e.g., XGBoost), which suffered from poor generalization. The asymptotic superiority of the data-driven approach only emerged once the dataset exceeded 35 days. Furthermore, Ma et al. [14] emphasize that pure black-box models are susceptible to violating physical laws, such as predicting temperature drops during active heating, errors that are precluded by the thermodynamic constraints embedded within RC networks.
However, RC models are not immune to failure, even when the structure is sound. A critical vulnerability is the "data-rich but information-poor" paradox described by Serasinghe et al. [16], where operational data lacks the necessary thermal excitation to uniquely identify resistance (R) and capacitance (C) parameters. To mitigate ill-posed identification, Vallianos et al. [8] establish that a training period of 7 to 14 days is optimal for residential buildings, noting that shorter datasets lead to overfitting while longer ones offer diminishing returns. Additionally, neglecting sensor dynamics can compromise physical plausibility; Yu et al. [33] show that ignoring the thermal inertia of wall-mounted sensors in stochastic models leads to erroneous estimations of heat transfer coefficients.
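The sensor-dynamics issue can be made explicit with a first-order lag, a simplified deterministic stand-in for the stochastic formulation discussed in [33]:

$$\tau_s \frac{dT_s}{dt} = T_z(t) - T_s(t),$$

where the measured signal $T_s$ is a low-pass-filtered version of the true zone temperature $T_z$ and $\tau_s$ is the sensor time constant. Fitting an RC model directly to $T_s$ while ignoring $\tau_s$ attributes the sensor's inertia to the building envelope, biasing the identified heat transfer coefficients.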
Ultimately, the choice of modeling strategy depends on specific operational constraints. While hybrid approaches combining RC structures with ML show promise in reducing prediction errors by up to 34.5% (Kong et al. [22]), the traditional RC model remains the superior choice when training data is scarce (<2 weeks) or when strict physical consistency and interpretability are required for control stability (Di Natale et al. [25]).
To corroborate these findings and quantify the trade-off between predictive fidelity and computational overhead, Table 6 presents a comparative benchmarking of RC models against white-box and black-box paradigms. Empirical evidence underscores a distinct divergence: while RC models offer a dramatic reduction in computational runtime relative to white-box simulations (e.g., ∼80% reduction [1]), they may succumb to higher prediction errors compared to advanced black-box architectures in data-rich environments (e.g., RMSE 7.44 °C vs. 5.59 °C [62]).
4.2. Synthesis of Current Research Limitations
Transcending individual performance metrics, a holistic synthesis of the literature exposes three systemic impediments currently constraining the ubiquitous deployment of RC models.
4.2.1. The “Data-Rich, Information-Poor” Paradox
A pervasive challenge in data-driven modeling is that dataset volume does not equate to parameter identifiability. Serasinghe et al. [16] emphasize that operational data from modern BMS often lacks sufficient spectral excitation, as feedback controllers actively suppress the temperature fluctuations required for identification. In such closed-loop scenarios, RC parameters frequently converge to physically implausible values. To mitigate this ill-posedness, Vallianos et al. [8] empirically establish a narrow optimal calibration window: datasets shorter than 7 days result in significant overfitting, while extending beyond 14 days yields diminishing returns in accuracy. Moreover, Yu et al. [33] demonstrate that overlooking sensor dynamics, specifically the thermal inertia of sensor encapsulation, introduces a systematic bias in stochastic parameter estimation.
4.2.2. Structural and Stochastic Inflexibility
Standard RC models exhibit limited generalization capabilities in environments dominated by nonlinear dynamics or stochastic behavior. Di Natale et al. [25] and Cui et al. [62] demonstrate that grey-box structures fail to capture complex solar gain interactions without significant manual engineering. Moreover, Tugores et al. [29] emphasize that deterministic RC baselines cannot adequately model stochastic variables, such as occupant-driven natural ventilation and metabolic rates, leading to significant prediction errors in real-world educational buildings. While hybrid solutions like PINNs aim to bridge this gap, Chen et al. [59] note that they introduce new challenges regarding the complex balancing of physical loss terms against data-driven loss terms during training.
4.2.3. Fragmentation of Validation Frameworks
Finally, the lack of standardized validation protocols hinders cross-study comparisons. As noted by Shamsi et al. [15], most studies rely on deterministic error metrics (RMSE and MAPE), which fail to capture the probabilistic nature of thermal flexibility. Furthermore, the semantic disconnect between physical models and digital platforms remains a significant barrier; Bjørnskov et al. [57] emphasize that, without unified ontologies (such as Brick or SAREF), mapping model variables to sensors remains a manual bottleneck. At the urban scale, this issue is compounded by the geometric misalignment between static GIS data and dynamic thermal requirements, often necessitating computationally expensive ray-tracing workarounds [30].
4.3. Deep Analysis of ML Integration Paths and Core Challenges
To move beyond a generic overview of ML adoption, it is necessary to dissect the specific technical architectures that currently define the state of the art in hybrid RC modeling. The literature review identifies three distinct integration paths, each with unique advantages and trade-offs.
4.3.1. Technical Integration Paths
Series Integration (Residual Modeling): This approach utilizes the RC model as a physical baseline and employs ML algorithms to predict the error term (residuals). Kong et al. [22] successfully apply this method by using an XGBoost model to compensate for the RC model's inability to capture unmeasured internal heat gains. The primary advantage is that the ML module only needs to learn the nonlinear deviation, reducing the data requirement compared to pure black-box approaches.
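A minimal sketch of this series pattern is given below, assuming a 1R1C baseline and synthetic data; all names and parameter values are illustrative, and scikit-learn's GradientBoostingRegressor stands in for the XGBoost model of [22].

```python
# Series (residual) hybrid: an RC baseline predicts the next-step zone
# temperature; an ML model learns only what the physics misses.
# All names and parameter values are illustrative, not those of [22].
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for XGBoost

def rc_step(T_prev, T_out, q_hvac, R=0.005, C=1.0e7, dt=900.0):
    """One explicit-Euler step of a 1R1C model (R in K/W, C in J/K)."""
    return T_prev + dt / C * ((T_out - T_prev) / R + q_hvac)

# synthetic data with an unmeasured driver (occupancy) the RC model cannot see
rng = np.random.default_rng(0)
n = 500
T_prev = 20 + rng.normal(0, 1, n)
T_out = 5 + 10 * rng.random(n)
q_hvac = 4000 * rng.random(n)
occupancy = rng.integers(0, 2, n)
T_meas = rc_step(T_prev, T_out, q_hvac) + 0.3 * occupancy + rng.normal(0, 0.05, n)

T_rc = rc_step(T_prev, T_out, q_hvac)        # physics-only prediction
residual = T_meas - T_rc                      # deviation left to the ML module
X = np.column_stack([T_prev, T_out, q_hvac, occupancy])
ml = GradientBoostingRegressor().fit(X, residual)

T_hybrid = T_rc + ml.predict(X)               # physics baseline + learned residual
```

Because the ML module sees only the residual, its target variance is far smaller than the raw temperature signal, which is precisely why this pattern tolerates shorter training datasets.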
Soft-Constrained Integration (Standard PINNs): In this architecture, physical laws are embedded directly into the loss function of a neural network as penalty terms (soft constraints). As reviewed by Ma et al. [14], this involves minimizing a composite loss function $\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda\,\mathcal{L}_{\mathrm{physics}}$. Chen et al. [59] demonstrate this path for DR control, where RC equations constrain the search space of the neural network, promoting physical consistency without altering the network structure.
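The composite loss can be written compactly in code; the PyTorch sketch below uses a discretized 1R1C balance as the physics term. The formulation, constants, and weighting are illustrative assumptions, not the specific setups of [14] or [59].

```python
# Soft-constrained (physics-regularized) loss: L = L_data + lambda * L_physics.
# The physics term penalizes predictions that violate a discrete 1R1C balance.
import torch

def composite_loss(model, x, y_meas, lam=0.1, R=0.005, C=1.0e7, dt=900.0):
    # x columns: [T_prev, T_out, q_hvac]; y_meas: measured next-step temperature
    y_pred = model(x).squeeze(-1)
    data_loss = torch.mean((y_pred - y_meas) ** 2)              # fit to measurements
    T_prev, T_out, q_hvac = x[:, 0], x[:, 1], x[:, 2]
    T_phys = T_prev + dt / C * ((T_out - T_prev) / R + q_hvac)  # RC prediction
    physics_loss = torch.mean((y_pred - T_phys) ** 2)           # physical consistency
    return data_loss + lam * physics_loss
```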
Hard-Constrained Integration (Architectural Embedding): A more robust path involves embedding physical laws into the neural network topology itself (hard constraints). Di Natale et al. [25] propose PCNNs, where a parallel physics-inspired module enforces thermodynamic laws (e.g., positive heat transfer) by design. Unlike soft-constrained PINNs, this architecture guarantees physical consistency even if training converges to a sub-optimal local minimum.
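The PCNN architecture of [25] adds a parallel physics-inspired module to the neural path; the sketch below illustrates only the underlying hard-constraint principle, namely reparameterizing physical quantities so that constraints such as positivity hold by construction rather than via a penalty. All names and scalings are illustrative.

```python
# Hard constraint by construction: R and C are outputs of softplus, so they
# remain strictly positive for any parameter value, even at a poor local minimum.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositiveRCModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.theta_R = nn.Parameter(torch.zeros(1))    # unconstrained parameters
        self.theta_C = nn.Parameter(torch.zeros(1))

    def forward(self, T_prev, T_out, q_hvac, dt=900.0):
        R = 0.005 * (F.softplus(self.theta_R) + 1e-3)  # R > 0 by design
        C = 1.0e7 * (F.softplus(self.theta_C) + 1e-3)  # C > 0 by design
        return T_prev + dt / C * ((T_out - T_prev) / R + q_hvac)
```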
4.3.2. Critical Implementation Barriers
Notwithstanding these methodological strides, two fundamental challenges persist, creating barriers to robust deployment:
The Weighting Factor Dilemma (Loss Balancing): In physics-regularized PINNs, the calibration of the weighting factor ($\lambda$) governing the trade-off between empirical error ($\mathcal{L}_{\mathrm{data}}$) and physical consistency ($\mathcal{L}_{\mathrm{physics}}$) presents a non-trivial multi-objective optimization problem. Chen et al. [59] caution that an ill-conditioned $\lambda$ leads to gradient pathologies: a low $\lambda$ results in the violation of physical laws, while an excessively high $\lambda$ causes the physics term to dominate the gradient, preventing convergence on the measurement data. This dependency induces a computationally expensive hyperparameter tuning bottleneck.
The Expressiveness–Robustness Trade-Off: While augmenting RC models with neural components enhances expressiveness, it concomitantly reintroduces the risk of overfitting. Ma et al. [14] observe that, while hybrid architectures excel at "interpolation" (within the training distribution), their capacity for "extrapolation" to unseen climatic conditions (distributional shift) degrades significantly if the ML component is over-parameterized. In such cases, the hybrid model loses its physical grounding and inherits the robustness limitations typical of pure black-box approaches.
4.4. Predictive Control and Flexibility: A Concrete but Fragmented Innovation
RC models are now well established for the implementation of MPC strategies and for the management of energy flexibility in buildings [31,52,67]. Real-world deployments report tangible benefits: tube-based MPC reduced HVAC costs by up to 24% [46], while DMPC achieved thermal energy savings of 41% [31]. Han et al. [68] propose probabilistic KPIs to measure operational flexibility, allowing for the real-time estimation of a building's capacity to modulate its load. Meanwhile, Morovat et al. [53] show that thermal inertia enables substantial shifts in daily usage patterns. From an economic valuation perspective, Sun et al. [36] propose adaptive models capable of responding to external price signals, thereby increasing the potential for participation in DR programs. Despite this promising progress, the simultaneous adoption of the three fundamental dimensions of advanced control, namely robust optimization, real-time implementation, and distributed architectures, remains limited. Evolutionary optimization techniques (e.g., PSO) can further enhance MPC performance [39]. However, as summarized in Table 7, while strategies such as MPC and DR are widely present (over 80% of cases), truly distributed solutions (e.g., edge computing and peer-to-peer coordination) are largely absent from the examined case studies, mentioned at most as future perspectives. Similarly, most DT applications remain hybrid or conceptual, while fully operational implementations are rare and still in the experimental phase, as exemplified by the study in [57]. This analysis reveals a major discrepancy: RC solutions have reached theoretical maturity, yet many implementation systems remain underdeveloped. Building scalable distributed energy control systems will require lightweight RC models that (i) interoperate through semantic workflows (e.g., Brick and IFC), (ii) support online parameter identification techniques [69] (a minimal sketch follows below), and (iii) operate under intelligent supervisory control.
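As an indication of what requirement (ii) entails in practice, the following sketch implements recursive least squares with a forgetting factor for the linear-in-parameters regression $T_{k+1} = a\,T_k + b\,T_{o,k} + c\,\dot{Q}_k$; it is a generic textbook formulation, not the specific technique of [69].

```python
# Recursive least squares (RLS) with forgetting, for online updating of a
# discrete RC regression T_{k+1} = a*T_k + b*T_out_k + c*q_k. Generic sketch.
import numpy as np

class RLS:
    def __init__(self, n_params, forgetting=0.99):
        self.theta = np.zeros(n_params)      # parameter estimates [a, b, c]
        self.P = np.eye(n_params) * 1e3      # covariance: large = uncertain
        self.lam = forgetting                # <1 discounts old data

    def update(self, phi, y):
        """phi: regressor [T_k, T_out_k, q_k]; y: measured T_{k+1}."""
        k = self.P @ phi / (self.lam + phi @ self.P @ phi)     # gain vector
        self.theta = self.theta + k * (y - phi @ self.theta)   # innovation step
        self.P = (self.P - np.outer(k, phi @ self.P)) / self.lam
        return self.theta

# usage at each timestep: rls.update(np.array([T_k, T_out_k, q_k]), T_next)
```

The forgetting factor lets the estimates track slow drifts (seasonal effects, retrofits) without re-running a full batch identification, which is what makes the approach attractive for lightweight edge deployment.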
The subsequent operational benchmark in Table 7 analyzes the presence of key features in recent advanced control research. The codes are defined by level of implementation (e.g., direct use, simulation, and potential) and specific feature. The core codes are standardized across all columns: Y (yes) signifies direct implementation or fulfillment of the column's core requirement; S (simulation) indicates that the feature was validated only in a simulation environment (applicable for MPC and RT); and N (no) means that the topic or feature is entirely absent. Specific feature codes capture variants and nuances within each dimension:
MPC variants: 'PA' (potential application), 'E' (enablement/citation but no implementation), and 'P' (predictive but not traditional MPC logic);
DR (demand response) variants: 'LS' (load shifting), 'A' (adaptability/suitability), 'F' (future application), and 'NP' (non-pure demand response);
DT: 'H' (hybrid simulator) for non-actual DT models;
RT (real-time control) variants: 'QA' (quasi-real-time updates), 'RP' (real-time prediction without closed-loop control), and 'PU' (potential real-time use without demonstration);
SUP (smart/intelligent supervision) variants: 'SA' (scenario analysis), 'AA' (automatic adaptation), 'MR' (monitoring with recommendations), and 'AP' (advanced disturbance prediction).
4.5. Operational Implications for Planners and Managers
Evaluated RC models produce specific operational outcomes that affect the various stakeholders involved in building energy digitalization, including retrofit contractors, DT system developers, and smart city solution providers. BMS designers benefit from RC models that adopt semantic standards, such as SAREF or Brick, enabling the creation of automated interpretable energy models based on architectural and plant descriptions [77]. The Modelica environment supports large-scale DTs through interoperable systems; the literature [57] demonstrates temperature prediction errors below 0.4 °C and full compatibility with BMS infrastructures. Combining pre-calibrated RC archetypes with thermal clustering techniques allows for quick and dependable simulations for large retrofit projects, even when detailed geometric data is unavailable [51,78]. The works of Mugnini et al. [40] and Giuzio et al. [1] show that thousands of buildings can be represented with high accuracy and computational time reductions of up to 80%, thus offering effective tools for aggregate assessments of urban energy flexibility and massive-scale energy audits. The deployment of simplified RC models (e.g., 2R2C) on Raspberry Pi or comparable industrial controllers within smart decentralized environments enables local MPC strategies (a minimal sketch follows this paragraph). Reported implementations [46,50] show the potential to enhance passive cooling by 25% and reduce HVAC energy costs by 24% while requiring minimal field intervention and enabling quick adaptation to environmental variations. The upcoming generation of DT systems will provide new capabilities through the combination of RC models and AI techniques, including PINNs. The recent literature [23] demonstrates a hybrid RC + CNN–LSTM framework capable of dynamically correcting residual errors. Similarly, Odendaal et al. [79] present a hybrid approach combining RC models with a CNN–LSTM architecture for real-time error correction, while Cui et al. [62] show that RC models can function as virtual sensors for fault detection and diagnostics (FDD), thereby eliminating the need for physical sensors. The operational implications summarized in Table 8 provide designers, energy managers, and developers with practical guidance for selecting and implementing RC models in real-world contexts.
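As a concrete illustration of the edge-deployable pattern, the sketch below solves a small finite-horizon MPC over a discrete 1R1C model with a comfort band; the use of cvxpy, the horizon, parameter values, and bounds are illustrative choices, not the configurations reported in [46,50].

```python
# Receding-horizon MPC on a discrete 1R1C model: minimize heating energy while
# keeping the zone inside a comfort band. Illustrative values throughout.
import numpy as np
import cvxpy as cp

dt, R, C = 900.0, 0.005, 1.0e7        # 15 min step; R [K/W], C [J/K]
a = 1 - dt / (R * C)                  # discrete dynamics coefficient
H = 12                                 # 3 h horizon
T0 = 21.0                              # current zone temperature [degC]
T_out = 5.0 * np.ones(H)               # outdoor temperature forecast

u = cp.Variable(H)                     # HVAC heat input [W]
T = cp.Variable(H + 1)                 # predicted zone temperatures
cons = [T[0] == T0, u >= 0, u <= 6000]
for k in range(H):
    cons += [T[k + 1] == a * T[k] + (dt / (R * C)) * T_out[k] + (dt / C) * u[k],
             T[k + 1] >= 20.0, T[k + 1] <= 24.0]   # comfort band
prob = cp.Problem(cp.Minimize(cp.sum(u) * dt), cons)
prob.solve()
# apply only u.value[0], then re-solve at the next step (receding horizon)
```

The resulting linear program is small enough to re-solve every control step on modest hardware, which is what makes low-order RC models attractive as the internal core of edge controllers.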
4.6. Operational Recommendations and Future Perspectives
This study set out to provide operational recommendations for the adoption of RC models as energy management tools in large-scale building facilities. The review led to the following aspects:
Automatic generation and interoperability:
The integration of RC-based grey-box models into component/model libraries through semantic ontologies and schemas (e.g., SAREF, IFC, and Brick) enables faster development of dependable models that can directly extract data from DTs [80]. The implementation of semantics reduces modeling effort and improves interoperability with existing BMS [57].
Urban scalability:
The use of pre-calibrated archetypes and thermal-clustering techniques for large-scale building clusters enables the evaluation of entire building stocks with higher accuracy [81] and a drastic reduction in computational time (a minimal clustering sketch follows this list).
Decentralized predictive control:
The implementation of simplified RC models as an internal computational core on edge devices (e.g., Raspberry Pi) enables local execution of predictive control strategies, with measurable reductions in HVAC energy consumption and rapid response to environmental changes.
Robustness, adaptability, and intelligent diagnostics:
RC models hybridized with AI techniques, such as DRL, PINNs, and CNN–LSTM architectures, show strong potential to improve robustness, automate parameter updates, and enable fault detection without additional sensors [82,83].
Energy coordination at district level:
At the microgrid or urban district level, RC models enable the integrated management of HVAC, PV, and batteries, improving self-consumption and energy flexibility [84,85].
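A minimal sketch of the archetype-clustering idea referenced above: buildings are grouped on identified thermal parameters and simple usage features, and one pre-calibrated archetype is simulated per cluster. Features, distributions, and the number of clusters are illustrative assumptions.

```python
# Thermal clustering for archetype assignment: simulate k archetype RC models
# instead of one model per building. All features and values are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_buildings = 1000
features = np.column_stack([
    rng.lognormal(np.log(5e-3), 0.4, n_buildings),  # identified R [K/W]
    rng.lognormal(np.log(1e7), 0.5, n_buildings),   # identified C [J/K]
    rng.uniform(200, 5000, n_buildings),            # floor area [m^2]
    rng.normal(21, 1, n_buildings),                 # mean setpoint [degC]
])

X = StandardScaler().fit_transform(features)        # scale before clustering
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)
labels = km.labels_                                  # archetype id per building
# simulate 8 archetypes (cluster centroids) instead of 1000 individual models
```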
Challenges and Future Perspectives: Despite the progress detailed in this review, several key challenges remain:
Standardizing evaluation metrics (especially probabilistic KPIs) to ensure true comparability across studies;
Validating virtual sensors in buildings characterized by limited data infrastructure;
Developing adaptive multi-zone RC models that are tested and validated at scale;
Strengthening interoperability between diverse software environments and ensuring native integration with edge architectures and distributed AI.
Recent experiences confirm that RC models currently represent a strategic bridge between physics and data. Their continued evolution will be crucial for enabling resilient and digitized energy management in future urban districts.
5. Conclusions
This review analyzed the state of the art and the current and future perspectives of RC models applied to modeling and energy control in complex large-scale building facilities. The reviewed studies demonstrate that these models offer a practical way to combine predictive accuracy, physical interpretability, and ease of implementation in energy management tools. Their compact structure is well-suited for real-time control, thermal flexibility estimation, and the automatic generation of energy DTs for districts and building clusters.
However, the review also underlines critical issues: data quality and granularity remain fundamental constraints. RC models built on noisy datasets tend to perform worse than black-box models, especially in highly variable contexts. Moreover, their interpretability-oriented structure can limit their ability to capture complex nonlinear dynamics unless they are augmented by hybrid data-driven methods.
The emerging trends include the following:
The adoption of hybrid RC+AI approaches, such as PINNs, which combine generalization with robustness.
The use of RC models as “virtual sensors” to detect faults or perform continuous commissioning.
Integration with DTs via semantic ontologies and edge deployments for distributed control.
RC models enable practical applications that facilitate participation in DR programs and the coordination of energy flexibility across energy communities and urban districts. Research demonstrates that RC modeling can be scaled to thousands of buildings through clustering and model-order reduction, maintaining acceptable error margins with low computational costs.
Future research and development should converge on three strategic pillars:
Automation and Scalability: The advancement of fully automated data-driven modeling workflows, leveraging archetype-based clustering to ensure replicability at scale.
Robustness and Adaptability: The hybridization of physical models with interpretable AI architectures, supported by rigorous uncertainty quantification frameworks.
Interoperability and Integration: The deep embedding of RC models within digital ecosystems and holistic distributed energy management architectures.
In conclusion, RC models stand as pivotal enablers for the transition toward resilient smart built environments. By unlocking predictive control and quantifying energy flexibility, they effectively bridge the gap between physical infrastructure and digital intelligence. Strategic investments in building digitalization, coupled with the inherent computational efficiency of grey-box RC models, are poised to accelerate the decarbonization of large-scale building portfolios.