5.1. Simulation Environment and Parameter Configuration
To validate the effectiveness of the coordinated scheduling method for the Hospital Integrated Energy System (HIES), simulations were conducted based on a real-world energy system from a tertiary hospital. The experiments were implemented on a computing platform equipped with an AMD Ryzen 7 5800 8-core processor, manufactured by Advanced Micro Devices (AMD), Santa Clara, CA, USA.
The specific parameter values and operational boundaries are detailed in Table 2. The HIES consists of the following core components: a 2000 kW rooftop photovoltaic (PV) system, a 1500 kW diesel generator, and a 3600 kWh hybrid energy storage system (comprising lithium-ion batteries and flywheels). The hospital has 800 beds and includes several critical load zones, such as the surgical complex, intensive care unit (ICU), emergency department, and general wards. Supply requirements differ by zone: the surgical complex has a dual-loop power supply with reliability ≥ 99.99%; the ICU has an uninterruptible power supply (UPS) with seamless switching (response time ≤ 0.1 s); and the general wards receive tiered power supply protection according to priority.
To ensure reproducibility and transparency, the detailed configuration of algorithmic, optimization, and system parameters used in the simulations is summarized in Table 3.
To represent monthly variability with a tractable number of intraday profiles, we construct typical days for each month and use their occurrence probabilities in the short-term MILP objective (see the use of typical-day scenarios and their probabilities in the short-term model).
(1) Data sources and sampling.
Hourly (1 h) series are used in the MILP layer for consistency with the short-term scheduling horizon. Raw telemetry is collected at 15-min resolution and aggregated to hourly averages (power/energy) or maxima (peaks):
Hospital load (by zones: OR, ICU, wards, etc.): EMS/BMS/SCADA metering.
PV generation: inverter logs from the 2 MW rooftop PV plant.
Meteorology: local meteorological station (ambient temperature, global horizontal irradiance, wind speed).
Tariff: TOU prices from the local utility used in Section 5 (baseline 0.42–0.68 CNY/kWh).
All timestamps are aligned to the hospital’s local time; missing values (<0.5%) are imputed via seasonal neighbor interpolation; outliers are filtered by the IQR rule.
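The aggregation and outlier-filtering steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names are hypothetical, the input is assumed to cover whole hours (four 15-min samples each), and the fence multiplier of 1.5 is the conventional IQR-rule value, assumed here because the text does not state it.

```python
from statistics import mean, quantiles


def aggregate_hourly(samples_15min, how="mean"):
    """Collapse consecutive groups of four 15-min samples into hourly values.

    how="mean" gives hourly averages (power/energy); how="max" gives hourly peaks.
    """
    if len(samples_15min) % 4 != 0:
        raise ValueError("expected a whole number of hours (4 samples each)")
    reducer = mean if how == "mean" else max
    return [reducer(samples_15min[i:i + 4]) for i in range(0, len(samples_15min), 4)]


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences of the IQR rule; k=1.5 is an assumed multiplier."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def flag_outliers(values, k=1.5):
    """Return one boolean per value: True where the value lies outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [not (lower <= v <= upper) for v in values]
```

Flagged points would then be handled by whichever imputation rule is in use (the text names seasonal neighbor interpolation), which is not reproduced here.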
(2) Feature engineering.
For each calendar day, we form a feature vector that concatenates: (i) normalized total-load and zone-load intraday profiles; (ii) PV output profile; (iii) meteorology profiles (temperature, irradiance); and (iv) calendar indicators (weekday/holiday). Features are standardized to zero mean and unit variance, $z = (x - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of each feature. To avoid collinearity, PCA is applied and the leading components that together explain ≥95% of the variance are retained.
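As a small illustration of the two preprocessing steps, the sketch below standardizes one feature column and picks the number of principal components from a list of eigenvalues. It assumes z-score standardization with the population standard deviation and takes the PCA eigenvalues (explained variances) as given; both function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Z-score one feature column; a constant column maps to zeros."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def n_components_for(eigenvalues, target=0.95):
    """Smallest number of leading components whose cumulative explained
    variance ratio reaches `target` (0.95 corresponds to the 95% threshold)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(ordered)
```

For example, eigenvalues of 6.0, 3.0, 0.7, and 0.3 give cumulative ratios of 0.60, 0.90, and 0.97, so three components would be retained.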
(3) Clustering and number of clusters.
Clustering is performed per month to capture seasonal structure while preserving comparability across months:
Algorithm: K-means with k-means++ initialization; Euclidean distance; 50 restarts.
Model selection: the number of clusters $K$ is chosen by maximizing the silhouette score (cross-checked with the Calinski–Harabasz index). For reporting and comparability, a common default $K$ is used across months unless the indices suggest a different value.
Typical-day prototype: for cluster $k$ in month $m$, the centroid intraday profile $\bar{x}_{m,k}$ is the typical day, and its occurrence probability is
$$p_{m,k} = \frac{N_{m,k}}{N_m},$$
where $N_{m,k}$ is the number of days assigned to cluster $k$ and $N_m$ is the number of days in month $m$. These $p_{m,k}$ are exactly the probabilities used in the monthly MILP aggregation.
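The per-month procedure can be sketched in plain Python as below. This is an illustrative simplification rather than the configuration described above: seeds are drawn uniformly at random instead of with k-means++, the default number of restarts is 10 rather than 50, and the candidate cluster counts (2, 3, 4) are assumed. All function names are hypothetical.

```python
import random
from collections import Counter
from math import dist


def kmeans(points, k, rng, iters=50):
    """Lloyd iterations from k randomly sampled seeds (not k-means++)."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def typical_days(points, k_candidates=(2, 3, 4), restarts=10, seed=0):
    """Keep the clustering with the highest silhouette and return
    (labels, centroids, probabilities), with probability = cluster size / days."""
    rng = random.Random(seed)
    best = None
    for k in k_candidates:
        for _ in range(restarts):
            labels, centroids = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if best is None or score > best[0]:
                best = (score, labels, centroids)
    _, labels, centroids = best
    counts = Counter(labels)
    probs = {c: counts[c] / len(points) for c in counts}
    return labels, centroids, probs
```

Here each element of `points` stands for one day's (PCA-reduced) feature vector as a tuple, and the returned probabilities sum to one over the non-empty clusters of the month.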
(4) Reconstruction and validation.
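We validate typical-day fidelity by reconstructing monthly series from prototypes, replacing each day $d$ of month $m$ with the centroid profile of its assigned cluster $k(d)$:
$$\hat{x}_{m,d} = \bar{x}_{m,k(d)},$$
and evaluating: (i) energy MAPE for PV and total load; (ii) peak/valley errors; and (iii) time-of-peak shift (hours). Acceptance criteria are: monthly energy MAPE ≤ 5%, average-load MAPE ≤ 3%, peak/valley errors within tolerance, and time-of-peak shift ≤ 1 h.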
For robustness, we also use Ward hierarchical clustering as a sensitivity check (linkage on standardized features); results are consistent with K-means and omitted for brevity.
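A reconstruction check of this kind might look like the sketch below. The exact metric definitions are assumptions made for illustration: energy MAPE is taken as the mean absolute percentage error of daily energy (the sum of hourly values), and the peak shift as the difference in the hour index at which each profile peaks. Function names and default limits other than the 5% and 1 h thresholds are hypothetical.

```python
def reconstruct(labels, centroids):
    """Replace each day by the centroid profile of its assigned cluster."""
    return [centroids[label] for label in labels]


def energy_mape(actual_days, rebuilt_days):
    """Mean absolute percentage error of daily energy, in percent."""
    errors = [abs(sum(r) - sum(a)) / sum(a) for a, r in zip(actual_days, rebuilt_days)]
    return 100.0 * sum(errors) / len(errors)


def peak_shift_hours(actual, rebuilt):
    """Absolute difference between the hours at which the two profiles peak."""
    return abs(actual.index(max(actual)) - rebuilt.index(max(rebuilt)))


def passes(actual_days, labels, centroids, mape_limit=5.0, shift_limit=1):
    """Apply the energy-MAPE and time-of-peak criteria to one month."""
    rebuilt = reconstruct(labels, centroids)
    worst_shift = max(peak_shift_hours(a, r) for a, r in zip(actual_days, rebuilt))
    return energy_mape(actual_days, rebuilt) <= mape_limit and worst_shift <= shift_limit
```

The peak/valley error and average-load MAPE checks would follow the same pattern and are omitted to keep the sketch short.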
5.2. Experimental Results and Analysis
To assess the suitability of the TD3 algorithm for multi-timescale scheduling in medical applications, this study compares the convergence characteristics and scheduling performance of three deep reinforcement learning (DRL) algorithms: Deep Deterministic Policy Gradient (DDPG), Soft Actor–Critic (SAC), and Twin Delayed Deep Deterministic Policy Gradient (TD3). As shown in Figure 6, the three algorithms differ significantly in handling the HIES scheduling task. TD3 demonstrated superior stability and accuracy. DDPG trained faster but was less stable, while SAC performed poorly in this specific problem setting. Therefore, TD3 is identified as a suitable and effective algorithm for addressing complex system optimization tasks in HIES.
5.2.1. Overall Performance Analysis
To evaluate the effectiveness of the proposed method, a comprehensive performance comparison was conducted against two traditional approaches: Scenario Reduction (SR) and Robust Optimization (RO), as summarized in Table 4. The results demonstrate that the proposed method significantly improves distributed energy utilization, achieving 96.72%, compared with 82.45% for SR and 88.31% for RO. In addition, the power outage rate for critical medical loads was reduced markedly, from 2.8% (SR) and 1.5% (RO) to only 0.15%, thereby substantially enhancing the reliability of medical power supply.
5.2.2. Short-Term Scheduling Analysis
To comprehensively evaluate the short-term energy scheduling performance of the proposed method in hospital scenarios, representative days from critical seasons in 2021 were selected. Specifically, typical days in spring (moderate load), summer (air-conditioning peak), autumn (transitional season), and winter (surgical peak and influenza season) were chosen for analysis. The results are summarized in Table 5.
Firstly, regarding photovoltaic (PV) utilization, the Scenario Reduction (SR) method relies on predefined strategies and exhibits limited responsiveness to the actual rooftop PV output. In winter, its daily PV utilization rate drops to 76.4%. The Robust Optimization (RO) approach tends to retain excessive backup capacity to guarantee worst-case power balance, resulting in a suboptimal summer utilization rate of 88.9%, which is significantly lower than the 95.7% achieved by the proposed method. By leveraging deep reinforcement learning (DRL) to dynamically model the probabilistic distribution of rooftop PV generation and incorporating Mixed-Integer Linear Programming (MILP) for short-term optimization, the proposed method maintains high PV utilization levels across all seasons.
Secondly, for energy storage system (ESS) scheduling, the SR method fails to adequately account for lithium battery longevity and the high operational standards of the medical sector (e.g., national standard: ≥4000 cycles). In winter, the average daily cycling rate reaches 2.0 cycles/day. Considering an approximate degradation rate of 0.025% per cycle, this could lead to a significant reduction in system lifespan. RO shows even higher cycling at 2.4 cycles/day, prioritizing extreme reliability at the expense of economic efficiency. In contrast, the proposed method incorporates DRL-based charge/discharge thresholds and MILP-based state-of-charge (SoC) constraints, achieving a 25–37% reduction in daily cycling frequency, thereby extending system lifespan and reducing operational costs.
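The lifespan implication of these cycling rates can be checked with simple arithmetic. The sketch below assumes capacity fade is linear in cycle count and that the battery cycles every day of the year; both are simplifications, and the function names are hypothetical.

```python
def years_to_cycle_limit(cycles_per_day, rated_cycles=4000):
    """Years until the rated cycle count (here the 4000-cycle standard) is used up."""
    return rated_cycles / cycles_per_day / 365.0


def annual_fade_percent(cycles_per_day, fade_per_cycle_percent=0.025):
    """Capacity fade per year in percent, assuming linear fade per cycle."""
    return cycles_per_day * 365.0 * fade_per_cycle_percent


for name, rate in [("SR", 2.0), ("RO", 2.4)]:
    print(name, round(years_to_cycle_limit(rate), 2), round(annual_fade_percent(rate), 2))
```

Under these assumptions, 2.0 cycles/day exhausts 4000 cycles in about 5.5 years and implies roughly 18% fade per year, while 2.4 cycles/day shortens this to about 4.6 years. As an illustration only, a 25% reduction from 2.0 to 1.5 cycles/day would extend the cycle-limited life to about 7.3 years.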
In terms of critical load assurance, SR fails to fully address unexpected load surges. For instance, during winter surgical peaks, the power supply guarantee rate for operating rooms is only 94.3%, falling short of the rigid hospital requirements (≥99.99% for operating rooms and ≥97% for general wards). RO achieves guarantee rates above 98% throughout the year but sacrifices PV utilization and ESS longevity. The proposed method adopts a multi-objective trade-off strategy. Although its winter guarantee rate for surgical and critical departments is slightly lower than that of RO, at 96.8%, it effectively balances economic efficiency and reliability through optimized coordination of diesel generators and ESS dispatching.
Lastly, regarding diesel generator peak-shaving contribution, SR, relying on static rules, achieves only 37.2% contribution in winter. RO shows marginal improvement but remains limited due to conservative capacity reservation. The proposed method dynamically optimizes diesel generator output and ESS synergy via DRL, increasing the peak-shaving contribution to 42.7%. Considering a backup diesel generator capacity of 1.5 MW, this corresponds to a winter peak-shaving capability of approximately 0.64 MW, significantly enhancing the hospital’s resilience to load fluctuations and sudden high-demand events.
5.2.3. Long-Term Scheduling Analysis
At the long-term operational level, the policy network based on deep reinforcement learning (DRL) significantly enhances the coordination and adaptability of the hospital energy system by learning the dynamic relationships among rooftop photovoltaic (PV) generation, energy storage capacity, and average monthly energy demand. The results are summarized in Table 6.
Firstly, in terms of energy storage capacity control, the DRL-based strategy demonstrates superior performance. Under the traditional Scenario Reduction (SR) method, the capacity regulation error reaches as high as 15.2%, whereas the proposed method reduces the error to 3.9%, achieving a 74.3% improvement. This indicates that the DRL-based scheduling strategy can more accurately align the state of charge (SoC) of battery storage with the hospital’s actual energy demands, thereby minimizing capacity degradation caused by overcharging or deep discharging, and ensuring continuous and stable power supply for critical load areas such as operating rooms and intensive care units (ICUs).
Secondly, the coordination mechanism between DRL and MILP rolling optimization plays a crucial role in compensating for PV generation forecast deviations. By employing DRL to predict the probabilistic distribution of rooftop PV output and integrating it with a monthly rolling MILP model, the system dynamically adjusts reserve capacity and energy storage dispatch strategies. Compared to the SR method, the proposed approach improves the forecast deviation compensation rate by 38.6%, effectively smoothing out the fluctuations in monthly renewable output and enhancing both PV utilization and the hospital’s overall energy self-sufficiency.
Lastly, in terms of real-time power tracking accuracy, the MILP model performs hourly rolling optimization of the combined output from diesel generators, PV systems, and energy storage, enabling highly efficient load response. Through this fine-grained multi-source dispatching, the real-time scheduling deviation is strictly controlled within 1.5%, with the average monthly power tracking accuracy improved by 52.9%. These results clearly demonstrate the proposed method’s advantage in ensuring real-time responsiveness and power stability for high-sensitivity medical loads—especially operating rooms and mission-critical equipment—thus providing nearly uninterrupted and highly reliable energy support for hospital operations.
5.2.4. Case-Based Scheduling Analysis
To validate the practical applicability of the proposed method within a hospital-integrated energy system, a case study was conducted using the energy infrastructure of a newly established campus of a tertiary hospital. The system includes a 2000 kW rooftop photovoltaic array, a 1500 kW diesel generator, and a 3600 kWh hybrid energy storage system composed of lithium-ion batteries and flywheels. The key load zones cover the surgical suite, the intensive care unit (ICU), and the general wards.
Figure 7 illustrates the reward evolution trend during the training process of the deep reinforcement learning agent. In the initial phase (0–150 episodes), the reward exhibits significant fluctuations as the agent continuously explores and gradually learns the seasonal load patterns and responses to extreme events. After 240 episodes, the reward begins to show a clear convergence trend. By episode 400, the average reward fluctuation narrows to within ±1.17, indicating that the agent has effectively learned an optimal scheduling policy.
Notably, at episode 150, the agent successfully identifies a sharp increase in winter heating demand, resulting in a 23% boost in reward. Furthermore, by episode 320, the agent significantly improves the storage reserve strategy during surgical peak hours, leading to a stabilization of the reward curve.
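One way to make a convergence statement of this kind checkable is to look for the first window of episodes whose rewards all stay within a fixed band around the window mean. The sketch below is a hypothetical criterion, not the one used to produce Figure 7: the window length of 50 episodes is an assumption, and only the ±1.17 band is taken from the text.

```python
from statistics import mean


def is_stable(window_rewards, band):
    """True if every reward in the window lies within ±band of the window mean."""
    centre = mean(window_rewards)
    return all(abs(r - centre) <= band for r in window_rewards)


def first_stable_episode(rewards, window=50, band=1.17):
    """Index of the first episode that starts a stable window, or None if there is none."""
    for start in range(len(rewards) - window + 1):
        if is_stable(rewards[start:start + window], band):
            return start
    return None
```

Applied to a logged reward curve, the returned index gives a reproducible estimate of when training settled, which is easier to compare across algorithms than a visual reading of the plot.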
Figure 8 illustrates the dynamic annual capacity adjustments of the diesel generator and energy storage system. For the diesel generator, capacity is increased by 18% during the winter months (December to February) to meet the elevated heating demand typical of the cold season. For the energy storage system, capacity is expanded by 15% during flu season (February to March and October to November) to address emergency backup requirements and patient surges. The total annual capacity variation amounts to only 0.37 GWh, which represents just 3.2% of the system’s designed capacity. This performance is significantly better than that of traditional empirical methods, which exhibit a variation of up to 5.6%.
Figure 9 illustrates the power balance results of the Hospital Integrated Energy System (HIES) on representative days across each month. It is evident from the figure that the hospital’s load structure exhibits significant seasonal variations, and the energy contribution from different sources varies accordingly. The analysis is detailed as follows:
First, grid-purchased electricity serves as the base load provider throughout the year. Its output increases notably in the winter and early spring months (January–March and December), reflecting the hospital’s reliance on highly reliable energy sources during heating seasons and periods of unexpected high demand. For instance, in January, February, and December, the share of grid electricity reaches its annual peak, ensuring uninterrupted power supply for critical zones such as operating rooms and intensive care units (ICUs).
Second, photovoltaic (PV) generation demonstrates pronounced seasonal variability. From late spring to early autumn (April–September), PV output increases steadily, peaking in July and August to meet the surge in air-conditioning demand. During this period, PV contributes more than 25% of the total load, significantly improving renewable energy utilization, while reducing both the cost of grid electricity and carbon emissions.
Third, the energy storage system (ESS) plays a key role in peak shaving, valley filling, and emergency backup across all months. During periods of sharp load fluctuations in summer and winter, the ESS effectively smooths intraday power supply–demand dynamics through flexible charge–discharge strategies. It also provides millisecond-level backup response in extreme scenarios, such as sudden emergency room surges. Experimental results indicate that the ESS’s discharge contribution increases significantly in July and December, ensuring continuous and secure hospital power supply.
Fourth, the diesel generator acts as a crucial supplementary source to balance the system, especially when distributed energy and ESS are insufficient to meet the demand. Slight increases in the share of grid electricity during summer and winter reflect the system’s compensatory mechanism under extreme climate conditions. Notably, under the coordinated scheduling of PV and ESS, the overall grid power consumption is significantly reduced compared to traditional management approaches, improving both the self-sufficiency and economic efficiency of the hospital’s energy system.
In addition, other renewable energy sources (Other RE) introduce greater flexibility and sustainability to the system. Their contribution increases in wind-rich months (e.g., spring and autumn), providing extra redundancy for system dispatching.
Finally, the total load (black curve) fluctuates throughout the year, with higher levels in summer and winter. The proposed multi-energy coordinated scheduling strategy ensures the rational allocation and efficient utilization of various energy sources under different seasonal and representative day scenarios. This supports secure, economical, and low-carbon operation of the HIES. Overall, the proposed dispatching framework effectively accommodates highly variable hospital loads and frequent extreme events, guaranteeing the continuity of critical medical services and the overall efficiency of system operation.
5.2.5. Supplementary Comparison with Hammerstein Model
To further illustrate the effectiveness of the proposed DRL–MILP framework, a comparative analysis was conducted against the Hammerstein model, which is a classical approach widely used for nonlinear dynamic system identification and scheduling. As reported in prior studies, the Hammerstein structure is effective in modeling weak nonlinearities and dynamic responses of energy systems under simplified operating conditions. However, when applied to the hospital integrated energy system (HIES), its performance is constrained by the high dimensionality of decision variables and the stringent medical reliability requirements.
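For readers unfamiliar with the structure, a Hammerstein model is a static (memoryless) nonlinearity followed by a linear dynamic block. The sketch below simulates a generic single-input single-output instance with an ARX-type linear part; the coefficient layout and function name are illustrative choices and do not reproduce the specific model used in the comparison.

```python
def hammerstein_response(inputs, nonlinearity, b_coeffs, a_coeffs):
    """Simulate y[t] = sum_i b[i] * f(u[t-i]) - sum_j a[j] * y[t-1-j].

    `nonlinearity` is the static map f; `b_coeffs` weight current and past
    transformed inputs; `a_coeffs` weight past outputs (zero initial conditions).
    """
    v = [nonlinearity(u) for u in inputs]  # static nonlinear stage
    y = []
    for t in range(len(v)):
        acc = sum(b * v[t - i] for i, b in enumerate(b_coeffs) if t - i >= 0)
        acc -= sum(a * y[t - 1 - j] for j, a in enumerate(a_coeffs) if t - 1 - j >= 0)
        y.append(acc)
    return y


# Quadratic nonlinearity feeding a first-order linear filter.
print(hammerstein_response([1, 2, 3], lambda u: u * u, [1.0], [-0.5]))
# → [1.0, 4.5, 11.25]
```

Because the nonlinearity has no memory and the dynamics are linear, the structure is easy to identify but has limited capacity to express coupled, constraint-heavy decisions across timescales.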
Under the same hospital dataset, the Hammerstein-based scheduling model achieved acceptable prediction accuracy for short-term dynamics but exhibited slower convergence and limited adaptability to cross-timescale coordination. Specifically, the average renewable utilization rate was approximately 89.7%, and the critical load interruption rate remained at 1.2%, both of which are inferior to the proposed DRL–MILP framework (96.7% and 0.15%, respectively). Moreover, the Hammerstein approach required substantially longer computation times due to iterative parameter estimation, whereas the proposed method converged more rapidly and ensured strict compliance with medical operational constraints.
These findings indicate that, while the Hammerstein model provides valuable insights into nonlinear response characteristics, its applicability in HIES multi-timescale scheduling is limited. In contrast, the proposed DRL–MILP hybrid framework not only ensures high accuracy and computational efficiency but also guarantees critical load reliability under complex hospital scenarios.
5.2.6. Generalization and Applicability Validation
To further verify the generalizability and practical applicability of the proposed scheduling framework, three types of medical institutions were selected for systematic comparative analysis: a newly established tertiary hospital, a community hospital, and a specialized outpatient center.
The results in Table 7 demonstrate the following:
In the new tertiary hospital, the annual duration of surgical interruptions was significantly reduced to just 0.3 h/year using the proposed method, compared to 4.2 h/year under traditional approaches.
In the community hospital scenario, the PV utilization rate reached 95.1%, significantly outperforming the traditional method’s 86.7%.
For the specialized outpatient center, the equipment expansion cost was reduced by 18% compared to the baseline solution.
These results clearly indicate that the proposed scheduling framework can enhance system-wide economic performance by 21–28%, while strictly adhering to hard medical safety constraints (e.g., ensuring ≥99.99% power reliability for operating rooms).
In terms of economic benefits, the proposed framework can save approximately CNY 2.86 million annually in equipment expansion costs and reduce direct economic losses due to power outages by around CNY 9.2 million. The results are summarized in Table 8.
From the perspective of technical scalability, the system supports 5G edge computing deployment with a decision-making latency of less than 50 milliseconds, and has been certified for electromagnetic compatibility (EMC) under national medical equipment standards. These features demonstrate strong engineering feasibility and broad potential for large-scale application.
5.2.7. Sensitivity Analysis
The robustness of the proposed multi-timescale scheduling framework was validated through four categories of sensitivity experiments: energy storage capacity, DRL hyperparameters, medical load forecasting errors, and electricity price fluctuations. The key findings are summarized as follows.
1. Energy Storage Capacity Sensitivity Analysis
As shown in Table 9, system performance is strongly influenced by storage sizing. Expanding capacity from 1800 to 3600 kWh markedly improves renewable utilization (from 89.23% to 96.72%) and reduces the critical load interruption rate (from 0.45% to 0.15%). However, beyond 4200 kWh, improvements diminish (<0.05% reduction per 600 kWh), and the ROI period begins to increase due to higher investment and maintenance costs. Thus, the economic optimization zone is identified at 3000–4200 kWh, where both reliability and cost savings are balanced.
2. DRL Hyperparameter Sensitivity Analysis
The results in Table 10 indicate that the learning rate is the most sensitive hyperparameter: values between 0.0003 and 0.0007 achieve stable convergence, while larger rates (≥0.001) lead to instability. Batch size shows an optimal range of 128–256, ensuring low variance while avoiding overfitting. The network architecture with two hidden layers [256,128] provides the best trade-off between accuracy and computational efficiency. These findings confirm that careful hyperparameter tuning is essential for achieving stable and reliable DRL-based hospital scheduling.
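The reported ranges can be captured in a small configuration object that flags settings outside them. This is an illustrative sketch: the class name is hypothetical, and the default learning rate and batch size are simply values chosen from inside the reported ranges, not the settings used in the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TD3Config:
    learning_rate: float = 5e-4        # assumed default, inside 0.0003-0.0007
    batch_size: int = 256              # assumed default, inside 128-256
    hidden_layers: tuple = (256, 128)  # two hidden layers as reported

    def warnings(self):
        """List the settings that fall outside the ranges reported as stable."""
        notes = []
        if not 3e-4 <= self.learning_rate <= 7e-4:
            notes.append("learning_rate outside the 0.0003-0.0007 stable range")
        if not 128 <= self.batch_size <= 256:
            notes.append("batch_size outside the 128-256 range")
        return notes


print(TD3Config(learning_rate=1e-3).warnings())
```

Keeping the ranges next to the values makes it harder for a later tuning run to drift silently into the unstable region (learning rates of 0.001 and above).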
3. Medical Load Forecasting Error Sensitivity
Table 11 demonstrates that forecasting errors directly degrade both reliability and economic performance. When the error range increases from ±2% to ±15%, the critical load interruption rate rises from 0.12% to 0.68%, while distributed energy utilization drops from 97.23% to 91.28%. Emergency response time is prolonged (12.3 s → 35.7 s), and operating costs increase by up to 17.8%. The system remains resilient under moderate errors (≤±5%), but performance deteriorates sharply beyond ±10%, highlighting the need for accurate load forecasting and robust policy adaptation.
4. Electricity Price Sensitivity Analysis
As summarized in Table 12, higher peak-to-valley ratios substantially enhance economic benefits and demand flexibility. When the ratio increases from 1.5:1 to 4:1, average daily costs fall from 8234 CNY to 5734 CNY, while PV utilization improves from 94.32% to 97.52%. Meanwhile, ESS cycling frequency increases (1.2 → 2.4 cycles/day), and the peak-shaving effect nearly doubles (28.4% → 54.6%). These results demonstrate that dynamic pricing can incentivize renewable energy integration and improve system efficiency, though at the cost of higher battery wear.
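The size of these effects can be read off directly from the endpoint values quoted above. The short calculation below uses only those figures; the helper name is hypothetical.

```python
def relative_change(before, after):
    """Signed fractional change from `before` to `after`."""
    return (after - before) / before


cost_change = relative_change(8234, 5734)  # average daily cost, CNY
peak_shaving_ratio = 54.6 / 28.4           # peak-shaving effect, 4:1 vs 1.5:1
cycling_ratio = 2.4 / 1.2                  # ESS cycles per day, 4:1 vs 1.5:1

print(f"{cost_change:.1%}, x{peak_shaving_ratio:.2f}, x{cycling_ratio:.1f}")
```

The daily cost falls by about 30%, the peak-shaving effect grows by a factor of about 1.9, and ESS cycling doubles, which quantifies the trade-off between economic benefit and battery wear.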
The sensitivity results confirm that storage capacity and forecasting errors are the dominant factors influencing system reliability, while electricity price spread primarily drives economic optimization. DRL hyperparameters, though secondary, significantly affect training stability and policy robustness. Collectively, these findings emphasize that optimal deployment of the proposed framework requires balanced storage sizing, accurate forecasting, and careful tuning of DRL models, along with consideration of tariff structures to align economic incentives with operational reliability in hospital energy systems.