1. Introduction
Driven by high renewable penetration and the accelerating expansion of hydrogen utilization, integrated electricity–heat–hydrogen energy systems (IEHESs) in industrial parks increasingly require the coordinated scheduling of electricity, heat, and hydrogen under strong multi-energy coupling, pronounced intertemporal dynamics, and substantial operational uncertainty [1]. The integration of hydrogen production, storage, and conversion further expands the flexibility space of industrial energy systems, but also introduces additional cross-sector coupling paths and state variables that must be coordinated with electricity and heat flows in real time [2,3]. In this context, dispatch is no longer a conventional single-energy cost minimization problem, but a receding-horizon decision process that must repeatedly balance economy, carbon performance, and hard-constraint feasibility [4]. Existing studies have established an important foundation for the modeling and operation of integrated energy systems [5]. Wang et al. used a coordinated scheduling framework for the joint operation of electricity, heat, and hydrogen systems that considers energy storage in heat and hydrogen pipelines, showing that explicitly modeling multi-energy flow dynamics can reveal additional operational flexibility and economic value [6]. Pan et al. used a two-stage optimization model for integrated energy scheduling with power-to-gas and integrated demand response, showing that demand-side flexibility and multi-energy conversion can jointly improve system economy and low-carbon performance [7].
To improve adaptability under uncertainty, receding-horizon and model predictive control (MPC)-based methods have been increasingly applied in integrated energy systems. Lv et al. [8] used MPC for robust scheduling of a community integrated energy system with operational flexibility, demonstrating that rolling optimization can improve adaptability to renewable and load uncertainty while maintaining explicit constraint handling. Liu et al. [9] used a bi-level MPC-based dispatch-and-control strategy to improve dynamic response performance in community integrated energy systems, showing that multi-timescale rolling optimization can better coordinate upper-level economic scheduling and lower-level real-time control. Wu et al. [10] used economic MPC for multi-timescale operation of integrated energy systems, further showing that predictive control can exploit temporal decomposition to improve control-oriented economic performance. In the hydrogen-coupled setting, Appino et al. [11] used stochastic MPC for real-time operation of an integrated electricity–hydrogen virtual power plant, illustrating the value of predictive control when renewable generation and market prices are uncertain. Fan et al. [12] used a distributionally robust adaptive MPC framework to handle renewable uncertainty in integrated energy-system scheduling, confirming that uncertainty-aware receding-horizon formulations can enhance both robustness and economy relative to deterministic scheduling. Recent review studies on MPC in microgrids further indicate that MPC has evolved from conventional deterministic formulations toward robust, stochastic, adaptive, and learning-assisted control architectures, mainly to improve uncertainty handling and real-time operational adaptability [13]. However, these advances also reveal two persistent limitations in energy-system dispatch. First, the performance of MPC remains highly dependent on the quality of the internal predictive model. Second, robust or uncertainty-aware MPC may improve feasibility but can become conservative when the uncertainty representation is not tightly connected to the actual prediction residuals. Therefore, optimization-based rolling dispatch still requires a predictive model that can represent multi-energy dynamics accurately and a constraint-handling mechanism that can convert prediction uncertainty into executable safety margins.
In parallel with optimization-based scheduling, data-driven control has become an important research direction for integrated energy systems [14]. Recent studies have further shown that data-driven and hybrid modeling methods are increasingly used in complex integrated power and energy systems, especially when strong coupling, limited sensing, and uncertainty make purely mechanistic modeling insufficient [15,16]. These advances indicate that research is shifting from isolated forecasting or diagnosis models toward topology-aware, uncertainty-aware, and physically interpretable learning frameworks for complex energy-system operation. Liang et al. [17] used a Soft Actor–Critic framework for real-time optimal scheduling of integrated energy systems for electricity, heat, and hydrogen storage, showing that the scheduling problem can be formulated as a Markov decision process and solved effectively through online policy learning. Liu et al. [18] used a deep reinforcement learning (DRL) approach for data-driven optimal scheduling of an integrated electricity–heat–gas–hydrogen energy system considering demand-side management, showing that DRL can improve adaptability in complex multi-energy environments with flexible demand-side resources. Meanwhile, graph-based spatiotemporal learning has provided a promising direction for representing networked energy systems. Su et al. [19] used a spatiotemporal convolutional graph neural network for short-term load forecasting in regional integrated energy systems, showing that graph-based representations can better capture the coupling among multiple loads and meteorological factors than conventional sequence-only models. Yuan et al. [20] used physics-embedded graph learning for dynamic modeling of integrated energy systems, demonstrating that topology-aware neural architectures combined with physical principles can improve the interpretability and dynamic representation capability of learned energy-system models. These studies indicate the potential of learning-based methods for representing complex energy-system dynamics. However, from a control-oriented dispatch perspective, the learned model must be recursively usable inside a constrained optimizer, where multi-step prediction errors, physical feasibility semantics, and constraint margins jointly affect closed-loop performance.
Although these studies have substantially advanced integrated energy scheduling, two limitations remain insufficiently addressed in the IEHES rolling dispatch setting. First, many existing studies either rely on mechanistic predictive models [21] with limited adaptability or focus primarily on improving statistical prediction accuracy [22], but do not explicitly align the learned dynamics with the physical feasibility semantics required by downstream dispatch optimization. As a result, a trajectory that appears accurate in the mean-square sense may still deviate from the electricity/heat/hydrogen balance relationships, storage-boundary evolution, and ramping logic that define the true feasible operating region. Second, even when a learned predictive model is coupled with rolling optimization, most existing approaches still optimize under nominal state and power bounds without explicitly transforming prediction-error statistics into executable safety margins [23,24,25]. In tightly coupled electricity–heat–hydrogen dispatch, this omission is particularly critical because accumulated prediction residuals can propagate across storage states, conversion pathways, and network balances, thereby causing nominally feasible decisions to become infeasible during real closed-loop operation.
To further clarify the distinction between the proposed work and recent studies, Table 1 summarizes the representative literature published in recent years from the perspectives of method, uncertainty handling, system type, objective, data source, and main limitation.
As shown in Table 1, recent studies have made important progress in robust MPC, multi-timescale optimization, data-driven MPC, graph-based forecasting, and physics-embedded energy-system modeling. However, most existing methods focus on optimization with predefined system dynamics, on uncertainty-aware scheduling without topology-aware learned transitions, or on graph-based prediction without direct integration into a hard-constrained economic MPC. In contrast, the proposed GraphWorldModel_MPC explicitly embeds the learned graph spatiotemporal transition model into the MPC as a transition constraint and further converts empirical prediction residuals into quantile-tightened state and power bounds. Therefore, the main difference in this work lies in the MPC-level integration of learned multi-energy transition dynamics, constraint-aligned rollout training, and residual-aware feasible-region tightening for industrial-park IEHES dispatch.
Based on the above review, the research gap addressed in this paper can be summarized as follows: existing rolling dispatch studies for integrated energy systems still lack a unified framework that can simultaneously learn topology-aware multi-energy dynamics, preserve the physical feasibility semantics of multi-step predicted trajectories, and explicitly account for residual prediction uncertainty in hard-constrained economic MPC. This gap is particularly important for industrial-park IEHESs, where electricity–heat–hydrogen coupling, storage carry-over dynamics, and renewable/load forecast errors jointly affect the feasibility and economy of online dispatch. Existing graph-based or learning-based energy-system studies have mainly focused on forecasting accuracy, representation learning, or policy learning, whereas the present work focuses on the interface between learned multi-step transition prediction and hard-constrained economic MPC. In this setting, the learned model must not only predict future states accurately, but also produce rollout trajectories that remain usable inside a constrained optimizer. Therefore, the distinction of this work is not the isolated use of graph learning, MPC, or physics-consistency terms, but their integration into a rolling dispatch pipeline in which topology-aware prediction, constraint-aligned rollout training, and quantile-based constraint tightening are jointly evaluated under the same MPC interface.
More specifically, the MPC-related novelty of this work lies in the formulation of a learned-dynamics-constrained economic MPC for industrial-park IEHES dispatch. Instead of using the learned model only as an external forecaster, the graph spatiotemporal world model is embedded into the MPC as an explicit transition constraint, so that each candidate control sequence is evaluated through recursively generated electricity–heat–hydrogen state trajectories. In addition, the prediction residuals of the learned transition model are converted into quantile-based tightened state and power bounds, which makes the feasible region of the MPC directly aware of model uncertainty. Therefore, the proposed method differs from a simple combination of graph learning and MPC; the learned dynamics determine the admissible predicted trajectories, while the error-aware tightening determines the executable constraint margins used by the optimizer. This provides a concrete MPC-level distinction from conventional nominal-model MPC, robust adaptive MPC, and sequence-only learning-based MPC.
The objective of this paper is therefore to develop a graph spatiotemporal world-model-driven rolling MPC framework for low-carbon economic dispatch of industrial-park IEHESs, with the aim of improving multi-step predictive fidelity, closed-loop feasibility, and economic-carbon performance under forecast uncertainty. Accordingly, the central research question of this paper is: can a topology-aware learned transition model, when embedded as an explicit transition constraint in a hard-constrained economic MPC, improve multi-step prediction accuracy, closed-loop feasibility, and economic-carbon dispatch performance compared with conventional nominal-model MPC, robust adaptive MPC, and sequence-only learning-based MPC? This question is tested through three measurable aspects: 24-step state-transition prediction accuracy, monthly operating cost and carbon emissions, and dispatch feasibility under nominal and perturbed operating conditions. Therefore, the contribution of this work is not limited to proposing another forecasting model or another MPC variant, but lies in verifying whether learned graph-based multi-energy dynamics can be effectively translated into executable MPC constraints for industrial-park IEHES dispatch. The specific contributions are summarized as follows.
A graph-based IEHES representation is constructed to encode the directed electricity–heat–hydrogen coupling topology. Based on this representation, a graph spatiotemporal world model is developed to learn multi-step state transitions by combining topology-aware message passing and temporal rollout modeling.
A constraint-aligned training strategy is introduced for the learned world model. In addition to conventional multi-step prediction loss, balance-, storage-boundary-, and ramping-consistency terms are incorporated to make the predicted trajectories more compatible with the feasibility requirements of downstream rolling optimization.
A learned-dynamics-constrained economic MPC formulation is developed, in which the graph spatiotemporal world model is embedded as an explicit transition constraint rather than used only as an external predictor. Under this formulation, the optimizer searches the control sequence subject to recursively learned electricity–heat–hydrogen state evolution, multi-energy balance constraints, device limits, ramping constraints, and demand response (DR) contractual constraints. In addition, empirical quantiles of rolling prediction residuals are transformed into tightened state and power bounds, making the MPC feasible region directly aware of model uncertainty.
The remainder of this paper is organized as follows. Section 2 formulates the IEHES dispatch problem, including the system model, state/control/exogenous variables, operating constraints, and the finite-horizon economic dispatch objective. Section 3 presents the graph spatiotemporal world model and its training objective. Section 4 introduces the world-model-driven rolling MPC framework and the quantile-based safety-tightening mechanism. Section 5 reports comparative, robustness, ablation, and sensitivity results based on an industrial-park case in Inner Mongolia. Finally, Section 6 discusses the limitations of the proposed framework, and Section 7 concludes the paper and outlines future research directions.
3. Graph Spatiotemporal World Model
To provide an intuitive overview of the proposed predictive architecture before presenting the mathematical details, Figure 2 illustrates the overall structure of the graph spatiotemporal world model used in this study. Starting from historical state, control, and exogenous-input trajectories, the model first constructs topology-aware node representations based on the directed IEHES graph, then extracts spatial coupling and temporal dependency through graph encoding and sequence modeling, and finally performs one-step state decoding and autoregressive multi-step rollout. In addition to the conventional prediction loss, physics-consistency terms are incorporated into training so that the learned trajectories remain better aligned with the electricity–heat–hydrogen balance, storage-boundary evolution, and ramping semantics required by downstream dispatch optimization.
3.1. Problem Setting and Input–Output Formulation
The true transition dynamics in Equation (1) are generally unknown or only partially captured by simplified mechanistic models. In industrial-park IEHESs, the system evolution is jointly influenced by multi-energy coupling, storage carry-over effects, demand response adjustments, and time-varying exogenous inputs. Under such conditions, a purely mechanistic model may be insufficiently adaptive, whereas a purely statistical predictor may fail to preserve the operational semantics required by downstream dispatch optimization. To address this issue, a parameterized graph spatiotemporal world model is constructed to approximate the underlying transition process from historical operational data. Rather than directly generating dispatch decisions, the model is trained to predict future system states and then serves as the internal predictive kernel of the rolling MPC described in Section 4. This design is consistent with the core problem setting of this paper, in which the dispatch task is solved through learned transition modeling and hard-constrained rolling optimization rather than end-to-end policy learning.
Let the historical window have a fixed length. At each time step, the input to the world model consists of the recent state trajectory, control trajectory, and exogenous trajectory over that window, and the one-step prediction maps this input to the predicted next state. Compared with a one-shot vector predictor, this formulation explicitly exploits the temporal dependence created by storage dynamics, ramping constraints, and inter-energy coupling, and is therefore better suited to the receding-horizon nature of the IEHES dispatch problem. Moreover, because the system has already been represented as a graph with electricity, heat, and hydrogen carrier edges, the state trajectory entering the world model is not treated as an unordered sequence. Instead, the local states of individual devices and hubs are mapped to node-level features and then processed by graph-based spatial encoders before temporal aggregation. This allows the predictor to preserve the physical topology of the IEHES rather than flattening it into a purely sequence-based approximation.
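As a compact sketch of the one-step predictor described above, the following uses assumed notation that may differ from the symbols defined in Section 2: $x_t$ for the state, $u_t$ for the control, $w_t$ for the exogenous input, $H$ for the historical window length, $\mathcal{G}$ for the directed IEHES graph, and $f_\theta$ for the parameterized world model.

```latex
\hat{x}_{t+1} = f_{\theta}\!\left( x_{t-H+1:t},\; u_{t-H+1:t},\; w_{t-H+1:t};\; \mathcal{G} \right)
```

Here $x_{t-H+1:t}$ is shorthand for the stacked trajectory $(x_{t-H+1}, \dots, x_t)$, and $\hat{x}_{t+1}$ is the predicted next state.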
3.2. Topology-Aware Graph Representation
Given the directed graph, each node is associated with a time-indexed feature vector as defined in Equation (6). The node features include local power-related quantities, storage states, heat-related quantities, market-dependent exogenous information, and demand/renewable prediction inputs. Since different device types do not share identical physical attributes, missing entries are handled through masking or zero-padding so that all nodes can be embedded into a common latent space. In this way, the graph representation remains compatible with heterogeneous system components while preserving a unified tensor structure for training. The purpose of this representation is not merely to reduce dimensionality, but to encode the fact that the operating state of each device is influenced by its neighbors through directed energy-transfer paths and physical coupling relations. In the industrial-park IEHES, such topology-dependent interactions are fundamental. For example, a change in CHP dispatch simultaneously affects the electricity and heat subsystems, whereas electrolyzer operation links electricity consumption to hydrogen production and storage evolution.
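The zero-padding and masking step can be illustrated with a minimal sketch. The feature layout `FEATURE_KEYS`, the node names, and the helper `pad_and_mask` are hypothetical and chosen only for illustration; the actual node features are those defined in Equation (6).

```python
from typing import Dict, List, Tuple

# Hypothetical common feature layout shared by all node types (illustrative names).
FEATURE_KEYS = ["power", "soc", "heat", "price", "forecast"]

def pad_and_mask(
    node_attrs: Dict[str, Dict[str, float]]
) -> Tuple[List[List[float]], List[List[int]]]:
    """Build a dense feature matrix plus a 0/1 mask (1 = attribute present)."""
    features, mask = [], []
    for name in sorted(node_attrs):  # fixed node ordering for a stable tensor layout
        attrs = node_attrs[name]
        # Attributes that a device type does not have are zero-padded.
        features.append([float(attrs.get(key, 0.0)) for key in FEATURE_KEYS])
        mask.append([1 if key in attrs else 0 for key in FEATURE_KEYS])
    return features, mask

# Heterogeneous devices expose different subsets of attributes.
nodes = {
    "battery": {"power": -0.5, "soc": 0.6},
    "chp": {"power": 1.2, "heat": 0.9},
    "grid": {"power": 0.3, "price": 0.08},
}
X, M = pad_and_mask(nodes)
```

Keeping the mask alongside the padded features lets a downstream encoder distinguish a true zero from a missing attribute.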
To extract topology-aware embeddings, the graph encoder performs several layers of directed attention-weighted message passing at each time step. First, the input feature of each node is projected into the latent space to obtain the initial hidden representation. Then, at each graph layer, each incoming neighbor sends a message to the target node through a learnable linear transformation of its current hidden feature. To distinguish the unequal influence of different neighbors and coupling paths, an attention score is further computed for each directed edge based on the current hidden features of both the source node and the target node, together with the corresponding edge attribute. Here, the edge attribute is used to distinguish electricity, heat, hydrogen, or cross-carrier coupling relations, so that the graph encoder can preserve the physical meaning of heterogeneous interconnections rather than treating all edges identically. The resulting attention scores are normalized over all incoming neighbors of the target node, thereby yielding attention coefficients that quantify the relative contribution of each incoming message.
Accordingly, the hidden representation of each node at the next layer is updated by combining two parts: the transformed self-feature of the node and the weighted aggregation of all incoming neighbor messages. The self-feature term preserves the intrinsic operating condition of the target node, while the aggregated message term captures the influence propagated from structurally connected components through directed energy-transfer paths. In this way, the message-passing mechanism explicitly reflects the physical topology of the IEHES. Different coupling paths can contribute unequally to the target-node embedding, and this is particularly important in the present problem because the operational dependence among components is not spatially uniform. For instance, CHP-related electricity–heat coupling, electrolyzer-related electricity–hydrogen coupling, and storage-related intertemporal support all have different structural roles in the system and should therefore be weighted differently during representation learning.
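One layer of this directed attention-weighted update can be sketched in plain Python for a single target node. The scoring rule (dot product of source and target features plus a carrier-type bias), the `tanh` activation, and all names below are assumptions made for illustration; the source does not specify the exact attention function.

```python
import math
from typing import Dict, List, Tuple

Vec = List[float]
Mat = List[List[float]]

def matvec(W: Mat, v: Vec) -> Vec:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def attention_coeffs(h: Dict[str, Vec], target: str,
                     incoming: List[Tuple[str, str]],
                     edge_bias: Dict[str, float]) -> List[float]:
    """Softmax over the incoming edges of `target`.
    Assumed score: <h_src, h_target> + bias of the edge's carrier type."""
    if not incoming:
        return []
    scores = [sum(a * b for a, b in zip(h[src], h[target])) + edge_bias[carrier]
              for src, carrier in incoming]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def update_node(h: Dict[str, Vec], target: str, incoming: List[Tuple[str, str]],
                W_self: Mat, W_nbr: Mat,
                edge_bias: Dict[str, float]) -> Tuple[Vec, List[float]]:
    """Combine the transformed self-feature with attention-weighted neighbor messages."""
    alpha = attention_coeffs(h, target, incoming, edge_bias)
    agg = [0.0] * len(h[target])
    for a, (src, _) in zip(alpha, incoming):
        msg = matvec(W_nbr, h[src])  # message = linear transform of the source feature
        agg = [acc + a * m for acc, m in zip(agg, msg)]
    own = matvec(W_self, h[target])
    return [math.tanh(o + g) for o, g in zip(own, agg)], alpha

# Toy example: two devices feed an energy hub over different carriers.
hidden = {"chp": [1.0, 0.0], "electrolyzer": [0.0, 1.0], "hub": [0.5, 0.5]}
edges_into_hub = [("chp", "electricity"), ("electrolyzer", "hydrogen")]
identity = [[1.0, 0.0], [0.0, 1.0]]
new_hub, alpha = update_node(hidden, "hub", edges_into_hub, identity, identity,
                             {"electricity": 0.0, "hydrogen": 0.0})
```

In this toy case the two incoming edges receive equal scores, so the attention coefficients are equal; a nonzero carrier bias would shift weight toward one coupling path.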
Formally, the hidden representation of each node at the next layer and a given time step is obtained by applying a nonlinear activation function to the combination of its transformed self-feature and the aggregated messages from its neighbor set. The two linear transformations are trainable parameters, and each neighbor message is scaled by a normalized edge weight or attention coefficient. Through repeated aggregation over neighbors, the graph encoder enables each node representation to absorb both local state information and the influence of structurally related components. This mechanism is particularly useful in multi-energy systems because operational coupling is not uniform across all state dimensions, but is concentrated along specific physical pathways. The output of the graph encoder at each time step summarizes the topology-aware system representation at that time step.
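Under assumed notation, the layer update described above can be sketched as follows. The symbols are illustrative: $h_i^{(l)}(t)$ is the hidden representation of node $i$ at layer $l$ and time $t$, $\mathcal{N}(i)$ is its set of incoming neighbors, $W_{\mathrm{s}}^{(l)}$ and $W_{\mathrm{n}}^{(l)}$ are the trainable self and neighbor transformations, $e_{ij}^{(l)}(t)$ is the unnormalized attention score of edge $j \to i$, and $\sigma$ is the activation function. The softmax normalization is an assumption consistent with the description of scores normalized over incoming neighbors.

```latex
h_i^{(l+1)}(t) = \sigma\!\Big( W_{\mathrm{s}}^{(l)} h_i^{(l)}(t)
  + \sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(l)}(t)\, W_{\mathrm{n}}^{(l)} h_j^{(l)}(t) \Big),
\qquad
\alpha_{ij}^{(l)}(t) = \frac{\exp\!\big(e_{ij}^{(l)}(t)\big)}
  {\sum_{k \in \mathcal{N}(i)} \exp\!\big(e_{ik}^{(l)}(t)\big)}
```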
3.3. Temporal Encoding and Multi-Step Rollout
Although graph message passing captures the spatial coupling structure at each time step, it does not itself account for the temporal dependencies induced by storage carry-over, device ramping, and time-varying exogenous inputs. Therefore, the sequence of graph representations over the historical window is further processed by a temporal encoder. This encoder, implemented as either a gated recurrent unit (GRU) or a Transformer-style sequence model, maps the graph-encoded sequence over the historical window into a compact latent representation. This latent representation summarizes the recent operating trajectory of the industrial-park IEHES and serves as the basis for state decoding. The next state is then obtained by a linear decoder whose parameters are trainable. The role of the temporal encoder is not only to improve one-step accuracy, but also to preserve intertemporal information required for stable multi-step rollout. In the present IEHES dispatch problem, temporal encoding is necessary because the same instantaneous graph state may correspond to different future evolutions depending on previous storage charging/discharging directions, ramping history, and lagged exogenous variations. For example, two operating points with similar electricity–heat–hydrogen balances can lead to different subsequent battery state-of-charge (SOC), thermal energy storage (TES), or hydrogen-inventory trajectories if their preceding trajectories are different. Therefore, the temporal encoder acts as a dynamic memory module that transforms the graph-encoded historical trajectory into a latent state representation for transition prediction. Without this temporal module, the predictor degenerates into a snapshot-based graph model and tends to lose the carry-over effects of storage dynamics and device ramping, which may lead to accumulated errors during recursive rollout. Since the downstream MPC relies on future state trajectories rather than isolated one-step predictions, the predictor must maintain meaningful structure under recursive application.
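The temporal encoding and linear decoding steps can be sketched under assumed notation: $G_\tau$ is the graph-encoder output at time $\tau$, $H$ is the historical window length, $z_t$ is the latent representation, and $W_o$, $b_o$ are the decoder weight and bias. The symbol names, and the presence of a bias term, are assumptions.

```latex
z_t = \mathrm{TempEnc}_{\phi}\!\left( G_{t-H+1}, \dots, G_t \right),
\qquad
\hat{x}_{t+1} = W_o\, z_t + b_o
```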
Accordingly, during online rolling dispatch, the learned world model is used in an autoregressive fashion over the prediction horizon. Starting from the current observed state, the future state trajectory is generated recursively: each step takes the previously predicted state, the candidate control input, and the forecast exogenous input at that future step.
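The autoregressive rollout can be sketched with a sliding input window. The scalar state, the `toy_model` stand-in, and the function names are hypothetical; the trained world model would replace `toy_model`, and real states are vectors rather than scalars.

```python
from collections import deque
from typing import Callable, List, Sequence, Tuple

Step = Tuple[float, float, float]  # (state, control, exogenous), scalars for brevity

def rollout(model: Callable[[Sequence[Step]], float],
            past: Sequence[Step], x_now: float,
            controls: Sequence[float],
            exo_forecast: Sequence[float]) -> List[float]:
    """Generate a multi-step trajectory by feeding each prediction back as input."""
    # The window holds the past steps plus the current one; old steps drop out.
    window = deque(past, maxlen=len(past) + 1)
    x, trajectory = x_now, []
    for u, w in zip(controls, exo_forecast):
        window.append((x, u, w))
        x = model(list(window))  # predicted next state, reused at the next step
        trajectory.append(x)
    return trajectory

def toy_model(window: Sequence[Step]) -> float:
    """Stand-in for the learned transition: a storage-like update on the last step."""
    x, u, w = window[-1]
    return x + 0.9 * u - w

past_steps = [(0.45, 0.1, 0.0), (0.50, 0.0, 0.0)]
traj = rollout(toy_model, past_steps, x_now=0.5,
               controls=[0.1, 0.1, -0.2], exo_forecast=[0.0, 0.05, 0.0])
```

Because every predicted state re-enters the window, a one-step error at an early step influences all later steps, which is the error-accumulation effect that rollout-based training is meant to reduce.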
It should be noted that the multi-step rollout is used not only during online MPC prediction, but also during world model training. During training, the predicted state at one rollout step is recursively fed into the next prediction step together with the recorded control and exogenous inputs, and the resulting multi-step trajectory is supervised by the corresponding ground-truth trajectory. This training strategy reduces the mismatch between one-step supervised learning and the recursive use of the model in MPC. In other words, the model is explicitly trained under the same rollout mechanism as that used by the downstream rolling optimizer, which improves long-horizon stability and mitigates error accumulation.
This recursive rollout is the key interface through which the world model is connected to the MPC layer. In other words, the world model does not output a final dispatch decision. Instead, it supplies a learned predictive state-transition operator that allows the economic optimizer to evaluate candidate control sequences under hard constraints.
3.4. Physics-Consistent Training Objective
A world model trained only by minimizing statistical prediction error may generate trajectories that are numerically close to historical data while still violating the physical semantics required by dispatch optimization. For example, a predictor may achieve low mean square error but still produce state trajectories that are inconsistent with electricity/heat/hydrogen balance, storage-boundary evolution, or device ramping behavior. To mitigate this mismatch, the training objective combines a multi-step prediction loss with a set of physics-consistency penalties that are deliberately aligned with the operational constraints introduced in Section 2. In this paper, the physics-consistency terms are used as a constraint-aligned training component rather than as an independent methodological novelty. Their purpose is to make the learned rollout trajectories more compatible with the feasibility semantics required by the downstream hard-constrained MPC, especially when the model is recursively applied over a multi-step prediction horizon. Therefore, this paper does not claim physics-consistency regularization as a new theoretical contribution. Instead, its role is empirically evaluated through the ablation study in Section 5.6, where the same graph architecture and MPC formulation are compared with and without this constraint-aligned training component.
Let the training rollout have a fixed length. The multi-step prediction loss is defined as a weighted sum of the k-step prediction errors over the rollout, where each step k is assigned its own weight. To regularize the rollout trajectory with respect to the system’s physical feasibility semantics, the physics-consistency term comprises three penalties: a balance term that penalizes electricity/heat/hydrogen balance residuals, a boundary term that penalizes storage-boundary deviations, and a ramping term that penalizes ramping inconsistency. The overall training objective combines the multi-step prediction loss with the physics-consistency term.
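Under assumed notation, the three objectives can be sketched as follows, with $K$ the training rollout length, $\lambda_k$ the weight of the $k$-step error, and $\mathcal{L}_{\mathrm{bal}}$, $\mathcal{L}_{\mathrm{bnd}}$, $\mathcal{L}_{\mathrm{ramp}}$ the balance, storage-boundary, and ramping penalties. The squared-error form and the single weight $\beta$ on the physics term are assumptions; the tuned loss weights may instead be attached to each penalty individually.

```latex
\mathcal{L}_{\mathrm{pred}} = \sum_{k=1}^{K} \lambda_k \left\| \hat{x}_{t+k} - x_{t+k} \right\|_2^2,
\qquad
\mathcal{L}_{\mathrm{phy}} = \mathcal{L}_{\mathrm{bal}} + \mathcal{L}_{\mathrm{bnd}} + \mathcal{L}_{\mathrm{ramp}},
\qquad
\mathcal{L} = \mathcal{L}_{\mathrm{pred}} + \beta\, \mathcal{L}_{\mathrm{phy}}
```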
The three physics-consistency terms play complementary roles. The balance term encourages the rollout trajectory to remain close to the conservation structure of multi-energy flows, the storage-boundary term discourages boundary-inconsistent storage evolution, and the ramping term suppresses unrealistic step-to-step transitions of controllable variables. Together, these terms make the learned world model better aligned with the feasible operating manifold of the IEHES and therefore more suitable as a predictive kernel for downstream rolling optimization.
The hyperparameters of the world model were selected using the validation segment described in Section 5.1. The hidden dimension, number of graph layers, number of GRU layers, dropout rate, learning rate, batch size, and loss weights were tuned within a limited candidate range, and the final setting was selected according to the validation 24-step prediction error and rollout stability. During training, the validation loss was monitored after each epoch, and the model checkpoint with the lowest validation rollout error was retained for subsequent MPC evaluation. Gradient clipping and weight decay were used to reduce unstable parameter updates and overfitting. The training process was regarded as converged when the validation rollout loss no longer showed a visible decreasing trend over consecutive epochs.
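The checkpoint-retention and convergence rule can be made concrete with a patience-based sketch. The `patience` and `min_delta` parameters are assumptions introduced here; the source only states that training stops once the validation rollout loss shows no visible decrease over consecutive epochs.

```python
from typing import Sequence, Tuple

def select_checkpoint(val_losses: Sequence[float], patience: int = 5,
                      min_delta: float = 0.0) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch) for a sequence of validation rollout losses.

    best_epoch: epoch whose checkpoint has the lowest validation loss so far.
    stop_epoch: epoch at which `patience` consecutive non-improving epochs occurred,
                or the last epoch if that never happens.
    """
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0  # improvement: keep this checkpoint
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

history = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.74]
best_epoch, stop_epoch = select_checkpoint(history, patience=3)
```

With this history, the checkpoint from epoch 2 is retained and training stops at epoch 5, after three epochs without improvement.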
For computational complexity, the graph encoder mainly scales with the number of directed edges and graph layers, while the temporal encoder scales with the historical window length and hidden dimension. Since the studied industrial-park IEHES has a moderate number of devices and coupling edges, the trained world model can generate 24-step rollout trajectories efficiently for the downstream MPC. To improve implementation-level clarity, the main variables, constraints, graph components, and loss terms used in the proposed framework are further specified as follows. The state vector contains dispatch-related power variables and storage states, including grid import, CHP electric and thermal outputs, gas-boiler output, electrolyzer power, battery SOC, TES state, hydrogen inventory, and storage charging/discharging quantities. The control vector contains the directly optimized dispatch actions, including battery charging/discharging power, CHP output commands, gas-boiler thermal output, electrolyzer power, TES and hydrogen-storage charging/discharging actions, and DR activation. The exogenous vector contains forecast PV output, wind output, electric load, thermal load, hydrogen demand, electricity price, gas price, and carbon-related price signal. In the graph implementation, each physical device, storage unit, energy hub, or aggregated load is treated as a node, and each electricity, heat, hydrogen, or cross-carrier coupling path is treated as a directed edge with a carrier-type attribute. Missing node attributes are handled by zero-padding and masking. During training, the multi-step prediction loss is computed over all selected state variables over the 24-step rollout horizon. 
The balance-consistency loss is calculated from electricity-, heat-, and hydrogen-balance residuals, the boundary-consistency loss penalizes predicted storage states outside admissible SOC/TES/hydrogen-inventory ranges, and the ramping-consistency loss penalizes unrealistic step-to-step variations in controllable outputs. During MPC implementation, the equality constraints include learned transition dynamics and multi-energy balance constraints, while the inequality constraints include device output bounds, storage bounds, ramping limits, DR contractual limits, and quantile-tightened bounds for selected critical variables.
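The loss terms listed above can be sketched for a single carrier and a single storage unit. The squared-residual and squared-hinge forms, the function names, and the weight `beta` are assumptions for illustration, not the exact definitions used in training.

```python
from typing import Sequence

def weighted_prediction_loss(pred: Sequence[float], true: Sequence[float],
                             weights: Sequence[float]) -> float:
    """Weighted sum of squared k-step prediction errors over the rollout."""
    return sum(w * (p - y) ** 2 for w, p, y in zip(weights, pred, true))

def balance_penalty(supply: Sequence[float], demand: Sequence[float]) -> float:
    """Squared balance residual of one energy carrier over the rollout."""
    return sum((s - d) ** 2 for s, d in zip(supply, demand))

def boundary_penalty(storage: Sequence[float], lo: float, hi: float) -> float:
    """Squared hinge on predicted storage states that leave [lo, hi]."""
    return sum(max(0.0, lo - s) ** 2 + max(0.0, s - hi) ** 2 for s in storage)

def ramping_penalty(output: Sequence[float], ramp_limit: float) -> float:
    """Squared hinge on step-to-step changes that exceed the ramp limit."""
    return sum(max(0.0, abs(b - a) - ramp_limit) ** 2
               for a, b in zip(output, output[1:]))

def total_loss(pred_loss: float, bal: float, bnd: float, ramp: float,
               beta: float = 0.1) -> float:
    """Prediction loss plus an (assumed) single weight on the physics terms."""
    return pred_loss + beta * (bal + bnd + ramp)

pred = weighted_prediction_loss([1.0, 2.0], [1.0, 1.0], [1.0, 0.5])
bal = balance_penalty([1.0, 2.0], [1.0, 1.5])
bnd = boundary_penalty([0.5, 0.95, 1.1], lo=0.1, hi=1.0)
ramp = ramping_penalty([1.0, 1.2, 2.0], ramp_limit=0.5)
loss = total_loss(pred, bal, bnd, ramp)
```

The hinge terms are zero whenever the predicted trajectory stays inside the admissible range or ramp limit, so they only act on trajectories that would be infeasible for the downstream MPC.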
For readability and reproducibility, the main architectural choices and training hyperparameters of the graph spatiotemporal world model are summarized in Table 3.
4. World-Model-Driven Rolling MPC and Quantile-Based Safety Tightening
To clarify how the learned transition model is integrated into online dispatch, Figure 3 presents the overall world-model-driven rolling MPC framework adopted in this study. At each control step, the current measured state and forecast exogenous sequence are used to generate future state trajectories through the graph spatiotemporal world model, while empirical quantiles of rolling prediction errors are used to tighten critical state and power bounds. The tightened trajectories and constraints are then embedded into a hard-constrained economic MPC, whose first optimal control input is implemented and fed back to the physical system in a receding-horizon manner.
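The idea of converting prediction residuals into tightened bounds can be sketched as follows. The nearest-rank quantile convention, the symmetric tightening based on absolute residuals, and the function names are assumptions; the mechanism actually used may differ, for example by using signed or step-dependent quantiles.

```python
import math
from typing import Sequence, Tuple

def empirical_quantile(values: Sequence[float], q: float) -> float:
    """Nearest-rank empirical quantile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def tighten_bounds(lower: float, upper: float, residuals: Sequence[float],
                   q: float = 0.95) -> Tuple[float, float, float]:
    """Shrink [lower, upper] on both sides by the q-quantile of |residual|."""
    margin = empirical_quantile([abs(r) for r in residuals], q)
    lo, hi = lower + margin, upper - margin
    if lo > hi:
        raise ValueError("safety margin exceeds half of the admissible range")
    return lo, hi, margin

# Rolling prediction residuals of a storage state (illustrative values).
soc_residuals = [0.01, -0.02, 0.03, -0.01, 0.02, 0.00, -0.03, 0.01, 0.02, -0.04]
soc_lo, soc_hi, soc_margin = tighten_bounds(0.1, 0.9, soc_residuals, q=0.9)
```

A higher quantile level gives a larger margin and a smaller feasible region, which trades economic performance for a lower risk of violating the nominal bound in closed loop.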
4.1. Interface Between the Learned World Model and Rolling Dispatch
After the graph spatiotemporal world model
is trained as described in
Section 3, it is embedded into a receding-horizon model predictive control framework to support online dispatch of the industrial-park IEHES. The learned model provides a data-driven approximation of the system transition, while the MPC layer computes economically optimal control actions under explicit hard constraints. This separation allows the proposed framework to combine adaptive dynamic prediction with physically feasible rolling optimization.
At each control instant
, the current measured system state
is treated as known, and the future exogenous sequence over the prediction horizon is provided by the forecast vector
. The learned world model is then used to recursively propagate the future state trajectory under a candidate control sequence. Let the prediction horizon be
. The predicted state trajectory is generated according to
Equation (21) defines the internal predictive kernel of the MPC layer. At each control step, the future trajectory generation starts from the current measured state. The forecast exogenous sequence, including renewable availability, electricity/heat/hydrogen demand, and market-related signals, is provided over the MPC horizon. For a candidate control sequence evaluated by the MPC optimizer, the current state is first mapped into node-level graph features together with the corresponding control and exogenous inputs. The graph encoder extracts topology-aware electricity–heat–hydrogen coupling representations, while the temporal encoder summarizes the recent historical operating window. The decoder then predicts the next state. This predicted state is appended to the rolling input window and recursively used to generate the following state until the end of the prediction horizon. Therefore, the MPC optimizer evaluates each candidate control sequence based on the future state trajectory recursively generated by the graph spatiotemporal world model. In this way, the dispatch optimizer plans over a learned transition model rather than a fixed mechanistic approximation, while still preserving an explicit optimization structure for economic dispatch under hard constraints.
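The recursive trajectory generation described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this study: `step_model` is a hypothetical callable standing in for the trained graph spatiotemporal world model, and `toy_storage_model` is an assumed one-state surrogate used only to make the example runnable.

```python
import numpy as np

def rollout(step_model, history, controls, exogenous):
    """Recursively propagate the predicted state over the horizon.

    step_model(window, u, w) -> next state (1-D array); stand-in for the world model.
    history:   (L, n_x) recent measured states, last row is the current state
    controls:  (H, n_u) candidate control sequence
    exogenous: (H, n_w) forecast exogenous sequence
    """
    window = np.array(history, dtype=float)
    trajectory = []
    for u, w in zip(controls, exogenous):
        x_next = np.asarray(step_model(window, u, w), dtype=float)
        trajectory.append(x_next)
        # Slide the rolling input window: drop the oldest row, append the prediction.
        window = np.vstack([window[1:], x_next])
    return np.stack(trajectory)

def toy_storage_model(window, u, w):
    """Assumed surrogate: a single storage state integrates the charging action."""
    return window[-1] + 0.1 * u[0:1]

history = np.full((3, 1), 0.5)     # three past states, current state = 0.5
controls = np.ones((4, 1))         # charge at unit power for four steps
exogenous = np.zeros((4, 1))       # ignored by the toy surrogate
trajectory = rollout(toy_storage_model, history, controls, exogenous)
```

Because each predicted state is fed back into the input window, any one-step error is carried into all later steps, which is the mechanism that motivates the safety tightening introduced later in this section.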
4.2. Learned-Dynamics-Constrained Economic MPC Formulation
Given the learned transition model in Equation (21), the receding-horizon dispatch problem at time
is formulated as a finite-horizon constrained optimization problem. Let the decision variable be the control sequence
The objective is to minimize the cumulative operating cost over the prediction horizon while discouraging excessive control variation. The economic MPC objective is written as
where
is a regularization coefficient used to suppress excessive step-to-step control variation.
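For illustration, the structure of this objective can be sketched as a sum of per-step operating costs plus a weighted penalty on control increments. The squared Euclidean norm for the variation term, the argument names, and the toy numbers are assumptions; the exact cost components follow the formulation given in the paper.

```python
import numpy as np

def economic_objective(stage_costs, controls, u_prev, lam):
    """Cumulative operating cost plus lam times the sum of squared control increments.

    stage_costs: (H,) operating cost of each predicted step
    controls:    (H, n_u) candidate control sequence
    u_prev:      (n_u,) control applied at the previous time step
    lam:         regularization coefficient on step-to-step control variation
    """
    controls = np.asarray(controls, dtype=float)
    # Prepend the previously applied control so the first increment is penalized too.
    increments = np.diff(np.vstack([u_prev, controls]), axis=0)
    return float(np.sum(stage_costs) + lam * np.sum(increments ** 2))

# Hypothetical three-step example with one control channel.
value = economic_objective(stage_costs=[10.0, 12.0, 11.0],
                           controls=[[1.0], [2.0], [2.0]],
                           u_prev=[0.0],
                           lam=0.5)
```

With `lam = 0`, the objective reduces to pure operating cost; larger values trade some economy for smoother dispatch actions.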
To make the role of the learned transition dynamics explicit, the proposed MPC problem solved at each time step is written as a learned-dynamics-constrained finite-horizon optimization problem. The decision variable is the control sequence
, while the predicted state sequence is recursively generated by the learned graph spatiotemporal world model. The complete MPC formulation is given by
where
denotes the predicted state trajectory,
is the learned graph spatiotemporal transition model, and
,
denote the tightened state and control feasible sets. The equality constraint
is the key difference between the proposed formulation and a conventional economic MPC with a fixed nominal model. It forces every candidate control sequence to be evaluated through the learned multi-energy transition dynamics, so that the predicted SOC, TES state, hydrogen inventory, and critical power variables evolve consistently with the topology-aware world-model rollout.
Compared with classical MPC, the proposed formulation retains the standard receding-horizon structure, economic objective, and explicit operational constraints, but replaces the fixed nominal transition model with the learned graph spatiotemporal transition model. In classical MPC, the predicted trajectory is usually generated by simplified mechanistic equations or linearized nominal dynamics. Such a formulation is transparent but may be inaccurate when the IEHES exhibits strong electricity–heat–hydrogen coupling, storage carry-over effects, and forecast-driven operating variations. In the proposed MPC, the learned transition constraint allows the optimizer to evaluate control sequences under data-driven multi-step dynamics that have been trained with topology-aware message passing and constraint-aligned rollout learning. Therefore, the advantage of embedding the learned transition dynamics is not that the hard constraints are removed, but that the internal prediction model used by the constrained optimizer becomes more adaptive to coupled multi-energy dynamics. The quantile-tightened bounds are then imposed on the predicted trajectory to reduce the risk that residual rollout errors lead to violations of the original physical limits.
Although the world model is trained using both prediction loss and physics-consistency regularization, its rollout error cannot be assumed to vanish in online operation. When the learned transition model is embedded into the MPC as the internal prediction kernel, residual prediction errors may accumulate over the horizon and cause nominally feasible trajectories to violate hard constraints in the actual system. The uncertainty propagation considered here is therefore trajectory-dependent. At each prediction step, the state predicted by the world model becomes part of the input for the next rollout step. Consequently, a residual error in battery SOC, TES state, hydrogen inventory, or a critical power variable can affect the subsequent feasible charging/discharging range, conversion output, and multi-energy balance. Through CHP electricity–heat coupling, electrolyzer electricity–hydrogen conversion, and storage carry-over dynamics, local prediction errors may propagate across energy carriers and across time steps. As a result, a control sequence that is feasible under the nominal predicted trajectory may become infeasible during online execution if the predicted trajectory operates too close to the original physical limits. To address this issue, the proposed method tightens key state and power constraints according to empirical quantiles of the model’s rolling prediction errors.
Let
denote the one-step or multi-step prediction error of a storage-related state, and let
denote the prediction error of the
-th critical power-related quantity. Based on a validation set or a sliding window of rolling closed-loop residuals, the empirical
-quantile of the absolute prediction error is computed. The resulting tightening margins are defined as
where
denotes the empirical
-quantile operator. Accordingly, the core problem addressed in this paper is a finite-horizon constrained economic dispatch problem over learned system dynamics.
In the implementation, the quantile level was set to α = 0.95. The tightening margins were calibrated using the validation segment rather than the final test segment. For each selected variable, rolling validation samples were used, and each sample produced a 24-step rollout error sequence; therefore, the empirical 95% quantile was calculated from absolute prediction-error values. The margins were computed separately for battery SOC, TES state, hydrogen inventory, grid import, CHP output, gas-boiler output, electrolyzer power, and storage charging/discharging power. Thus, the MPC did not use a uniform tightening coefficient, but adopted variable-specific margins obtained from the corresponding empirical prediction-error distributions.
The state of charge bounds used in the MPC are then tightened from the nominal interval
to
and the power bounds of the
-th critical variable are similarly tightened as:
In practical implementation, the tightening margins are obtained from the rolling prediction residuals on the validation set. For a selected storage-related state, such as battery SOC, TES state, or hydrogen inventory, the absolute k-step prediction error is first calculated as the absolute difference between the predicted value and the measured value over all validation samples. The empirical α-quantile of these absolute errors is then used as the tightening margin. The same procedure is applied to critical power-related variables, such as grid import, CHP output, electrolyzer power, or storage charging/discharging power. During MPC optimization, the nominal lower bound is increased by the corresponding quantile margin, while the nominal upper bound is decreased by the same margin. Therefore, the optimizer does not plan directly on the original physical boundary, but on a slightly narrower feasible interval. If the learned trajectory contains residual prediction errors during online execution, the tightened interval provides a buffer that reduces the probability of violating the original hard constraints. For example, if the nominal battery SOC bound is [SOC_min, SOC_max] and the 95% empirical quantile of the rolling SOC prediction error is 0.02, the tightened SOC interval used by MPC becomes [SOC_min + 0.02, SOC_max − 0.02]. In this way, the empirical prediction-error distribution is directly transformed into an executable safety margin. Unlike fixed safety margins selected heuristically, the tightening magnitudes are explicitly linked to the empirical error distribution of the learned world model. Therefore, the resulting margins are interpretable, data-dependent, and adjustable through the quantile level α.
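The calibration procedure described above can be sketched in a few lines. The sketch assumes that absolute rollout errors are pooled over all validation samples and all 24 prediction steps before the quantile is taken; the function names and the nominal SOC interval of [0.10, 0.90] are hypothetical values chosen only for illustration.

```python
import numpy as np

def tightening_margin(predicted, measured, alpha=0.95):
    """Empirical alpha-quantile of absolute rollout errors, pooled over samples and steps."""
    abs_err = np.abs(np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float))
    return float(np.quantile(abs_err.ravel(), alpha))

def tighten(lower, upper, margin):
    """Raise the lower bound and reduce the upper bound by the same margin."""
    new_lower, new_upper = lower + margin, upper - margin
    if new_lower > new_upper:
        raise ValueError("margin exceeds half of the nominal interval width")
    return new_lower, new_upper

# Synthetic validation residuals: absolute errors evenly spread over [0, 1].
measured = np.zeros(101)
predicted = np.linspace(0.0, 1.0, 101)
margin = tightening_margin(predicted, measured, alpha=0.95)

# Hypothetical nominal SOC interval tightened by a margin of 0.02.
soc_lower, soc_upper = tighten(0.10, 0.90, 0.02)
```

Running the same two functions once per selected variable yields the variable-specific margins; raising `alpha` widens the buffer and lowers it narrows the buffer, which is how the quantile level controls conservatism.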
4.3. Tightened MPC Problem and Online Rolling Procedure
To make the comparison between non-tightened and tightened MPC clearer, the non-tightened MPC considered in this study refers to the same world-model-driven economic MPC solved with the original nominal state and power bounds. In this case, the learned transition model, economic objective, multi-energy balance constraints, device operating limits, ramping constraints, and DR contractual constraints are all kept unchanged, but no explicit prediction-error buffer is reserved around critical variables. By contrast, the tightened MPC uses the same objective function and learned world model, while replacing the nominal SOC and critical power bounds with the quantile-tightened bounds defined in Equations (27) and (28). Therefore, the difference between the two formulations lies only in the feasible region used by the optimizer. This constraint-tightening idea is consistent with robust and learning-based MPC studies, where safety margins are introduced to reduce the risk of constraint violation under model uncertainty or prediction residuals [
27].
After introducing quantile-based safety tightening, the constrained optimization problem solved at each time step is modified by replacing the nominal box constraints with the tightened bounds in Equations (27) and (28) for the selected state and power variables. The resulting optimization problem can be written as
subject to the learned transition dynamics in Equation (21), the multi-energy balance constraints in Equation (7), the device operating and ramping constraints in Equations (8)–(10), and the tightened bounds defined in Equations (27) and (28).
where
,
,
, and
denote the tightened lower and upper bounds derived from the empirical prediction-error quantiles. In practice, the tightening is applied only to those variables whose boundary violation is critical for safe operation, such as battery SOC and selected power variables directly influencing balance feasibility or ramping saturation. This selective tightening prevents excessive conservatism in variables that are less sensitive to model residuals while still providing an explicit guard against model-induced boundary crossing.
After the tightening margins are obtained, the nominal state and power bounds in the MPC problem are replaced by their tightened counterparts for selected critical variables. The control sequence remains the decision variable, while the future state trajectory is recursively generated by the learned world model under each candidate control sequence. The tightened bounds are imposed directly as hard constraints inside the finite-horizon MPC rather than as a post-processing correction. Therefore, any candidate control sequence that causes the predicted SOC or critical power variables to exceed the tightened feasible region is regarded as infeasible. By reserving an explicit buffer between the optimized trajectory and the original physical limits, the tightened MPC improves closed-loop feasibility under residual rollout errors without changing the economic objective.
The improvement brought by quantile tightening lies in the modification of the feasible region used by the MPC optimizer. Instead of allowing the predicted trajectory to operate directly on the nominal physical limits, the tightened formulation reserves an error buffer between the optimized trajectory and the original hard constraints. Therefore, even when residual rollout errors occur during online execution, the implemented trajectory is less likely to cross the true SOC or power bounds. In this sense, the tightened MPC formulation improves closed-loop feasibility not by changing the economic objective, but by making the admissible control sequences more conservative around critical operating boundaries.
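The effect of the tightened feasible region on candidate selection can be illustrated with a deliberately simplified sketch. Random shooting is used here in place of the constrained solver adopted in this study, the one-state SOC integrator is an assumed surrogate for the learned world model, and all bounds, prices, and function names are hypothetical.

```python
import numpy as np

def predict_soc(soc0, charge_seq, gain=0.1):
    """Toy stand-in for the world-model rollout: SOC integrates charging power."""
    return soc0 + gain * np.cumsum(np.asarray(charge_seq, dtype=float))

def is_feasible(trajectory, lower, upper):
    """Hard box constraint imposed on every predicted step."""
    return bool(np.all(trajectory >= lower) and np.all(trajectory <= upper))

def mpc_first_action(soc0, prices, lower, upper, n_candidates=2000, seed=0):
    """Return the first action of the cheapest feasible candidate, or None."""
    rng = np.random.default_rng(seed)
    prices = np.asarray(prices, dtype=float)
    best_cost, best_u0 = np.inf, None
    for _ in range(n_candidates):
        u = rng.uniform(-1.0, 1.0, size=prices.size)
        if not is_feasible(predict_soc(soc0, u), lower, upper):
            continue  # candidates leaving the (tightened) region are rejected outright
        cost = float(prices @ u)  # charging costs money, discharging earns it
        if cost < best_cost:
            best_cost, best_u0 = cost, float(u[0])
    return best_u0

# A plan that is feasible under the nominal bounds but rejected after tightening:
# the predicted SOC ends at about 0.89, inside [0.10, 0.90] but outside [0.12, 0.88].
plan = [1.0, 1.0, 1.0, 0.9]
nominal_ok = is_feasible(predict_soc(0.5, plan), 0.10, 0.90)
tightened_ok = is_feasible(predict_soc(0.5, plan), 0.12, 0.88)

# Receding-horizon use: only the first action of the best feasible plan is applied.
u0 = mpc_first_action(0.5, prices=[1.0, 2.0, 3.0, 1.0], lower=0.12, upper=0.88)
```

The objective is identical in both cases; only the set of admissible candidates changes, which mirrors the statement that tightening modifies the feasible region rather than the economic criterion.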
5. Experimental Results
5.1. Case Study and Evaluation Protocol
The proposed framework is validated on an industrial-park integrated electricity–heat–hydrogen energy system over a monthly evaluation horizon with an hourly dispatch resolution. The case study considers coupled electric, thermal, and hydrogen demands, together with renewable energy availability from photovoltaic and wind sources. As shown in
Figure 4, the electricity, thermal, and hydrogen demand trajectories exhibit evident multi-timescale variations across the month. Meanwhile,
Figure 5 presents the corresponding photovoltaic and wind power availability, highlighting the strong diurnal intermittency of solar generation and the continuous stochastic fluctuation of wind power.
The dataset used in this study consists of hourly operational time-series data of the industrial-park IEHES. The input variables include photovoltaic and wind power availability, electricity load, thermal load, hydrogen demand, electricity price, gas price, and carbon-related price signals. The output and state variables include grid import, CHP electric and thermal outputs, gas-boiler output, electrolyzer power, battery SOC, TES state, hydrogen inventory, and storage charging/discharging quantities. To avoid information leakage in time-series prediction and rolling dispatch evaluation, the dataset was partitioned chronologically rather than randomly. Specifically, the earlier part of the historical data was used for world model training, the subsequent validation segment was used for hyperparameter selection, early stopping, and empirical prediction-error quantile estimation, and the final held-out monthly segment was used only for predictive accuracy evaluation and closed-loop dispatch comparison. All normalization parameters were calculated from the training set and then applied unchanged to the validation and test sets. Rolling samples were constructed using a 24 h historical input window and a 24-step prediction horizon, which is consistent with the MPC horizon adopted in the dispatch experiments.
In addition to the chronological split described above, the raw records are hourly time-series data collected from the studied industrial-park IEHES, including renewable availability, electricity/heat/hydrogen demand, market-related price signals, device operating outputs, and storage states. The perturbation tests were generated by applying ±5%, ±10%, and ±15% disturbances to renewable and demand inputs, while keeping the same MPC settings and device constraints across all compared methods.
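The chronological partition, training-only normalization, and rolling-sample construction described in this subsection can be sketched as follows. The segment lengths, the use of z-score normalization, and the function names are assumptions for illustration; only the 24 h input window and the 24-step horizon are taken from the text.

```python
import numpy as np

def chronological_split(data, n_train, n_val):
    """Split along time without shuffling: train, then validation, then held-out test."""
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def fit_standardizer(train):
    """Normalization statistics fitted on the training segment only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return mean, np.where(std > 0, std, 1.0)  # guard against constant columns

def make_windows(series, history=24, horizon=24):
    """Build rolling samples: inputs (N, history, d) and targets (N, horizon, d)."""
    n = len(series) - history - horizon + 1
    if n <= 0:
        raise ValueError("series is shorter than history + horizon")
    inputs = np.stack([series[i:i + history] for i in range(n)])
    targets = np.stack([series[i + history:i + history + horizon] for i in range(n)])
    return inputs, targets

# Synthetic hourly data with two variables (hypothetical sizes).
data = np.arange(200.0).reshape(100, 2)
train, val, test = chronological_split(data, n_train=60, n_val=20)
mean, std = fit_standardizer(train)
val_norm = (val - mean) / std            # training statistics reused unchanged
inputs, targets = make_windows(data)
```

Fitting the statistics on the training segment alone and reusing them for the later segments is what prevents information from the validation and test periods from leaking into the model inputs.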
Before model training, the hourly time-series data were pre-processed in a unified manner. Missing or abnormal entries were first checked according to device operating ranges and temporal continuity. When isolated missing values occurred, linear interpolation was used; abnormal points outside the physical operating bounds were clipped to the corresponding admissible range. Continuous variables with different units and magnitudes, such as MW, MW_th, SOC, hydrogen inventory, and price signals, were normalized using statistics fitted only on the training set. The same pre-processing and normalization procedure was applied to all benchmark predictors to ensure a fair comparison. For reproducibility, the historical window length, prediction horizon, node-feature construction, optimizer, learning rate, batch size, number of epochs, gradient clipping threshold, and loss weights are reported in
Table 3. All the models were trained and tested under the same chronological data split, input variables, prediction horizon, and MPC settings, so that the reported differences mainly reflect the modeling and dispatch mechanisms rather than inconsistent experimental settings.
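The pre-processing rules described above, namely linear interpolation of isolated missing values and clipping of out-of-range points, can be sketched as follows. Treating a missing value as isolated only when both neighbouring samples are observed is an assumption of this sketch, as are the function names and the toy values.

```python
import numpy as np

def fill_isolated_gaps(values):
    """Linearly interpolate NaNs whose two immediate neighbours are both observed."""
    x = np.array(values, dtype=float)
    missing = np.isnan(x)
    for i in np.flatnonzero(missing):
        # Use the original mask so that consecutive gaps are left untouched.
        if 0 < i < len(x) - 1 and not missing[i - 1] and not missing[i + 1]:
            x[i] = 0.5 * (x[i - 1] + x[i + 1])
    return x

def clip_to_bounds(values, lower, upper):
    """Clip abnormal points to the admissible physical operating range."""
    return np.clip(np.asarray(values, dtype=float), lower, upper)

# One isolated gap is filled; the two-sample gap is left for separate handling.
filled = fill_isolated_gaps([1.0, np.nan, 3.0, np.nan, np.nan, 6.0])
clipped = clip_to_bounds([-0.1, 0.5, 1.2], lower=0.0, upper=1.0)
```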
The evaluation protocol is designed from two complementary perspectives. First, the predictive capability of the world model is assessed in terms of multi-step state-transition accuracy, which provides the basis for closed-loop receding-horizon dispatch. Second, the operational performance of the overall framework is examined from the viewpoints of economic efficiency, low-carbon operation, and dispatch feasibility over the monthly horizon. In addition to the month-scale assessment, representative days are further extracted for a more detailed analysis of intra-day operating behaviors. This protocol enables a unified evaluation of both model quality and control effectiveness under realistic time-varying operating conditions.
5.2. Predictive Accuracy of the World Model
To evaluate whether the learned dynamics are sufficiently accurate for downstream rolling optimization, the proposed GraphWorldModel_MPC was tested on the held-out monthly horizon using the same 24-step rollout length as that adopted in both training and MPC.
Table 4 reports representative one-step and 24-step prediction errors for key dispatch-related states. Overall, the learned world model maintained high predictive fidelity across both power-side variables and storage-related states, with the average 24-step RMSE remaining below 5% of the corresponding variable range. This indicates that the model preserved the dominant temporal structure of the industrial-park IEHES over the full receding-horizon window rather than only under one-step prediction.
The prediction accuracy was especially strong for storage-related states. The 24-step R² values of battery SOC, TES state, and hydrogen inventory reached 0.997, 0.980, and 0.995, respectively, while their corresponding 24-step RMSEs remained limited to 0.015, 0.022, and 0.008 in their respective units. Such results suggest that the proposed graph spatiotemporal architecture effectively captured the intertemporal carry-over effects induced by charging/discharging dynamics and multi-energy conversion coupling. This is particularly important for the downstream rolling dispatch problem, because storage-state drift is one of the main sources of recursive prediction-error accumulation in long-horizon MPC. For power-related variables, the learned model also achieved satisfactory multi-step accuracy. Even at the end of the 24-step rollout, the RMSEs remained within 0.22 MW for grid import, 0.06 MW for CHP electric output, 0.08 MW_th for CHP heat output, 0.095 MW_th for gas-boiler output, and 0.07 MW for electrolyzer power. Considering the operating ranges observed in the monthly evaluation set, these errors remain moderate and do not alter the dominant dispatch pattern. Moreover, the prediction error increased gradually with the rollout horizon instead of exhibiting abrupt divergence, indicating that the physics-consistency regularization introduced in
Section 3 effectively improved the long-horizon stability of the learned dynamics.
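For reference, the horizon-wise accuracy metrics discussed above can be computed as in the following sketch, which evaluates RMSE and R² separately at each rollout step and normalizes RMSE by the observed variable range. The array layout (samples by horizon steps) and the function names are assumptions for illustration.

```python
import numpy as np

def rmse_per_step(pred, true):
    """RMSE at each rollout step; pred and true have shape (N, H)."""
    return np.sqrt(np.mean((pred - true) ** 2, axis=0))

def r2_per_step(pred, true):
    """Coefficient of determination at each rollout step."""
    ss_res = np.sum((true - pred) ** 2, axis=0)
    ss_tot = np.sum((true - true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

def range_normalized(rmse, true):
    """RMSE expressed as a fraction of the observed variable range."""
    return rmse / (true.max() - true.min())

# Toy data: step 1 is predicted exactly, step 2 carries a constant offset of 1.
true = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
pred = true + np.array([0.0, 1.0])
rmse = rmse_per_step(pred, true)
r2 = r2_per_step(pred, true)
share = range_normalized(rmse, true)
```

Reporting the metrics per step makes it possible to check whether the error grows gradually along the horizon or diverges abruptly.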
Figure 6 presents the actual trajectories together with the 1-step and 24-step predictions for four representative variables on a typical day, including grid import, CHP electric output, battery SOC, and hydrogen inventory. Overall, the proposed world model preserves the main temporal evolution patterns over the full 24-step rollout horizon. The 1-step predictions almost overlap with the actual trajectories for all variables, while the 24-step predictions still track the dominant trends with only moderate deviations. For the power-related variables, including grid import and CHP electric output, the 24-step prediction errors become slightly more visible around local peaks, valleys, and transition intervals, but the model still captures the overall dispatch pattern well. For the storage-related states, the prediction accuracy is even higher; the battery SOC trajectory, including the upper-bound saturation period and the subsequent discharge process, is reproduced almost exactly, and the hydrogen inventory prediction also remains closely aligned with the actual state evolution throughout the day. These results indicate that the learned world model maintains sufficiently high multi-step predictive fidelity to support the downstream receding-horizon dispatch optimization.
5.3. Representative-Day and Monthly Operational Results
Figure 7 and
Figure 8 illustrate the representative-day operating behaviors of the proposed GraphWorldModel_MPC. As shown in
Figure 7, the electricity-, heat-, and hydrogen-side balances are satisfied on an hourly basis, and the positive and negative stacked components remain well matched throughout the day, indicating that the obtained dispatch solution preserves multi-energy feasibility under the coupled operating constraints. On the electricity side, the adjusted electric load is jointly supported by grid import, CHP electric output, renewable generation, and battery discharge, while part of the electricity is simultaneously allocated to the electrolyzer and battery charging during selected hours. The CHP electric output remains relatively stable, providing a firm local supply backbone, whereas grid import exhibits stronger time variation and serves as the main balancing component in response to load fluctuations and renewable intermittency.
On the heat side, the thermal demand is mainly met by the coordinated contribution of CHP heat output and the gas-fired boiler, while TES is used as an additional temporal buffer. The CHP heat output stays at a relatively stable level over the day, whereas the gas-boiler provides the main flexibility to absorb residual thermal imbalance. Around the middle of the day, TES charging becomes active, indicating that part of the available thermal supply is shifted forward for later use. During the subsequent peak-demand period, TES discharge is activated to reduce the instantaneous dependence on direct thermal generation. This behavior confirms that the proposed dispatch strategy can exploit thermal storage to reshape intra-day heat supply without violating the heat-balance constraints. The hydrogen-side behavior exhibits a similar time-shifting pattern. As shown in
Figure 7c, hydrogen demand is met by the combined effect of electrolyzer-based hydrogen production and hydrogen tank discharge, while hydrogen tank charging is activated when production exceeds immediate demand. Correspondingly,
Figure 8c shows that the hydrogen inventory first increases and then decreases, reflecting a clear store-and-release pattern across the day. This indicates that the dispatch framework effectively coordinates electricity-to-hydrogen conversion with hydrogen storage to decouple, to some extent, instantaneous production from end-use demand.
Figure 8 further reveals the intertemporal roles of the three storage resources. The battery is charged during the early hours and discharged during the evening period, with its state of charge rising rapidly to the upper bound and later decreasing to the lower bound. This behavior suggests that the battery is mainly used for short-term electricity-side arbitrage and peak support. By contrast, TES is charged around midday and discharged over a narrower evening window, highlighting its role in smoothing thermal mismatch across adjacent hours. The hydrogen tank exhibits a longer-duration inventory cycle, with charging occurring in both the early and midday periods and discharging concentrated in the later hours. Taken together, these results show that the proposed framework allocates battery, TES, and hydrogen storage to distinct yet coordinated temporal functions, thereby improving the flexibility of the integrated electricity–heat–hydrogen system.
Overall, the representative-day results demonstrate that the proposed method can simultaneously maintain strict multi-energy balance and exploit cross-carrier storage flexibility to realize coordinated intra-day dispatch. The operational patterns observed in
Figure 7 and
Figure 8 also provide a physical explanation for the favorable month-scale performance reported later, namely that the learned world-model-driven rolling MPC can transform short-term renewable availability and storage flexibility into feasible and economically effective multi-energy scheduling decisions.
5.4. Comparative Dispatch Performance
Conventional EMPC denotes a deterministic economic MPC method, in which the future state trajectory is generated by a nominal predictive model and the economic objective is optimized over the receding horizon. DRAMPC denotes a distributionally robust adaptive MPC method, which incorporates uncertainty-aware adjustment into the rolling optimization process and is therefore more conservative than deterministic EMPC under forecast mismatch. GRU-MPC replaces the nominal predictive model with a recurrent neural network predictor, but it treats the IEHES state as a sequence-only representation and does not explicitly encode the directed electricity–heat–hydrogen topology or physics-consistency constraints. These benchmark methods were implemented by following the model structures, parameter settings, and tuning principles reported in their corresponding reference papers as closely as possible. For parameters that were not explicitly specified in the original references or had to be adapted to the studied IEHES scale, the validation segment was used for selection, and the held-out monthly test segment was not used for tuning.
For fairness, all compared methods were evaluated under the same industrial-park IEHES dataset, hourly resolution, 24-step MPC horizon, device operating constraints, multi-energy balance constraints, cost-function components, and evaluation metrics. Therefore, the comparison does not rely on arbitrarily weakened baselines, but follows a reference-guided reproduction protocol under a unified dispatch environment. By contrast, the proposed GraphWorldModel_MPC uses a topology-aware graph spatiotemporal world model with multi-step rollout learning and physics-consistency regularization, and further embeds quantile-based safety tightening into the hard-constrained economic MPC. Therefore, although all compared methods follow the economic MPC principle, they differ in the internal prediction model, uncertainty-handling mechanism, and constraint-tightening strategy.
As shown in
Table 5, the proposed method achieves the lowest total operating cost of 5,757,859 RMB, outperforming Conventional EMPC, DRAMPC, and GRU-MPC by 6.07%, 3.83%, and 10.79%, respectively. This advantage can be explained by the different predictive and constraint-handling mechanisms of the compared MPC formulations. Conventional EMPC relies on nominal dynamics and therefore has limited adaptability when renewable generation, load demand, and storage states evolve under coupled electricity–heat–hydrogen interactions. As a result, it yields higher grid import, gas cost, carbon emissions, and penalty cost than the proposed method.
DRAMPC improves robustness by considering uncertainty in the rolling optimization process, and therefore performs better than Conventional EMPC in terms of total cost and penalty cost. However, its robust treatment tends to be conservative and does not explicitly learn the structured spatiotemporal dynamics of the IEHES. Consequently, although DRAMPC reduces constraint-related penalties relative to Conventional EMPC, it still produces higher grid import, gas cost, and carbon cost than GraphWorldModel_MPC.
GRU-MPC introduces a learning-based predictive model and can capture part of the temporal dependency in the operational data. Nevertheless, its sequence-only structure ignores the directed topology and cross-carrier coupling paths among the electricity, heat, and hydrogen subsystems. This limitation leads to less accurate multi-step rollout trajectories and weaker constraint satisfaction during closed-loop dispatch, as reflected by its highest total cost, highest carbon emissions, and largest penalty cost among the compared methods.
By contrast, the proposed GraphWorldModel_MPC combines topology-aware graph encoding, temporal rollout prediction, physics-consistency regularization, and quantile-based safety tightening. These components improve the quality of predicted trajectories and reduce the probability of constraint violation during MPC execution. Therefore, the proposed method achieves lower grid import, gas consumption, carbon emissions, and penalty cost simultaneously. The results indicate that the performance gain does not come merely from using an economic MPC objective, but from improving the predictive kernel and feasibility-preserving mechanism embedded inside the MPC formulation.
5.5. Robustness, Ablation, and Sensitivity
To investigate the robustness of the proposed framework under forecast uncertainty, multi-source uncertainty tests were conducted over the monthly evaluation horizon. In the implementation, forecast errors were introduced into photovoltaic output, wind output, electric load, thermal load, and hydrogen demand, respectively. For each uncertainty level, namely ±5%, ±10%, and ±15%, random relative error factors were independently generated within the corresponding uncertainty range and applied to the above renewable and demand inputs. The rolling MPC dispatch was then repeatedly evaluated under the generated uncertainty realizations, and the results reported in
Table 6 are the averaged values over these repeated tests. Therefore, the reported robustness results reflect the average closed-loop dispatch performance under joint renewable-side and multi-energy demand-side uncertainty, rather than a single deterministic perturbation case.
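The perturbation procedure described above can be sketched as follows. The uniform distribution of the relative error factors, their independence across time steps, the number of repetitions, and the `evaluate` callable (a placeholder for one closed-loop MPC run returning a scalar metric) are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np

def perturb_forecasts(forecasts, level, rng):
    """Apply independent relative errors within [-level, +level] to each input series.

    forecasts: dict mapping an input name to a 1-D forecast array
    level:     uncertainty level, e.g. 0.05, 0.10, or 0.15
    """
    perturbed = {}
    for name, series in forecasts.items():
        series = np.asarray(series, dtype=float)
        factors = 1.0 + rng.uniform(-level, level, size=series.shape)
        perturbed[name] = series * factors
    return perturbed

def average_metric(evaluate, forecasts, level, n_runs=20, seed=0):
    """Average a closed-loop metric over repeated uncertainty realizations."""
    rng = np.random.default_rng(seed)
    results = [evaluate(perturb_forecasts(forecasts, level, rng)) for _ in range(n_runs)]
    return float(np.mean(results))

# Toy usage: the "metric" is simply the total perturbed PV energy over one day.
forecasts = {"pv": np.full(24, 10.0), "electric_load": np.full(24, 30.0)}
sample = perturb_forecasts(forecasts, level=0.10, rng=np.random.default_rng(0))
avg_pv = average_metric(lambda f: f["pv"].sum(), forecasts, level=0.10)
```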
Table 6 shows that the proposed GraphWorldModel_MPC consistently exhibits the smallest performance degradation among the compared methods. As the uncertainty level increases from ±5% to ±15%, the monthly total cost of the proposed method rises from 5.834 million RMB to 6.018 million RMB, corresponding to a relative increase of 1.32–4.52% over the nominal case, while the penalty cost remains limited and the dispatch violation rate stays close to zero. By contrast, DRAMPC shows a larger cost increase and a higher violation rate under stronger uncertainty, and GRU-MPC degrades more visibly, especially at the ±15% level. These results indicate that the proposed framework preserves stronger closed-loop robustness against multi-source forecast mismatch, which can be attributed to its more accurate multi-step state prediction, learned transition dynamics embedded in the MPC, and quantile-based safety tightening.
After the robustness tests, ablation studies were further conducted to examine the contribution of the main modeling components. Specifically, the graph encoder in the world model is replaced by a sequence-only recurrent predictor with the same MPC layer, denoted as SeqWorldModel_MPC. As reported in Table 7, removing the graph representation increases the monthly total cost from 5.758 million RMB to 5.913 million RMB, together with higher grid import, higher equivalent carbon emissions, and a larger penalty cost. In addition, removing temporal encoding further increases the total cost to 5.946 million RMB and the penalty cost to 5936 RMB, confirming that the historical trajectory information is necessary for capturing storage carry-over effects, ramping history, and lagged exogenous variations. When the multi-step rollout training is removed, the total cost also increases to 5.889 million RMB, indicating that one-step training alone is insufficient to match the recursive prediction mode used in MPC. Overall, these results demonstrate that the graph representation, temporal encoding, and multi-step rollout training jointly improve the dynamic representation quality of the world model and lead to more effective rolling dispatch decisions. In addition, the model size and rollout horizon remain moderate under the adopted settings in Table 3, so the additional computational burden introduced by graph encoding and temporal rollout is acceptable for hourly rolling dispatch.
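The gap between one-step training and the recursive prediction mode used in MPC can be illustrated on a toy example. The sketch below is not the world model of this paper: it uses a hypothetical scalar storage state with true dynamics x' = 0.90x + u and a slightly biased learned model x' = 0.95x + u, and all function names are introduced here for illustration. It compares the teacher-forced one-step error with the error of a 24-step recursive rollout.

```python
import numpy as np

def one_step_loss(step_fn, states, inputs):
    """Teacher-forced MSE: each prediction starts from the true previous state."""
    preds = np.array([step_fn(x, u) for x, u in zip(states[:-1], inputs)])
    return float(np.mean((preds - states[1:]) ** 2))

def rollout_loss(step_fn, states, inputs):
    """Recursive MSE: each prediction is fed back as the next input state."""
    x = states[0]
    sq_errors = []
    for u, target in zip(inputs, states[1:]):
        x = step_fn(x, u)  # the prediction replaces the true state
        sq_errors.append((x - target) ** 2)
    return float(np.mean(sq_errors))

def true_step(x, u):
    """Hypothetical true dynamics of a scalar storage state."""
    return 0.90 * x + u

def model_step(x, u):
    """Hypothetical learned model with a small coefficient bias."""
    return 0.95 * x + u

# Build a 24-step reference trajectory from the true dynamics.
inputs = np.full(24, 0.1)
states = [1.0]
for u in inputs:
    states.append(true_step(states[-1], u))
states = np.array(states)

one_step = one_step_loss(model_step, states, inputs)
recursive = rollout_loss(model_step, states, inputs)
```

In this toy case the same model bias produces a larger recursive error than one-step error because the mismatch accumulates along the horizon, which is the effect a multi-step rollout objective penalizes directly.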
5.6. Role of Constraint-Aligned Training and Quantile Tightening
To quantify the practical contribution of constraint-aligned training and quantile-based safety tightening within the proposed framework, two degraded variants were further tested. The first variant, GraphWorldModel_MPC w/o physics-consistency, removes the balance-, storage-boundary-, and ramping-consistency terms from world model training. This case is used only as an ablation setting to evaluate the effect of constraint-aligned training, rather than as a practical dispatch configuration. The second variant, GraphWorldModel_MPC w/o quantile tightening, solves the MPC problem using the nominal state and power bounds without uncertainty-aware tightening. The comparison results are summarized in Table 8.
Removing the physics-consistency regularization leads to a clear deterioration in both predictive quality and dispatch performance. As shown in Table 8, the average 24-step NRMSE increases from 4.28% to 6.11%, while the monthly total cost and penalty cost rise from 5.758 million RMB and 2479 RMB to 5.868 million RMB and 8236 RMB, respectively. This indicates that the constraint-aligned training terms help guide the learned rollout trajectories toward the physically feasible manifold of the IEHES, thereby improving both prediction stability and downstream dispatch performance.
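As a rough illustration of what balance-, storage-boundary-, and ramping-consistency terms can look like, the following sketch evaluates three quadratic penalties on a predicted trajectory. The functional form, the equal treatment of the three terms, and every name in the code are assumptions made for illustration; the regularization actually used in training is the one defined in the model formulation of this paper.

```python
import numpy as np

def physics_consistency_penalty(supply, demand, soc, power,
                                soc_min, soc_max, ramp_max):
    """Return illustrative penalty terms for one predicted trajectory.

    supply, demand: predicted totals per step for one energy carrier
    soc:            predicted storage state per step
    power:          predicted unit output per step
    """
    supply, demand, soc, power = (
        np.asarray(a, dtype=float) for a in (supply, demand, soc, power)
    )
    # Balance term: squared mismatch between supply and demand.
    balance = float(np.mean((supply - demand) ** 2))
    # Storage-boundary term: squared excursion outside [soc_min, soc_max].
    below = np.maximum(soc_min - soc, 0.0)
    above = np.maximum(soc - soc_max, 0.0)
    boundary = float(np.mean(below ** 2 + above ** 2))
    # Ramping term: squared step-to-step change beyond ramp_max.
    ramp_excess = np.maximum(np.abs(np.diff(power)) - ramp_max, 0.0)
    ramping = float(np.mean(ramp_excess ** 2))
    return {"balance": balance, "boundary": boundary, "ramping": ramping}
```

Each term is zero for a trajectory that satisfies the corresponding physical relation, so adding a weighted sum of such terms to the prediction loss penalizes only physically inconsistent rollouts.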
The comparison between GraphWorldModel_MPC w/o quantile tightening and the complete GraphWorldModel_MPC directly reflects the effect of non-tightened versus tightened MPC. Both settings use the same predictive model and economic objective, but the non-tightened variant uses nominal state and power bounds without prediction-error margins. Although its prediction accuracy remains almost unchanged, with the average 24-step NRMSE changing only from 4.28% to 4.31%, the penalty cost increases from 2479 RMB to 14,582 RMB and the dispatch violation rate rises from 0.00% to 0.97%. This confirms that quantile-based tightening mainly improves closed-loop feasibility by reserving executable safety margins inside the MPC feasible region, rather than by improving nominal prediction accuracy.
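The tightening mechanism can be sketched for a single bounded quantity. The sketch assumes that margins are taken from empirical quantiles of historical prediction residuals (actual minus predicted) at a confidence level `alpha`; the function name, the residual definition, and the choice of `alpha` are illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np

def tightened_bounds(lower, upper, residuals, alpha=0.95):
    """Shrink [lower, upper] by empirical quantiles of prediction residuals.

    residuals = actual - predicted. If the prediction stays inside the
    tightened interval, the actual value stays inside the nominal interval
    with empirical frequency of about alpha on each side.
    """
    residuals = np.asarray(residuals, dtype=float)
    # Upper side: the actual value may exceed the prediction by this margin.
    upper_margin = max(float(np.quantile(residuals, alpha)), 0.0)
    # Lower side: the actual value may fall below the prediction by this margin.
    lower_margin = max(float(-np.quantile(residuals, 1.0 - alpha)), 0.0)
    new_lower = lower + lower_margin
    new_upper = upper - upper_margin
    if new_lower > new_upper:
        raise ValueError("margins exceed the nominal feasible range")
    return new_lower, new_upper
```

For example, with residuals spread evenly over [-0.05, 0.05] and `alpha=0.9`, a nominal storage range of [0.1, 0.9] shrinks to about [0.14, 0.86]. Larger residual quantiles produce narrower bounds, which is the source of the conservatism noted when prediction errors are overestimated.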
5.7. Comparison with Advanced Time-Series Prediction Baselines
To position the proposed predictor relative to advanced time-series forecasting methods, additional experiments were conducted under a unified predictor-to-MPC interface. The historical window, prediction horizon, input variables, and downstream hard-constrained MPC were kept identical for all methods, while only the internal predictor was replaced by representative time-series baselines, including LSTM, TCN, Transformer, and DLinear. In this way, the comparison focuses on the quality of the predictive model itself rather than differences in the downstream optimizer.
Table 9 reports the comparative results. The proposed GraphWorldModel_MPC achieved the lowest average 24-step NRMSE of 4.28% and the highest average 24-step R² of 0.970 among all compared predictors. More importantly, the predictive advantage was consistently translated into downstream operational gains. Under the same MPC layer, the proposed method reduced the monthly total cost to 5,757,859 RMB, the penalty cost to 2479 RMB, and the grid import to 3060.29 MWh, while also yielding the lowest equivalent carbon emissions and zero reported dispatch violations in the evaluated month. By contrast, the sequence-only and general-purpose time-series baselines exhibited higher multi-step prediction errors, which led to increased cost, larger penalty terms, and weaker closed-loop feasibility.
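For reference, the two accuracy scores can be computed as in the following sketch. It assumes that NRMSE is normalized by the range of the observed values and that the second score reported alongside NRMSE is the coefficient of determination; other normalizations (for example, by the mean) are also common, and the convention used in the experiments may differ.

```python
import numpy as np

def nrmse(y_true, y_pred):
    """Root-mean-square error divided by the range of the observed values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / (y_true.max() - y_true.min()))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

An average 24-step score would then be obtained by evaluating these functions at each rollout step and averaging over the 24 steps and the predicted variables.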
These results indicate that the gain of the proposed framework does not come merely from using a nonlinear sequence predictor. Instead, it arises from the joint use of topology-aware spatial encoding, temporal rollout learning, and physics-consistency regularization. Such a design is particularly beneficial in the industrial-park IEHES, where the state evolution is jointly shaped by structured electricity–heat–hydrogen coupling and intertemporal storage dynamics.
Nevertheless, the results should be interpreted within the tested operating conditions. The performance gain of GraphWorldModel_MPC is most evident when the system exhibits strong cross-carrier coupling, storage carry-over effects, and non-negligible forecast deviations. Under simpler operating conditions with weak coupling or highly accurate forecasts, the advantage of graph-based rollout learning and quantile tightening may become less pronounced. In addition, the improved feasibility is obtained by narrowing selected state and power bounds, which may introduce moderate conservatism when prediction errors are overestimated. Therefore, the proposed framework is most suitable for IEHES dispatch scenarios where predictive accuracy, constraint feasibility, and uncertainty handling must be considered simultaneously.