Abstract
The rapid growth of Distributed Energy Resources (DERs) is eroding operating margins in distribution networks, which calls for coordination that is both predictive and safe. This paper presents a closed-loop framework that combines three components: a topology-aware Spatio-Temporal Transformer (STT) for multi-horizon forecasting, a cooperative multi-agent reinforcement learning (MARL) controller trained under Centralized Training and Decentralized Execution (CTDE), and a real-time safety layer that enforces feeder limits via sensitivity-based quadratic programming. The framework is evaluated on three SimBench feeders with hybrid on-load tap changer (OLTC) and capacitor control, under a stress protocol that amplifies peak demand and mid-day PV generation. Compared to baselines, it reduces tail voltage violations, measured at the 99th-percentile voltage deviation, by 31% and 56%, and lowers branch overload rates by 71% and 90%. It also reduces discrete switching while remaining real-time feasible and cost-efficient, outperforming rule-based, optimization-based, model predictive control (MPC), and learning-based baselines. Stress maps reveal robustness envelopes and identify MV–LV bottlenecks, and ablation studies show that both diffusion-based priors and coordination contribute to the performance gains. The paper also provides a convergence analysis and a suboptimality decomposition, offering a practical pathway to scalable, safe, and interpretable DER coordination.