Multi-Source Data Fusion-Based Grid-Level Load Forecasting

Ye, Hai; Teng, Xiaobi; Song, Bingbing; Zou, Kaiming; Zhu, Moyan; He, Guangyu

doi:10.3390/app15094820


by Hai Ye ¹, Xiaobi Teng ¹, Bingbing Song ¹, Kaiming Zou ²,*, Moyan Zhu ² and Guangyu He ²

¹ East China Branch of State Grid Corporation of China, Pudian Road, Shanghai 200120, China

² The Ministry of Education Key Laboratory of Control of Power Transmission and Conversion, Department of Electrical Engineering, Shanghai Jiao Tong University, Dongchuan Road, Minhang, Shanghai 200240, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(9), 4820; https://doi.org/10.3390/app15094820

Submission received: 17 March 2025 / Revised: 18 April 2025 / Accepted: 21 April 2025 / Published: 26 April 2025

(This article belongs to the Special Issue State-of-the-Art of Power Systems)


Abstract

This paper introduces a weighted fusion methodology for grid-level short-term load forecasting that addresses the limitations of the direct aggregation methods currently used by regional dispatch centers. Traditional approaches sum provincial forecasts without considering regional heterogeneity in load characteristics, data quality, and forecasting capabilities. Our methodology implements a comprehensive evaluation index system that quantifies forecast trustworthiness along three dimensions: forecast reliability, provincial impact, and forecasting complexity. The core innovation is a principal component analysis (PCA)-based weighted aggregation mechanism that dynamically adjusts provincial weights according to their evaluated reliability, further enhanced by time-varying weights that adapt to changing load patterns throughout the day. Experimental validation across three representative seasonal periods (moderate temperature, high temperature, and winter conditions) shows that the weighted fusion approach consistently outperforms direct aggregation, achieving a 24.67% improvement in overall MAPE (from 3.09% to 2.33%). Performance gains are particularly significant during critical peak periods, with up to 62.6% error reduction under high-temperature conditions. The methodology demonstrates strong adaptability across temporal scales, seasonal variations, and regional characteristics, maintaining superior performance from ultra-short-term (1 h) to medium-term (168 h) forecasting horizons. Analysis of provincial weight dynamics reveals a redistribution of weights across seasons, with summer months characterized by Jiangsu dominance (0.30–0.35) shifting to increased Anhui contribution (0.30–0.35) during winter. Our approach provides grid dispatch centers with a computationally efficient solution for integrating heterogeneous forecasts from diverse regions, leveraging the complementary strengths of individual provincial systems while supporting safer and more economical power system operations without requiring modifications to existing forecasting infrastructure.

Keywords:

short-term load forecasting; data fusion; principal component analysis; weighted aggregation; multi-regional integration

1. Background and Motivation

The rapid growth of electricity demand across national power grids has significantly outpaced forecasts in recent years, creating unprecedented challenges for grid operations. As illustrated by regional grid statistics, the Eastern China region—comprising Shanghai, Anhui, Zhejiang, and Jiangsu provinces—reached nearly 400 million kilowatts in peak load during summer 2023. Projections for 2024 indicate an additional 80 million kilowatt increase, representing a 5.8% growth rate. This accelerating demand, potentially exacerbated by extreme weather events, creates substantial pressure on power supply systems.

Simultaneously, China’s transition toward carbon neutrality has accelerated the integration of renewable energy sources, fundamentally altering the traditional power structure. By the end of 2025, the Eastern China power grid expects installed renewable capacity to reach 204.18 million kilowatts—a 94.8 million kilowatt increase from 2022, representing an annual growth rate of 23.4%. This dramatic shift toward cleaner energy introduces significant variability and uncertainty into grid operations.

In this evolving landscape, regional grid-level dispatching centers face three significant forecasting challenges that motivate our research:

First, grid-level forecasting typically relies on the direct accumulation of load forecasting data from provincial and municipal centers. However, significant regional disparities exist in economic development levels, industrial structures, and consumer behaviors, resulting in fundamentally different load patterns [1]. For instance, service-oriented Shanghai exhibits higher load levels during holidays and weekends, while manufacturing-intensive Jiangsu exhibits more uniform electricity demand. Weather impacts also vary substantially across regions: coastal provinces experience different meteorological influences than inland areas. These differences make simple aggregation methods inadequate for accurate grid-level forecasting.

Second, heterogeneous data from diverse provincial sources create integration complexities that challenge existing systems. Each province employs different forecasting methodologies with varying algorithms, data formats, and update frequencies [2].

Third, provincial data reporting exhibits substantial time delays and quality variations. Current grid-level dispatch centers integrate provincial forecasts using simple aggregation methods that fail to account for these temporal inconsistencies [3]. This approach provides limited flexibility for adapting to the dynamic nature of modern power systems, particularly as renewable energy penetration increases [4].

In this challenging environment, accurate load forecasting remains essential for regional electricity market operations, safety verification, and demand response planning [5,6,7]. While individual provinces may achieve reasonable forecast accuracy within their boundaries, the integration of these forecasts at the grid level presents a distinct challenge that has received insufficient attention in existing research.

1.1. Literature Review and Research Gaps

Current short-term load forecasting approaches can be categorized into three main groups: statistical methods, data analysis techniques, and machine learning algorithms. While extensive research has advanced single-region forecasting capabilities, significant gaps remain in multi-regional integration methodologies, particularly for grid-level applications [8,9].

Statistical approaches represent the foundation of traditional load forecasting methods, valued for their mathematical interpretability and established performance records [10]. These methods primarily focus on extracting temporal trends from historical data to generate forecasts [11]. However, as power systems grow more complex with increasing renewable integration, these traditional statistical approaches face limitations in modeling highly nonlinear relationships and accommodating multiple influential factors simultaneously [12].

Data analysis methods focus on transforming input load sequences to enhance feature extraction and improve prediction accuracy. Tai et al. [13] applied wavelet analysis to decompose load series into multiple frequency bands, modeling each component separately before reconstruction, which significantly improved forecast accuracy during transitional periods. Similarly, Zhang et al. [14] utilized complementary ensemble empirical mode decomposition to extract frequency features from similar-day data, effectively capturing both long-term trends and short-term fluctuations. More recent approaches have employed variational mode decomposition to reduce load sequence volatility and facilitate feature extraction [15]. Zhang et al. [16] integrated this technique with convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to forecast agricultural greenhouse load consumption with improved accuracy. Zhu et al. [17] developed a comprehensive clustering method for characterizing user load profiles and assessing their regulation potential in demand response applications. These data analysis approaches demonstrate that appropriate preprocessing and feature enhancement can significantly improve forecasting performance, particularly when dealing with the complex multi-source data typical of grid-level applications [18]. However, most existing studies focus on single-region applications, with limited exploration of how these techniques can be adapted to multi-regional integration challenges.

Machine learning algorithms have revolutionized load forecasting by offering powerful tools for modeling complex nonlinear relationships without requiring explicit mathematical formulations [19]. These approaches have evolved from traditional machine learning methods to sophisticated deep learning architectures specifically designed for temporal sequence modeling.

Neural network approaches, particularly deep learning variants, have shown remarkable progress in recent years. Recurrent neural networks (RNNs) were among the first deep architectures applied to time-series forecasting [20]. Later, long short-term memory (LSTM) networks emerged as a superior alternative by addressing the vanishing gradient problem inherent in standard RNNs [21]. Semmelmann et al. [22] combined LSTM with extreme gradient boosting (XGBoost) to create a hybrid prediction model that achieved higher accuracy than either method alone.

The Transformer architecture, originally developed for natural language processing by Vaswani et al. [23], has recently been adapted for time-series forecasting with notable success. Transformers employ self-attention mechanisms to evaluate sequence features, enabling them to better capture long-range dependencies in temporal data [24]. Wang et al. [25] utilized a self-attention encoder combined with deep neural networks for short-term net load forecasting, demonstrating excellence in both deterministic and interval predictions.

Recent innovations include the integration of Transformers with other architectures to enhance performance. Zhang et al. [26] proposed a Transformer-CNN fusion model that combines CNN’s multi-feature fusion capabilities with Transformer’s temporal modeling strengths, significantly improving prediction accuracy and multi-trend sequence feature extraction. Meng et al. [27] combined Transformers with spatiotemporal graph convolutional networks for short-term load forecasting in integrated energy systems, achieving high prediction accuracy and stability.

Data fusion approaches are especially relevant for grid-level forecasting, as they provide systematic frameworks for integrating heterogeneous information from multiple sources [28].

Probabilistic methods like Monte Carlo simulation have been widely used for analyzing uncertainty in power systems. Wang et al. [29] employed Monte Carlo techniques to simulate traveler behavior and assess electric vehicle charging requirements.

Artificial intelligence-based fusion approaches have gained prominence for their ability to discover complex relationships in heterogeneous data. Fan et al. [30] developed a multi-source data-and-model fusion method for ultra-short-term bus load forecasting that extracts feature vectors from numerical and non-numerical data using backpropagation artificial neural networks, combining these with image vectors from CNNs for improved prediction.

Some researchers have adopted modular approaches that first complete target perception, then perform association fusion, and finally generate comprehensive decision results [31]. Lu et al. [32] proposed an epidemic-informed load forecasting model based on extreme gradient boosting that incorporates social impact factors, demonstrating short training times and low deviation rates during the COVID-19 pandemic.

At the output level, fusion methods can combine multiple forecast results to improve overall accuracy. Long et al. [33] introduced a fine-grained data fusion approach for load forecasting that generates hourly predictions using multivariate linear regression, derives load scenarios based on independent variable uncertainties, and aggregates these from bottom to top to produce monthly forecasts. Yang et al. [34] investigated the effects of heavy traffic on steel slag asphalt mixtures, finding that high-temperature stability and skid resistance performance could be optimized with specific steel slag content ratios, providing insights into how similar optimization principles might be applied to multi-regional load forecasting integration.

Research Gaps in Grid-Level Forecasting Integration

Despite extensive research in various forecasting methodologies, several critical gaps remain in grid-level load forecast integration.

Most research focuses on single-region forecasting techniques, with limited exploration of multi-source data integration strategies essential for grid-level applications [35]. The unique challenges of combining forecasts from regions with heterogeneous characteristics remain insufficiently addressed.

Current multi-regional approaches typically employ simple accumulation methods that fail to address data heterogeneity and quality variations across provinces, treating all regional forecasts equally without considering their relative reliability.

Few studies have developed systematic evaluation frameworks for quantifying forecast reliability across regions with varying characteristics, making it difficult to objectively assess the trustworthiness of individual forecasts.

Existing research lacks comprehensive weighting methodologies that dynamically adjust the importance of regional forecasts based on quantifiable metrics and adapt to changing conditions [36].

These gaps highlight the need for a more sophisticated approach to grid-level forecast integration that systematically evaluates and dynamically combines regional forecasts. Our proposed weighted fusion methodology addresses these challenges by introducing a comprehensive evaluation framework and PCA-based weighting mechanism specifically designed for grid-level applications.

1.2. Research Contributions

To address these gaps, this paper makes the following contributions:

(1) We propose a comprehensive evaluation index system that quantifies the trustworthiness of forecasts from different provinces by considering three key dimensions: forecast history reliability, regional impact significance, and forecasting complexity.

(2) We develop a novel data fusion methodology based on principal component analysis (PCA) that dynamically weights provincial forecasts according to their evaluated reliability and significance, effectively addressing the heterogeneity challenge in multi-source grid-level forecasting.

(3) We implement and validate a complete weighted fusion framework that integrates with existing forecasting systems without requiring provincial centers to modify their current methodologies.

(4) We demonstrate through case studies that our weighted fusion approach significantly outperforms direct accumulation methods, achieving a 2.33% average relative error in multi-regional forecasting.

1.3. Paper Organization

The remainder of this paper is organized as follows: Section 2 discusses the theoretical framework and methodologies employed, including our multi-criteria evaluation system and PCA-based fusion algorithm. Section 3 details the implementation steps of our forecasting framework, covering data preprocessing, feature engineering, model integration, and forecast fusion. Section 4 presents experimental results and comparative analyses across multiple regions and weighting scenarios. Finally, Section 5 concludes this paper with key findings and directions for future research.

Our research provides grid dispatch centers with a practical solution for enhancing forecast accuracy while effectively managing the challenges of multi-source data heterogeneity, ultimately supporting safer and more economical power system operations.

2. Materials and Methods

2.1. Challenges of Grid-Level Power Load Forecasting Integration

Grid-level load forecasting presents unique integration challenges beyond those encountered in single-region forecasting. The primary challenge stems from multi-source data heterogeneity across diverse provinces and cities. Regional dispatch centers must contend with forecasts generated using different methodologies, data formats, and update frequencies, making direct aggregation problematic.

Economic development disparities and varying resource investments across provinces result in significant differences in forecasting systems and capabilities. These differences manifest in non-uniform data collection methods, inconsistent data structures, variable update frequencies, and transmission delays. Such heterogeneity necessitates sophisticated data alignment and integration techniques that can account for these variations while preserving forecast accuracy.

Additionally, effective grid-level forecasting requires comprehensive understanding of the distinct load characteristics across provinces and their relative impact on the overall grid. A robust integration framework must not only handle complex multi-source data but also adaptively weight provincial forecasts based on their reliability, significance, and forecasting complexity.

2.2. Provincial Forecasting Methodologies

Provincial dispatch centers employ a variety of forecasting methodologies depending on their specific needs, data availability, and technical capabilities. These methods generally fall into three categories:

Traditional statistical models like ARIMA and Holt-Winters capture temporal patterns through autoregression, differencing, and seasonal components. While effective for stable load patterns, they struggle with complex nonlinear relationships and extreme weather conditions.

Machine learning approaches, including LSTM and GRU networks, offer improved capabilities for modeling nonlinear relationships and long-term dependencies in load data. These models can better capture complex seasonality and special event effects but require significant computing resources.

Advanced attention-based architectures like Transformer variants provide enhanced feature extraction capabilities by modeling relationships between multiple variables simultaneously. These models excel in handling complex forecasting scenarios with both long-term trends and short-term fluctuations.

Rather than focusing on any specific model, our research addresses the critical challenge of integrating these diverse provincial forecasts through a systematic evaluation and weighted fusion methodology.

2.3. Comprehensive Evaluation Index System

The foundation of our weighted fusion approach is a hierarchical evaluation index system that quantifies the reliability and significance of provincial forecasts. This system employs a three-level structure with three primary indicators: forecast reliability, provincial load impact on the main grid, and provincial forecasting complexity. Figure 1 illustrates the steps for constructing this evaluation system.

The primary purpose of constructing this evaluation index system is to quantitatively assess the reliability of load forecasting results from different provinces, providing a basis for determining weights in forecast fusion. The three-level structure is detailed in Table 1.

Key metrics within this system include the following:

1. Data Completeness Rate (DCR):

$$\mathrm{DCR}_{t} = \frac{N_{\mathrm{valid}}}{N_{\mathrm{total}}} \times 100\% \tag{1}$$

where $N_{\mathrm{valid}}$ is determined by multiple validation functions including basic validity, anomaly detection, time alignment, and rate change checks:

$$\mathrm{Valid}(x) = \mathrm{Valid}_{\mathrm{basic}}(x) \land \mathrm{Valid}_{\mathrm{anomaly}}(x) \land \mathrm{Valid}_{\mathrm{time}}(x) \land \mathrm{Valid}_{\mathrm{rate}}(x) \tag{2}$$

2. Data Timeliness (DT):

$$\mathrm{DT}_{t} = \max\left(0, 1 - \frac{\Delta t}{\tau}\right) \tag{3}$$

where $\Delta t$ is the actual data delay time, and $\tau$ is the maximum allowable delay threshold.

3. Data Consistency (DC):

$$\mathrm{DC} = \mathrm{DC}_{\mathrm{format}} \times \mathrm{DC}_{\mathrm{logic}} \tag{4}$$

where format consistency and logical consistency are assessed separately:

$$\mathrm{DC}_{\mathrm{format}} = 1 - \frac{N_{\mathrm{format}} + N_{\mathrm{structure}} + N_{\mathrm{sampling}}}{N_{\mathrm{total}}} \tag{5}$$

$$\mathrm{DC}_{\mathrm{logic}} = \frac{1}{n} \sum_{i=1}^{n} \exp\left(-\frac{|x_{i} - \mu_{i}|}{\sigma_{i}}\right) \tag{6}$$
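The three data-quality metrics in Eqs. (1) to (6) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the representation of the four validity checks as a tuple of booleans, and all sample values are assumptions made for the example.

```python
import math

def data_completeness_rate(flags):
    """Eq. (1): percentage of points that pass every validity check.

    `flags` holds one 4-tuple of booleans per data point, ordered
    (basic, anomaly, time, rate) to mirror the conjunction in Eq. (2).
    """
    n_valid = sum(1 for checks in flags if all(checks))
    return 100.0 * n_valid / len(flags)

def data_timeliness(delay, max_delay):
    """Eq. (3): 1 for no delay, falling linearly to 0 at the threshold."""
    return max(0.0, 1.0 - delay / max_delay)

def format_consistency(n_format, n_structure, n_sampling, n_total):
    """Eq. (5): one minus the share of records with a format problem."""
    return 1.0 - (n_format + n_structure + n_sampling) / n_total

def logic_consistency(values, means, stds):
    """Eq. (6): mean of exp(-|x_i - mu_i| / sigma_i)."""
    terms = [math.exp(-abs(x - mu) / sd)
             for x, mu, sd in zip(values, means, stds)]
    return sum(terms) / len(terms)

def data_consistency(dc_format, dc_logic):
    """Eq. (4): product of the format and logic components."""
    return dc_format * dc_logic

# Hypothetical day of 96 points: two fail one check each.
flags = [(True, True, True, True)] * 94
flags += [(True, False, True, True), (False, True, True, True)]

dcr = data_completeness_rate(flags)
dt = data_timeliness(delay=5.0, max_delay=20.0)
dc_fmt = format_consistency(1, 0, 1, 96)
dc_log = logic_consistency([100.0, 102.0], [100.0, 100.0], [2.0, 2.0])
dc = data_consistency(dc_fmt, dc_log)
```

Because Eq. (4) is a product, a low value in either the format or the logic component pulls the overall consistency score down.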

The entropy weight method is used to calculate indicator weights and aggregate them hierarchically. For the $k$-th secondary indicator with $m_{k}$ tertiary indicators and $n$ evaluation objects (provinces), the calculation process is as follows:

1. Standardize tertiary indicators:

$$y_{ij} = \frac{x_{ij} - \min_{i} x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}} \tag{7}$$

2. Calculate entropy value for each tertiary indicator:

$$e_{j} = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij} \tag{8}$$

where

$$p_{ij} = \frac{y_{ij}}{\sum_{i=1}^{n} y_{ij}} \tag{9}$$

3. Calculate tertiary indicator weights:

$$w_{j} = \frac{1 - e_{j}}{\sum_{j=1}^{m_{k}} (1 - e_{j})} \tag{10}$$

4. Calculate secondary indicator scores:

$$S_{i,k} = \sum_{j=1}^{m_{k}} w_{j} y_{ij} \tag{11}$$

5. Apply entropy method to secondary indicators to obtain primary indicator scores:

$$S_{i} = \sum_{k=1}^{K} w_{k} S_{i,k} \tag{12}$$
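Steps 1 to 4 of the entropy weight method can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the sample scores are hypothetical, the convention 0 · ln 0 = 0 is used for the zeros that min-max scaling always produces, and a constant indicator column is treated as carrying no information. Neither convention is specified in the text.

```python
import math

def entropy_weights(x):
    """Entropy weights for a table x[i][j]: n objects by m indicators.

    Returns (weights, scores): one weight per indicator (Eq. 10) and
    one aggregated score per object (Eq. 11). Assumes n >= 2.
    """
    n, m = len(x), len(x[0])

    # Eq. (7): min-max standardization of each indicator column.
    y = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in x]
        lo, hi = min(col), max(col)
        if hi > lo:
            for i in range(n):
                y[i][j] = (x[i][j] - lo) / (hi - lo)

    # Eqs. (8)-(9): information content 1 - e_j of each indicator.
    info = []
    for j in range(m):
        total = sum(y[i][j] for i in range(n))
        if total == 0.0:
            info.append(0.0)  # assumption: constant column gets zero weight
            continue
        ent = 0.0
        for i in range(n):
            p = y[i][j] / total
            if p > 0.0:  # convention: 0 * ln 0 = 0
                ent -= p * math.log(p)
        info.append(1.0 - ent / math.log(n))

    if sum(info) == 0.0:
        raise ValueError("all indicators are constant; weights undefined")

    # Eq. (10): weights proportional to 1 - e_j.
    weights = [d / sum(info) for d in info]
    # Eq. (11): weighted sum of standardized indicators per object.
    scores = [sum(weights[j] * y[i][j] for j in range(m)) for i in range(n)]
    return weights, scores

# Hypothetical tertiary-indicator values for four provinces.
x = [[0.98, 0.90, 0.95],
     [0.95, 0.80, 0.97],
     [0.99, 0.85, 0.90],
     [0.92, 0.95, 0.93]]
weights, scores = entropy_weights(x)
```

Step 5 (Eq. 12) repeats the same procedure one level up, so under these assumptions the function could be called again with the secondary scores as its input table.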

2.4. PCA-Based Weight Determination Framework

Our data fusion methodology employs principal component analysis (PCA) to determine optimal reliability weights for provincial forecasts. This approach effectively preserves information while accommodating the multi-source heterogeneity of load forecasting data.

2.4.1. Evaluation Matrix Construction

For n provinces, T time points (e.g., 96 points at 15 min resolution or 24 points at hourly resolution), and 3 primary indicator scores for each province at each time point, we construct an evaluation matrix:

$$X_{t} = \begin{bmatrix} x_{11,t} & x_{12,t} & x_{13,t} \\ x_{21,t} & x_{22,t} & x_{23,t} \\ \vdots & \vdots & \vdots \\ x_{n1,t} & x_{n2,t} & x_{n3,t} \end{bmatrix}, \quad t \in \{1, 2, \dots, T\} \tag{13}$$

where $x_{ij,t}$ is the score of the $j$-th primary indicator for the $i$-th province at time point $t$.

2.4.2. PCA-Based Weight Calculation Process

The PCA-based weight calculation follows five key steps [37]:

1. Standardize the matrix $X_{t}$ for each time point:

$$Z_{t} = (X_{t} - \mu_{t}) / \sigma_{t} \tag{14}$$

2. Calculate the correlation coefficient matrix:

$$R_{t} = \frac{1}{n} Z_{t}^{\top} Z_{t} \tag{15}$$

3. Solve the eigenvalue equation:

$$|R_{t} - \lambda_{t} I| = 0 \tag{16}$$

4. Calculate provincial weights based on eigenvalues and eigenvectors:

$$w_{i,t} = \frac{\sum_{j=1}^{3} \lambda_{j,t} e_{ij,t}}{\sum_{j=1}^{3} \lambda_{j,t}} \tag{17}$$

5. Apply temporal smoothing to ensure stability:

$$\hat{w}_{i,t} = \theta w_{i,t} + (1 - \theta) \frac{1}{k} \sum_{j=t-k}^{t-1} w_{i,j} \tag{18}$$

where $\theta$ is a smoothing coefficient (typically 0.6–0.8) and $k$ is the smoothing window length. The constraint $\hat{w}_{i,t} \leq w_{i}^{\mathrm{set}} = \alpha \cdot s_{i}^{\mathrm{fin}}$ is applied to ensure no single province dominates the forecast, where $s_{i}^{\mathrm{fin}}$ represents the final score for the $i$-th province, and $\alpha$ is a coefficient for the final score (set at 1.2).
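Steps 1 to 4 can be sketched with NumPy for a single time point. This is one possible reading, not the authors' implementation, and it rests on three assumptions that the text does not state: $e_{ij,t}$ is read as province $i$'s score on the $j$-th principal component, each eigenvector's sign is fixed so that its loadings sum to a non-negative value, and a softmax maps the zero-mean composite scores to positive weights that sum to one. The function name and sample matrix are hypothetical.

```python
import numpy as np

def pca_composite_weights(X):
    """Sketch of steps 1-4 for one time point.

    X has shape (n_provinces, 3): primary-indicator scores as in Eq. (13).
    Returns one positive weight per province; the weights sum to 1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Eq. (14): column-wise standardization (population std, matching 1/n below).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Eq. (15): correlation matrix of the three indicators.
    R = Z.T @ Z / n

    # Eq. (16): eigen-decomposition; eigh is intended for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(R)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off

    # Assumption: orient each eigenvector so its loadings sum to >= 0.
    signs = np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs

    # Assumption: e_ij in Eq. (17) is province i's score on component j.
    comp_scores = Z @ eigvecs
    composite = comp_scores @ eigvals / eigvals.sum()

    # Assumption: softmax turns zero-mean composites into positive weights.
    expc = np.exp(composite - composite.max())
    return expc / expc.sum()

# Hypothetical primary-indicator scores for four provinces at one time point.
X_t = np.array([[0.82, 0.30, 0.55],
                [0.74, 0.15, 0.60],
                [0.88, 0.25, 0.40],
                [0.91, 0.35, 0.70]])
weights = pca_composite_weights(X_t)
```

Repeating the call for each $t$ would give the raw weights $w_{i,t}$ that step 5 then smooths over time.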

2.4.3. Time-Varying Weight Matrix

This approach yields a time-varying weight matrix that reflects the varying prediction capabilities of each province at different periods while ensuring prediction continuity and stability through temporal smoothing:

$$W = \begin{bmatrix} \hat{w}_{1,1} & \hat{w}_{1,2} & \dots & \hat{w}_{1,T} \\ \hat{w}_{2,1} & \hat{w}_{2,2} & \dots & \hat{w}_{2,T} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{w}_{n,1} & \hat{w}_{n,2} & \dots & \hat{w}_{n,T} \end{bmatrix} \tag{19}$$
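The smoothing of Eq. (18) and the cap $w_{i}^{\mathrm{set}} = \alpha \cdot s_{i}^{\mathrm{fin}}$ can be sketched row by row to build $W$. This is a minimal illustration, assuming that the first points use whatever history is available (the text does not say how the window is handled before $k$ points exist). The function name, raw weights, and final scores are hypothetical.

```python
def smooth_weights(w, theta=0.7, k=4, cap=None):
    """Apply Eq. (18) along time to one province's raw weights w[t].

    `cap` is the upper bound w_set = alpha * s_fin for this province.
    """
    smoothed = []
    for t, w_t in enumerate(w):
        history = w[max(0, t - k):t]  # raw weights from up to k earlier points
        if history:
            w_hat = theta * w_t + (1.0 - theta) * sum(history) / len(history)
        else:
            w_hat = w_t  # assumption: no history at the first point
        if cap is not None:
            w_hat = min(w_hat, cap)
        smoothed.append(w_hat)
    return smoothed

# Hypothetical raw weights (rows: provinces, columns: time points).
raw = [[0.30, 0.30, 0.50, 0.30],
       [0.20, 0.25, 0.20, 0.25]]
final_scores = [0.35, 0.30]  # hypothetical s_fin values
alpha = 1.2                  # coefficient from the text

# Time-varying weight matrix W of Eq. (19), one smoothed row per province.
W = [smooth_weights(row, theta=0.7, k=2, cap=alpha * s)
     for row, s in zip(raw, final_scores)]
```

In the first row, the spike to 0.50 at the third time point is smoothed to 0.44 and then clipped to the cap of 0.42. The text does not say whether weights are renormalized after capping, so the sketch leaves them as they are.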

2.4.4. Balanced Fusion Concept

These PCA-derived weights form the foundation of our balanced fusion approach that integrates both provincial forecasts and main grid forecasts. This dual-source integration methodology, detailed in Section 2.5, mitigates potential error accumulation from the evaluation system and leverages complementary forecasting strengths.

By dynamically balancing provincial and main grid forecasts based on reliability scores, our approach achieves superior accuracy compared to direct aggregation methods. The complete mathematical formulation and implementation details of this balanced fusion methodology are presented in Section 2.5.4.

2.5. Implementation Framework for Multi-Regional Load Forecasting Integration

Regional grid dispatch centers face significant challenges when attempting to integrate forecasting data from multiple provinces. Direct superimposition of provincial forecasts introduces substantial errors, as it fails to account for inter-regional interactions and data quality variations. Our comprehensive implementation framework for multi-regional forecast integration consists of four key components, presented below.

2.5.1. Multi-Regional Integration Framework Overview

Our integration framework consists of four key components: (1) data collection and validation, (2) multi-source data preprocessing, (3) provincial forecast generation, and (4) weighted forecast fusion. Figure 2 illustrates this integrated approach, which preserves existing provincial forecasting systems while introducing a systematic evaluation and fusion methodology at the grid level.

The framework addresses data heterogeneity challenges through standardized preprocessing protocols and accommodates temporal inconsistencies through dynamic time-delay compensation. Most importantly, it implements our evaluation index system to quantify provincial forecast reliability, which then informs the PCA-based weighting algorithm for optimal integration.

2.5.2. Multi-Source Data Preprocessing

Effective data preprocessing is critical for multi-regional integration due to the inherent heterogeneity across provincial data sources. Our preprocessing pipeline addresses four key challenges: data quality assessment, anomaly detection, temporal alignment, and standardization.

Beyond standard electricity consumption data, our framework incorporates the following:

Sectoral electricity usage data (industrial, commercial, residential).
Real-time electricity price information.
Operational data from distributed generation sources.
High-resolution meteorological data (temperature, humidity, precipitation).
Socioeconomic indicators (holidays, population density, economic indices).

We implement a dual-stage anomaly detection approach with an isolation forest algorithm for outlier detection and Least Squares Support Vector Regression (LS-SVR) for missing value imputation and smoothing. Feature selection is performed using a two-stage approach with Pearson correlation coefficient for initial screening and Elastic Net regression for refined selection. Z-score standardization is used for feature scaling.
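The first feature-screening stage and the scaling step can be sketched with the standard library alone. This is a minimal illustration, not the authors' pipeline: the 0.3 correlation threshold, the function names, and the sample series are assumptions, and the isolation forest, LS-SVR, and Elastic Net stages are not shown.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def screen_features(features, target, threshold=0.3):
    """Keep features whose |r| with the load reaches the (hypothetical) threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) >= threshold]

def zscore(col):
    """Z-score standardization using the population standard deviation."""
    n = len(col)
    mean = sum(col) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
    return [(x - mean) / std for x in col]

# Hypothetical load and candidate feature series.
load = [100.0, 120.0, 150.0, 170.0, 200.0]
features = {
    "temperature": [20.0, 24.0, 29.0, 33.0, 38.0],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
}

kept = screen_features(features, load)
scaled = {name: zscore(features[name]) for name in kept}
```

Features that pass this linear screen would then go to the Elastic Net stage for the refined selection the text describes.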

2.5.3. Provincial Forecast Generation

Our framework preserves the autonomy of provincial forecasting systems while implementing standardized interfaces for forecast collection and evaluation. Rather than imposing a single forecasting methodology, our framework accommodates diverse provincial approaches based on regional characteristics:

Regions with stable industrial profiles: Statistical methods like ARIMA or exponential smoothing.
Regions with complex seasonal patterns: Recurrent neural networks (LSTM, GRU).
Regions with multiple influential external factors: Attention-based architectures.

2.5.4. Multi-Regional Forecast Fusion

The core innovation of our framework lies in the weighted fusion methodology for integrating provincial forecasts. Provincial forecasts are collected through standardized interfaces, with automated validation to ensure consistency. We calculate the three primary evaluation indices for each province and timestamp, and determine optimal provincial weights for each forecasting period using the PCA-based approach.

For each time point $t$, we calculate a dynamic adjustment coefficient $\alpha_{t}$:

$$\alpha_{t} = \beta \sum_{i=1}^{n} (\mathrm{accuracy}_{i,t} \cdot w_{i,t}) \tag{20}$$

The final integrated forecast for time $t$ is calculated as

$$\mathrm{Load}_{\mathrm{final}} = n_{\mathrm{province}} \cdot \sum_{i=1}^{n} \frac{\alpha_{t} \cdot w_{i,t} \cdot \mathrm{Load}_{\mathrm{province},i,t} + (1 - \alpha_{t}) \cdot (1 - w_{i,t}) \cdot \mathrm{Load}_{\mathrm{main},i,t}}{\alpha_{t} \cdot w_{i,t} + (1 - \alpha_{t}) \cdot (1 - w_{i,t})} \tag{21}$$
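Eqs. (20) and (21) can be transcribed for a single time point as below. This is a sketch under stated assumptions: the adjustment coefficient in Eq. (21) is taken to be the time-dependent coefficient of Eq. (20), the accuracy terms are treated as scores in [0, 1], and the value of β is hypothetical. How the leading factor relates to the scaling of the normalized loads is not stated in the text, so `n_province` is simply passed through. All names and numbers are illustrative.

```python
def adjustment_coefficient(accuracy, weights, beta):
    """Eq. (20): beta times the weight-averaged provincial accuracy."""
    return beta * sum(a * w for a, w in zip(accuracy, weights))

def blended_terms(load_prov, load_main, weights, alpha_t):
    """Per-province fraction inside the sum of Eq. (21).

    Each term is a convex combination of the provincial forecast and the
    main-grid forecast for that province (for 0 < alpha_t, w < 1).
    """
    terms = []
    for lp, lm, w in zip(load_prov, load_main, weights):
        a = alpha_t * w
        b = (1.0 - alpha_t) * (1.0 - w)
        terms.append((a * lp + b * lm) / (a + b))
    return terms

def fused_load(load_prov, load_main, weights, alpha_t, n_province):
    """Literal transcription of Eq. (21) for one time point."""
    return n_province * sum(blended_terms(load_prov, load_main, weights, alpha_t))

# Hypothetical inputs for four provinces at one time point (normalized loads).
accuracy = [0.98, 0.96, 0.97, 0.95]
weights = [0.25, 0.25, 0.30, 0.20]
load_prov = [0.30, 0.25, 0.28, 0.17]
load_main = [0.31, 0.24, 0.27, 0.18]

alpha_t = adjustment_coefficient(accuracy, weights, beta=0.6)
terms = blended_terms(load_prov, load_main, weights, alpha_t)
final = fused_load(load_prov, load_main, weights, alpha_t, n_province=4)
```

A larger weight, or a larger adjustment coefficient, moves each term toward the provincial forecast; smaller values move it toward the main-grid forecast.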

Our framework is implemented as an end-to-end pipeline that operates in both day-ahead and real-time forecasting modes, ensuring that the grid dispatch center benefits from both the specialized knowledge embedded in provincial forecasting systems and the systematic integration methodology.

3. Results

3.1. Experimental Setup

3.1.1. Hardware and Software Configuration

The computational experiments were conducted on a system with Intel (Shanghai, China) Core i7-12600K CPU, ASUS (Guangzhou, China) Dual GeForce RTX 4060 GPU, and 32 GB RAM. The software environment consisted of Python 3.10, PyTorch 2.0.0, CUDA 11.8, and cuDNN 8.10. This configuration provided sufficient computational capacity for the training and evaluation of our proposed peak-weighted loss models across multiple regions.

3.1.2. Dataset Description

Our experimental evaluation utilized load data from the Eastern China power grid, encompassing five administrative regions: Shanghai, Anhui, Zhejiang, Jiangsu, and Fujian. The original load data underwent normalization mapping to preserve the characteristic patterns while ensuring data privacy. The dataset exhibited diverse regional characteristics, with Shanghai showing pronounced commercial and residential load patterns, while the other provinces demonstrated varying degrees of industrial and agricultural influence on their load profiles.

3.1.3. Methodological Approach

We implemented a rolling forecast methodology wherein models were retrained monthly and used to predict the subsequent month’s load. This approach resulted in 11 distinct training–prediction cycles, with the final cycle using January through November data to forecast December. For each region, we conducted two parallel forecasting experiments:

Provincial-level forecasting: Using historical load data without additional features but employing a peak-weighted loss function to improve accuracy during high-demand periods.
Grid-level integrated forecasting: Incorporating enhanced feature sets (including weather data, calendar effects, and regional economic indicators) alongside peak-weighted loss functions.

This dual approach allowed us to evaluate the efficacy of both feature engineering and specialized loss functions in improving forecast accuracy.
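The rolling schedule can be sketched as an expanding window. The text confirms only the final cycle (January through November used to forecast December); that earlier cycles follow the same expanding pattern, starting from January alone, is an assumption of this example, as is the function name.

```python
def rolling_monthly_cycles(months):
    """Expanding-window cycles: train on months[:k], predict months[k]."""
    return [(months[:k], months[k]) for k in range(1, len(months))]

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
cycles = rolling_monthly_cycles(months)

for train, target in cycles:
    print(f"train on {train[0]}..{train[-1]} -> predict {target}")
```

Twelve months yield the 11 training and prediction cycles reported in the text.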

3.1.4. Evaluation Metrics

We employed five standard metrics to evaluate forecasting performance:

Mean Absolute Error (MAE): $\frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}|$.
Mean Absolute Percentage Error (MAPE): $\frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right|$.
Root Mean Square Error (RMSE): $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}$.
Mean Square Percentage Error (MSPE): $\frac{100\%}{n} \sum_{i=1}^{n} \left( \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right)^{2}$.
Coefficient of Determination ($R^{2}$): $1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}$.
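The five metrics can be computed directly from their definitions. This is a minimal sketch; the function name and sample values are illustrative only.

```python
import math

def forecast_metrics(y_true, y_pred):
    """Return MAE, MAPE (%), RMSE, MSPE (%) and R^2 as defined above."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    rel = [e / y for e, y in zip(errors, y_true)]  # requires non-zero actuals
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(r) for r in rel) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MSPE": 100.0 * sum(r * r for r in rel) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical actual and predicted loads.
metrics = forecast_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

MAPE and MSPE divide by the actual load, so both are undefined wherever an actual value is zero.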

3.2. Regional Forecasting Performance Analysis

3.2.1. Daily Load Pattern Analysis for Shanghai

Figure 3 presents a comparative visualization of predicted versus actual load for Shanghai on 1 July 2024. Several noteworthy characteristics and issues can be observed:

Overnight Load Overestimation: Between 00:00 and 03:00, the model systematically overestimated the actual load, suggesting potential changes in overnight energy consumption patterns not captured by historical training data.
Morning Valley Discrepancy: A significant forecasting error appears during the early morning hours (04:00–06:00), where the actual load reached approximately 150,000 units while the prediction remained at around 190,000 units. This 26.7% difference represents the largest proportional error in the 24 h cycle and indicates a potential structural change in morning industrial or commercial activity.
Midday and Afternoon Peak Accuracy: Between 09:00 and 15:00, the model tracks the shape of the actual peak load pattern well but consistently underestimates its magnitude by approximately 25,000–30,000 units (6.5–7.0%). This systematic bias suggests the need for the recalibration of the peak-weighted loss function.
Evening Transition Period: From 18:00 to 22:00, while the model correctly captures the declining trend, it again underestimates the actual load by a relatively consistent margin of approximately 20,000 units (5.0–5.5%).

3.2.2. Monthly Rolling Forecast Performance

Figure 4 extends our analysis to the entire month of July 2024, revealing several important temporal patterns:

Daily Cyclical Pattern Capture: The forecasting model successfully captures the fundamental daily load cycles with clear distinction between weekday and weekend patterns.
Peak Load Magnitude Error: Consistent with the single-day analysis, the model systematically underestimates peak loads throughout the month, with discrepancies reaching up to 10% during extreme demand periods (particularly evident on 8–9 July and 17–19 July).
Valley Forecasting Variability: The model demonstrates inconsistent performance in predicting load valleys, sometimes overestimating (1–3 July) and other times underestimating (13–15 July) the minimum load values. This variability suggests potential instability in the model’s response to low-load conditions.
Mid-Month Adaptation Failure: An interesting phenomenon occurs around 12–15 July, where the forecast temporarily improves before deteriorating again. This suggests that while the model may be capturing some temporal shifts in load patterns, it fails to maintain this adaptation consistently throughout the forecast horizon.
Weekday–Weekend Differentiation: The model demonstrates better performance on weekdays than weekends, with weekend load overestimation being particularly problematic. This indicates insufficient differentiation of calendar features in the provincial-level model.

These observations highlight several challenges in provincial-level load forecasting that merit further investigation. The systematic underestimation of peak loads suggests that despite implementing a peak-weighted loss function, the model may require additional calibration or feature engineering to adequately capture extreme demand events. The inconsistent performance across different times of day and days of the week indicates potential opportunities for temporal ensemble methods or more sophisticated calendar-based feature extraction.

The analysis of Shanghai’s forecasting performance exemplifies the challenges faced by regional dispatch centers in maintaining forecast accuracy throughout varying temporal conditions and load regimes. These findings provide a foundation for evaluating the potential advantages of grid-level integrated forecasting approaches discussed in subsequent sections.

3.3. Evaluation of Weighted Fusion Methodology

To validate the efficacy of our proposed weighted fusion methodology, we conducted comprehensive evaluations under diverse operational scenarios. This section presents experimental results comparing the performance of our PCA-based weighted fusion approach against the direct aggregation of provincial forecasts across three representative seasonal periods: June (moderate temperature conditions), August (high temperature conditions), and December (winter conditions).

3.3.1. Experimental Methodology

For each evaluation period, we generated forecasts using the rolling monthly training approach described in Section 3.1.3. The comparison focused on three forecast methodologies:

Actual Load: The recorded ground truth values for grid-level load.
Direct Aggregation: Simple summation of provincial-level forecasts without weighting adjustments.
Weighted Fusion: Our proposed PCA-based integration methodology that dynamically assigns reliability-based weights to provincial forecasts.

The evaluation metrics included time-series visualization, hourly MAPE distribution, load scatter plots with differentiated peak/non-peak periods, and provincial evaluation radar charts illustrating the three primary indicators: forecast reliability, provincial impact, and forecasting complexity.

Figure 5 provides a comprehensive performance analysis of the forecasting methodologies across different training models. Note that the Model ID denotes the month used for training: Model ID 01 indicates that January data were used to train the model, with results showing its performance in predicting February’s load, and so forth in sequence.

The top panel illustrates the overall MAPE comparison between weighted fusion and direct aggregation approaches. A striking observation is the significantly higher error rates for both methodologies when using January data to predict February load (Model ID 01), with direct aggregation reaching approximately 80% MAPE and weighted fusion showing about 60%. This pronounced error likely stems from the distinct load patterns between winter (January) and transition months (February), where temperature variations and seasonal consumption changes are most dramatic. As we progress through models trained on subsequent months, the error rates stabilize considerably, with weighted fusion consistently outperforming direct aggregation across all models.

The middle panel differentiates between peak hour and non-peak hour performance, specifically for the weighted fusion methodology. Notably, non-peak hours show substantially higher error rates than peak hours for Model ID 01, approaching 80% versus 53%, respectively. This discrepancy suggests that our methodology better captures consumption patterns during high-demand periods than during transitional or low-demand periods when using January data. The performance gap between peak and non-peak hours narrows significantly for subsequent months, indicating more stable and consistent prediction capabilities as seasonal patterns become more established.

The bottom panel quantifies the percentage improvement that weighted fusion achieves over direct aggregation, broken down by overall performance and specific time periods. The improvement ranges from approximately 20% to 65%, with Models 07, 10, and 11 demonstrating the most substantial enhancements exceeding 60%. This indicates that our weighted fusion approach is particularly effective when training with data from July, October, and November to predict the subsequent months. The improvement is generally more pronounced for peak hours, underscoring our methodology’s strength in capturing critical high-demand periods that are typically of greatest concern to grid operators.

These results confirm that the weighted fusion methodology substantially outperforms direct aggregation, with the magnitude of improvement varying based on the seasonal characteristics of the training data. The method demonstrates remarkable adaptability across different months, with particular strength in properly weighting provincial contributions during challenging transitional seasons and peak demand periods.

3.3.2. Moderate Condition Analysis (June 2024)

The forecasting performance for June 2024, a period characterized by moderate weather conditions, is illustrated in Figure 6. The weighted fusion methodology consistently outperforms direct aggregation, particularly during the second half of the month (13–29 June), where peak load forecasting becomes increasingly challenging. The weighted fusion approach successfully captures both the magnitude and temporal patterns of load fluctuations, demonstrating significantly lower prediction errors during high-load periods.

Figure 7 presents an hourly analysis of load forecasting performance during June. The top panel reveals that direct aggregation consistently underestimates load during peak hours (8:00–16:00), with an average underestimation of approximately 7.3%. In contrast, the weighted fusion approach reduces this underestimation to approximately 3.2%. The bottom panel quantifies this improvement through hourly MAPE values, with weighted fusion reducing peak-hour errors by 4.8–6.1 percentage points compared with direct aggregation.
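An hourly MAPE profile of this kind, together with a peak versus non-peak split, can be sketched as below. The helper names `hourly_mape` and `split_mape` are hypothetical, the sample values are invented for illustration, and treating 08:00 through 16:00 inclusive as the peak window is an assumption about how the boundary hours are handled.

```python
from collections import defaultdict

# Assumed peak window: hours 08:00 through 16:00 inclusive.
PEAK_HOURS = frozenset(range(8, 17))


def hourly_mape(hours, actual, forecast):
    """Return {hour_of_day: MAPE in percent} over all samples for that hour."""
    abs_pct = defaultdict(list)
    for hour, a, f in zip(hours, actual, forecast):
        abs_pct[hour].append(abs((a - f) / a))
    return {h: 100.0 * sum(v) / len(v) for h, v in sorted(abs_pct.items())}


def split_mape(hours, actual, forecast, peak_hours=PEAK_HOURS):
    """Return MAPE in percent for peak and non-peak samples separately."""
    groups = {"peak": [], "non_peak": []}
    for hour, a, f in zip(hours, actual, forecast):
        key = "peak" if hour in peak_hours else "non_peak"
        groups[key].append(abs((a - f) / a))
    return {k: 100.0 * sum(v) / len(v) for k, v in groups.items() if v}


# Illustrative values only (hour of day, actual load, forecast load).
hours = [3, 3, 10, 10]
actual = [100.0, 100.0, 200.0, 200.0]
forecast = [110.0, 90.0, 190.0, 180.0]
print(hourly_mape(hours, actual, forecast))
print(split_mape(hours, actual, forecast))
```

Running the same functions on the direct aggregation and weighted fusion forecasts and subtracting the results gives the percentage-point differences discussed in the text.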

The scatter plot analysis in Figure 8 further confirms these observations, with weighted fusion achieving superior coefficient of determination ($R^{2}$) values for both peak hours (0.9204 vs. 0.7183) and non-peak hours (0.9453 vs. 0.8349). This indicates that weighted fusion provides more consistent and reliable forecasts across all operational conditions.

3.3.3. High-Temperature Condition Analysis (August 2024)

August 2024 presented significant forecasting challenges due to sustained high temperatures across the Eastern China grid region. Figure 9 shows that these extreme weather conditions substantially amplified forecasting errors, particularly for direct aggregation, which consistently underestimated peak loads by 15–20%. This underestimation is especially pronounced during the first half of August (1–15 August), where daily peak loads frequently exceeded 14 million MW.

The weighted fusion methodology maintained significantly better performance under these extreme conditions, with peak-hour MAPE values averaging 6.1% compared with 16.3% for direct aggregation (Figure 10). This represents a 10.2 percentage point improvement during critical high-demand periods. The scatter plot analysis in Figure 11 reveals that direct aggregation actually produced a negative $R^{2}$ value (−0.0197) for peak hours, indicating complete forecasting failure, while weighted fusion maintained a strong $R^{2}$ of 0.8636.
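A negative $R^{2}$ arises whenever the residual sum of squares exceeds the total sum of squares, that is, when the forecast performs worse than simply predicting the mean of the actual values. The sketch below reproduces this effect with invented numbers (they are not the paper's data): a forecast with a constant 15% underestimate on a narrow band of peak loads.

```python
def r_squared(actual, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative peak-hour loads that vary only slightly around their mean.
peak_actual = [100.0, 102.0, 104.0, 106.0]

# A forecast that follows the shape but underestimates every value by 15%.
biased = [0.85 * a for a in peak_actual]

# A forecast with small, unbiased errors.
unbiased = [101.0, 101.0, 105.0, 105.0]

print(r_squared(peak_actual, biased))    # negative: bias dominates the spread
print(r_squared(peak_actual, unbiased))  # positive
```

Because peak-hour loads span a comparatively narrow range, even a moderate systematic bias can push $R^{2}$ below zero.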

These results validate that our weighted fusion approach is particularly valuable during extreme weather conditions when forecasting accuracy is most critical for grid stability and economic operation.

3.3.4. Winter Condition Analysis (December 2024)

The December 2024 evaluation presents a distinct operational scenario with winter heating loads significantly impacting consumption patterns. Figure 12 shows that direct aggregation consistently overestimated load during this period, particularly during early morning hours (3:00–7:00). This overestimation is quantified in Figure 13, which shows MAPE values for direct aggregation exceeding 7% during these hours.

The weighted fusion approach substantially mitigated this overestimation problem, maintaining MAPE values below 2.7% even during challenging early morning periods. As depicted in Figure 14, weighted fusion achieved exceptional performance during December, with $R^{2}$ values of 0.9819 for peak hours and 0.9726 for non-peak hours. This represents the best performance among all three evaluation periods, confirming the adaptability of our methodology across diverse seasonal conditions.

3.3.5. Provincial Evaluation Analysis

Figure 15 presents radar charts visualizing the three primary evaluation indicators for each province across the three assessment periods. Several important patterns emerge from this analysis:

Forecast Reliability: Jiangsu consistently demonstrated the highest forecast reliability across all three periods, followed by Zhejiang. Shanghai showed the lowest reliability in June and August but improved significantly in December.
Provincial Impact: Jiangsu maintained the highest provincial impact on the overall grid due to its substantial industrial load. This impact remained relatively stable across seasons, whereas other provinces showed more significant seasonal variations.
Forecasting Complexity: Anhui consistently presented the lowest forecasting complexity, while Shanghai demonstrated the highest complexity during summer months, reflecting its sensitivity to cooling demand in its predominantly urban environment.

These evaluation patterns directly informed the time-varying weights assigned by our PCA-based methodology, allowing the system to adaptively emphasize more reliable provincial forecasts while accounting for their relative impact on the overall grid.
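One common way to turn a small set of evaluation indicators into normalized weights with PCA is sketched below. It is not necessarily the authors' exact procedure: the indicator values are invented, the function names are hypothetical, forecasting complexity is assumed to be a cost-type indicator (higher complexity lowers the score), and indicator weights are taken from absolute PCA loadings scaled by explained variance.

```python
import numpy as np

# Hypothetical indicator values. Rows: provinces. Columns: forecast
# reliability, provincial impact, forecasting complexity.
provinces = ["Shanghai", "Jiangsu", "Zhejiang", "Anhui", "Fujian"]
indicators = np.array([
    [0.55, 0.40, 0.80],
    [0.85, 0.90, 0.50],
    [0.75, 0.70, 0.55],
    [0.65, 0.45, 0.30],
    [0.60, 0.35, 0.45],
])


def pca_indicator_weights(x):
    """Weight each indicator by its absolute loadings times explained variance."""
    corr = np.corrcoef(x, rowvar=False)       # PCA on the correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)   # columns of eigvecs are components
    explained = eigvals / eigvals.sum()
    raw = np.abs(eigvecs) @ explained
    return raw / raw.sum()


def provincial_weights(x, benefit=(True, True, False)):
    """Return provincial weights that sum to one."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    scaled = (x - lo) / (hi - lo)             # min-max scale each indicator
    for j, is_benefit in enumerate(benefit):
        if not is_benefit:                    # invert cost-type indicators
            scaled[:, j] = 1.0 - scaled[:, j]
    scores = scaled @ pca_indicator_weights(x)
    return scores / scores.sum()


weights = provincial_weights(indicators)
for name, w in zip(provinces, weights):
    print(f"{name}: {w:.3f}")
```

Recomputing the indicators for each hour or operational regime and rerunning `provincial_weights` would give a time-varying weight profile of the kind described in the text.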

3.3.6. Computational Efficiency Analysis

The operational feasibility of our weighted fusion methodology depends not only on its accuracy improvements but also on its computational efficiency. We evaluated the computational overhead of applying the PCA-based weighting algorithm in an operational setting with intra-day rolling forecasts. After model training and provincial forecast generation, the complete weighting and fusion process consistently completed within 20 s, with five experimental runs yielding processing times of 14.89 s, 15.16 s, 16.04 s, 15.59 s, and 15.19 s, averaging 15.37 s.
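The reported average can be checked directly from the five run times. The comparison against a 5 min decision cycle is an added illustration, using the shortest cycle length mentioned for dispatch centers.

```python
from statistics import mean

# Processing times of the five experimental runs, in seconds.
run_times_s = [14.89, 15.16, 16.04, 15.59, 15.19]

mean_time_s = mean(run_times_s)
shortest_cycle_s = 5 * 60  # shortest dispatch decision cycle (5 min), in seconds
share_of_cycle = mean_time_s / shortest_cycle_s

print(f"mean = {mean_time_s:.2f} s, max = {max(run_times_s):.2f} s")
print(f"share of a 5 min cycle = {share_of_cycle:.1%}")
```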

This processing time is well within the operational requirements for grid dispatch centers, which typically operate on 5–15 min decision cycles. The minimal computational overhead ensures that our methodology can be deployed in real-time operational environments without creating bottlenecks in the decision-making process.

3.3.7. Summary of Performance Improvements

Table 2 summarizes the average performance metrics across the three evaluation periods, clearly demonstrating the consistent superiority of weighted fusion over direct aggregation. The improvement is particularly pronounced during extreme weather conditions (August) and for peak hour predictions, precisely when accurate forecasting is most critical for grid operations.

These results convincingly establish that our PCA-based weighted fusion methodology significantly outperforms traditional direct aggregation approaches across diverse operational conditions. By dynamically incorporating provincial forecast reliability, impact, and complexity into the weighting scheme, our approach effectively leverages the complementary strengths of provincial forecasting systems while mitigating their individual weaknesses.

3.4. Multi-Temporal Scale Analysis

To further evaluate the adaptability and robustness of our weighted fusion methodology, we conducted an additional experiment analyzing prediction accuracy across multiple temporal scales. This experiment assessed the performance degradation of both weighted fusion and direct aggregation as the forecasting horizon extended from ultra-short-term (1 h) to medium-term (168 h/one week) predictions. Figure 16a–c shows the MAPE results across five temporal scales (1, 6, 24, 72, and 168 h) for our three evaluation periods.

Figure 16a displays the moderate season (June) forecast accuracy across temporal scales. The weighted fusion methodology maintains significantly superior performance at all prediction horizons, with MAPE values ranging from 1.64% (1 h ahead) to 2.51% (168 h ahead). In contrast, direct aggregation exhibits substantially higher error rates, ranging from 5.18% to 5.82%. Interestingly, both methodologies demonstrate their lowest MAPE values at the 72 h horizon, with weighted fusion achieving 1.64% and direct aggregation achieving 4.58%. This unexpected improvement at the 72 h mark likely reflects the weekly cyclical patterns in the load data, where the 72 h forecast benefits from capturing a similar point in the weekly cycle.

Figure 16b presents performance under high-temperature conditions (August), where the advantage of weighted fusion becomes even more pronounced. The direct aggregation method exhibits extremely high MAPE values between 8.0 and 9.1% across all time horizons, with particularly poor performance at the 24 h and 72 h marks. The weighted fusion approach maintains substantially better performance, with MAPE values declining gradually from 2.94% at the 1 h horizon to 2.32% at the 168 h horizon. This counterintuitive improvement at longer horizons suggests that the weighted fusion methodology successfully captures long-term seasonal patterns that compensate for the loss of short-term accuracy.

Figure 16c demonstrates forecast performance during winter conditions (December), where both methods achieve their best overall accuracy. The weighted fusion approach delivers exceptional performance, with MAPE values below 1.0% for all horizons beyond 24 h. Direct aggregation shows substantial improvement compared with other seasons but still maintains MAPE values 2–3 times higher than the weighted fusion approach. The consistent pattern across all three evaluation periods is that the performance gap between weighted fusion and direct aggregation tends to be largest at shorter time horizons (1–24 h) and slightly narrows at extended forecasting horizons.

This multi-temporal scale analysis yields three significant insights:

Seasonal Differentiation: The performance advantage of weighted fusion is most pronounced during extreme temperature conditions (August), moderately significant during mild conditions (June), and least pronounced (though still substantial) during winter conditions (December).
Temporal Resilience: The weighted fusion methodology demonstrates remarkable resilience to extended forecasting horizons. While conventional forecasting methods typically show degrading performance as the prediction horizon extends, our approach maintains relatively stable MAPE values even at the 168 h horizon. This suggests that by effectively balancing provincial forecasts based on their demonstrated reliability, our methodology successfully preserves long-term forecasting accuracy.
Operational Implications: The superior performance of weighted fusion at very short horizons (1–6 h) has particularly significant operational implications for grid dispatch centers. These ultra-short-term forecasts directly inform real-time operational decisions, making the substantial accuracy improvements (approximately 60–70% error reduction in comparison with direct aggregation) critically valuable for maintaining grid stability and economic operation.

The consistent outperformance of weighted fusion across all temporal scales and seasonal conditions provides compelling evidence for the robustness and adaptability of our methodology. The fact that performance advantages persist from ultra-short-term to medium-term forecasting horizons confirms that the approach is not merely addressing a specific temporal niche but rather fundamentally improving the integration of multi-regional forecast information across diverse operational scenarios.

3.5. Relationship Between Evaluation Indicators and Forecasting Performance

Our third case study examines the relationship between evaluation indicators and forecasting performance, focusing on how regional forecasting characteristics influence the time-varying weights in our fusion methodology. This analysis provides deeper insights into the adaptability of our weighting approach across different operational scenarios and regional characteristics.

3.5.1. Comparative Analysis of Regional Evaluation Indicators

Figure 17 outlines the evaluation indicators for peak and non-peak periods, revealing distinct characteristics that influence weighting decisions in our fusion methodology. During peak periods, forecast reliability scores remain consistent with non-peak periods, indicating that while challenging, peak period forecasting maintains comparable reliability within our evaluation framework. However, provincial impact scores are noticeably lower during peak periods, reflecting the more complex interaction patterns between regional grids during high-demand periods.

The most significant differentiation appears in forecasting complexity, where peak periods exhibit substantially higher complexity scores. This increased complexity during peak periods is expected due to the greater volatility and sensitivity to external factors (e.g., temperature extremes, industrial production schedules) during high-demand periods. Despite these higher complexity scores, the final composite scores show only a moderate reduction during peak periods, demonstrating the balanced nature of our evaluation framework.

Figure 18 extends this analysis to a comparison between two representative regions with differing forecasting characteristics. Region 1 exhibits higher forecasting complexity but lower provincial impact than Region 2. Despite comparable forecast reliability scores, Region 2 achieves a significantly higher final score (0.58 vs. 0.52), primarily due to its substantially greater provincial impact on the overall grid. This underscores how our evaluation framework appropriately balances intrinsic forecasting quality with the practical significance of each region’s contribution to the integrated grid forecast.

3.5.2. Time-Varying Weight Dynamics

The dynamic adaptability of our weighting approach is demonstrated in Figure 19 and Figure 20, which track the evolution of provincial weights over time. Figure 19 reveals subtle but important differences between peak and non-peak weights for Shanghai over a 48 h period. While both weight profiles follow similar trajectories, peak weights consistently remain lower during daytime hours (08:00–18:00) and higher during early morning hours (00:00–06:00) compared with non-peak weights.

This temporal pattern aligns with Shanghai’s urban load characteristics, where commercial and residential air conditioning creates more volatile and less predictable load patterns during daytime peak hours. During early morning hours, when industrial loads dominate, the forecasting reliability improves, resulting in higher weights. The convergence of peak and non-peak weights around 06:00 and 23:00 marks the transition periods between these distinct operational regimes.

Figure 20 compares the time-varying weights for Shanghai and Jiangsu over the same 48 h period, revealing significantly higher weights assigned to Jiangsu throughout the observation window. This substantial difference reflects Jiangsu’s stronger overall evaluation metrics, particularly its higher provincial impact score due to its large industrial base. The weight trajectories also demonstrate important temporal differences: while Shanghai’s weights remain relatively stable with minor fluctuations, Jiangsu exhibits a more pronounced upward trend, increasing from 0.257 to 0.282 over the 48 h period.

This divergence in weight evolution patterns reflects the different responses of each province to changing operational conditions. Jiangsu’s steadily increasing weight indicates improving forecast reliability as operational conditions stabilize, while Shanghai’s more stable weight profile suggests less sensitivity to these changing conditions. These distinct patterns validate the importance of our time-varying weighting approach, which captures not only the relative strengths of different provinces but also their dynamic responses to evolving grid conditions.

3.5.3. Seasonal Variations in Provincial Weight Dynamics

Figure 21, Figure 22 and Figure 23 present the time-varying provincial weights across June, August, and December, while Figure 4, Figure 5 and Figure 6 illustrate the corresponding load forecasting performance and MAPE distributions. These visualizations reveal several important seasonal patterns in weight dynamics and their impact on forecasting accuracy.

June (Moderate-Temperature Conditions)

In June, the weight distribution shows a clear hierarchical pattern, with Jiangsu receiving the highest weights (0.30–0.35), followed by Zhejiang (0.25–0.30), and substantially lower weights for Shanghai, Anhui, and Fujian (0.10–0.15). This weight distribution corresponds with moderate MAPE values (3–9%) during peak hours (8:00–16:00). The relatively stable weights assigned to each province throughout the day indicate consistent forecasting reliability under moderate-temperature conditions. Notably, the weighted fusion approach exhibits particular effectiveness during afternoon peak hours (12:00–16:00), where it reduces MAPE by approximately 6 percentage points compared with direct aggregation.

August (High-Temperature Conditions)

In August, we observe a similar hierarchical pattern in provincial weights but with increased temporal variations, particularly for Jiangsu and Zhejiang. Jiangsu’s weights show more pronounced fluctuations (0.28–0.37) throughout the day, while Shanghai’s weights remain consistently low. This period is characterized by significantly higher MAPE values for both methodologies (weighted fusion: 5–7%; direct aggregation: 15–18% during peak hours), reflecting the increased forecasting difficulty during extreme high-temperature conditions when air conditioning loads create volatile consumption patterns. The substantial performance gap between weighted fusion and direct aggregation (a MAPE reduction of approximately 10–12 percentage points) demonstrates the critical importance of dynamic weight adjustments during challenging seasonal conditions.

December (Winter Conditions)

The December weight distribution reveals a notable shift, with significantly higher weights assigned to Anhui (0.30–0.35) compared with the summer months, while Jiangsu’s weights decrease substantially (0.20–0.25). This seasonal redistribution of weights reflects changing regional load characteristics during winter, where provinces with higher heating demands (such as Anhui) gain greater influence in the integrated forecast. Interestingly, the MAPE distribution during December shows a distinctive pattern, with higher errors occurring during early morning hours (3:00–6:00) rather than during daytime peaks. The weighted fusion approach achieves particularly impressive performance during these early morning hours, reducing MAPE by over 5% compared with direct aggregation.

Cross-Seasonal Comparative Analysis

Comparing the three seasonal periods reveals important patterns in how provincial weights influence forecasting accuracy:

Seasonal Weight Redistribution: The substantial shift in weights from Jiangsu’s dominance in summer months to the increased Anhui contribution in winter demonstrates our methodology’s ability to recognize and adapt to seasonal changes in provincial load characteristics. This redistribution is essential for maintaining forecasting accuracy across different operational conditions.
Temporal Granularity Impact: The increased temporal variations in weights during August, particularly for Jiangsu and Zhejiang, coincide with higher overall MAPE values but also greater performance improvements from weighted fusion. This suggests that more dynamic weight adjustments are both necessary and effective during high-volatility periods.
Error Profile Transformation: The shift in error profiles from afternoon-peaked in summer months to morning-peaked in winter illustrates how seasonal load characteristics fundamentally alter forecasting challenges. The weighted fusion methodology successfully adapts to these changing error profiles by dynamically redistributing provincial weights.

These observations confirm that the superior performance of our weighted fusion approach stems from its ability to dynamically adjust to both seasonal variations in provincial load characteristics and temporal changes in forecasting reliability. By continuously rebalancing provincial contributions based on real-time evaluation metrics, the methodology effectively mitigates the forecasting challenges associated with different seasonal conditions, particularly during periods of high load volatility.

3.5.4. Implications for Multi-Regional Integration

The analysis of evaluation indicators and time-varying weights yields several important implications for multi-regional load forecasting integration:

Operational Regime Differentiation: The distinct differences in evaluation metrics between peak and non-peak periods justify our approach of calculating separate weights for different operational regimes rather than using a single static weight. This differentiation allows our fusion methodology to adapt to the changing characteristics and relative strengths of provincial forecasts throughout the day.
Provincial Specialization Recognition: The substantial differences in weights between provinces (e.g., Shanghai and Jiangsu) demonstrate how our methodology effectively recognizes and leverages provincial specialization. Rather than treating all provincial forecasts equally, our approach appropriately emphasizes forecasts from provinces with stronger evaluation metrics, particularly during periods when their forecasting strengths are most relevant.
Temporal Adaptation: The evolution of weights over time shows how our methodology responds to changing operational conditions in real-time. This adaptive capability is particularly valuable during transition periods (e.g., morning ramp-up, evening peak) when load characteristics change rapidly and the relative strengths of different provincial forecasting systems may shift.

These case studies confirm that the superior performance of our weighted fusion approach is not merely a statistical artifact but rather emerges from the methodology’s ability to recognize and adapt to the complex relationships between regional forecasting characteristics, operational conditions, and temporal dynamics. By dynamically adjusting provincial weights based on a comprehensive evaluation framework, our approach successfully balances the diverse strengths and weaknesses of regional forecasting systems to achieve optimal integrated grid-level forecasts.

4. Discussion

Our comprehensive experimental evaluation demonstrates the significant advantages of the weighted fusion approach over traditional direct aggregation methods for grid-level load forecasting integration. Several key findings warrant detailed discussion to highlight the implications and significance of our methodology.

4.1. Performance Improvements in Critical Operational Periods

The performance gap between our weighted fusion methodology (2.33% MAPE) and direct aggregation (3.09% MAPE) is particularly pronounced during peak load periods and extreme weather conditions. This improvement is operationally significant because accurate peak load forecasting directly impacts grid reliability and operational safety. The traditional direct aggregation approach consistently underestimates peak loads by up to 7.5% during critical high-demand periods, which could lead to insufficient generation capacity and potential grid instability. In contrast, our weighted fusion approach maintains significantly better tracking of actual load patterns, with peak-hour MAPE values reduced by 62–69% across our evaluation periods.
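Because the text reports both percentage-point gaps and relative error reductions, a short sketch of the two calculations helps keep them apart. The helper names are hypothetical; the inputs are the MAPE values quoted in this article (overall 3.09% vs. 2.33%, and August peak-hour 16.3% vs. 6.1%).

```python
def point_gap(baseline_mape, improved_mape):
    """Absolute difference between two MAPE values, in percentage points."""
    return baseline_mape - improved_mape


def relative_reduction(baseline_mape, improved_mape):
    """Relative error reduction with respect to the baseline, in percent."""
    return 100.0 * (baseline_mape - improved_mape) / baseline_mape


# Overall MAPE: direct aggregation vs. weighted fusion.
overall = relative_reduction(3.09, 2.33)

# August peak-hour MAPE: direct aggregation vs. weighted fusion.
august_gap = point_gap(16.3, 6.1)
august_rel = relative_reduction(16.3, 6.1)

print(f"overall relative reduction: {overall:.1f}%")
print(f"August peak hours: {august_gap:.1f} points, {august_rel:.1f}% relative")
```

For the August peak-hour values this gives a gap of 10.2 percentage points and a relative reduction of about 62.6%; applied to the rounded overall figures, the relative reduction is about 24.6%.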

This enhanced performance during critical operational periods is primarily attributable to the dynamic weight redistribution implemented by our PCA-based methodology. As demonstrated in the seasonal weight analysis, our approach effectively recognizes the changing forecasting strengths of different provinces under varying operational conditions, assigning higher weights to more reliable regional forecasts during each specific period. This adaptive weighting is particularly effective during extreme high-temperature conditions (August), where weighted fusion reduced MAPE by approximately 10–12 percentage points relative to direct aggregation.

4.2. Temporal Adaptability and Regional Specialization

A distinctive feature of our methodology is its ability to recognize and leverage regional specialization through time-varying weights. The heatmap visualizations of provincial weights reveal clear hierarchical patterns that align with the underlying strengths of each region’s forecasting systems. The substantial shift in weights from Jiangsu’s dominance in summer months (0.30–0.35) to increased Anhui contribution in winter (0.30–0.35) demonstrates the methodology’s responsiveness to seasonal changes in provincial load characteristics.

Moreover, the weight trajectories show important temporal variations throughout the day, with significant differences between peak and non-peak periods. Shanghai’s weights consistently remain lower during daytime hours (08:00–18:00) and higher during early morning hours (00:00–06:00), reflecting the changing reliability of its forecasts throughout the day. This temporal adaptability enables our methodology to effectively respond to the evolving operational conditions that characterize complex power systems, particularly those with high renewable energy penetration.

The multi-temporal scale analysis further confirms this adaptability, revealing that weighted fusion maintains superior performance across forecasting horizons ranging from ultra-short-term (1 h) to medium-term (168 h). The consistent outperformance across all temporal scales provides compelling evidence for the robustness of our methodology in diverse operational contexts.
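The intra-day weight pattern described in this subsection can be illustrated by normalizing hourly scores so that each hour's weights sum to one, and then averaging one region's weight over the daytime and early-morning windows mentioned above. This is a sketch under stated assumptions: the scores, the three-region setup, and the sum-to-one normalization are chosen for illustration and are not taken from the study's time-varying weight matrix.

```python
def normalize_hourly(scores):
    """Convert scores[hour][region] into weights that sum to 1 within each hour."""
    weights = []
    for row in scores:
        total = sum(row)
        weights.append([s / total for s in row])
    return weights


def mean_weight(weights, region, hours):
    """Average weight of one region over a collection of hour indices."""
    hours = list(hours)
    return sum(weights[h][region] for h in hours) / len(hours)


# Hypothetical hourly reliability scores for three regions (columns 0, 1, 2).
# Region 0 is scored lower during the daytime than during the rest of the day.
scores = []
for hour in range(24):
    region0 = 0.6 if 8 <= hour <= 18 else 1.0
    scores.append([region0, 1.0, 1.0])

weights = normalize_hourly(scores)

daytime = range(8, 19)       # hour indices 8..18, i.e. 08:00-18:00
early_morning = range(0, 7)  # hour indices 0..6, i.e. 00:00-06:00

day_w = mean_weight(weights, 0, daytime)
night_w = mean_weight(weights, 0, early_morning)
print(f"region 0 mean weight: daytime {day_w:.3f}, early morning {night_w:.3f}")
```

Because the normalization is applied hour by hour, a drop in one region's score automatically shifts weight to the other regions for that hour only.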

4.3. Balanced Integration of Complementary Forecasting Strengths

The evaluation radar charts reveal that regions exhibit complementary strengths across our three primary indicators. Jiangsu demonstrates exceptional performance in grid impact, Shanghai excels in managing forecasting complexity, and Anhui shows particular strengths in winter conditions. Our weighted fusion methodology effectively combines these complementary capabilities, creating an integrated forecast that leverages the specialized knowledge embedded in each provincial system.

This balanced integration approach is particularly valuable for regional grid operations that must coordinate across provinces with heterogeneous characteristics. Rather than imposing a single forecasting methodology across all regions—which would fail to account for local specialization—our approach preserves the autonomy of provincial forecasting systems while implementing a systematic framework for optimal integration. This preserves institutional knowledge while addressing the multi-source heterogeneity challenge that has limited previous integration efforts.

4.4. Comparison with Existing Fusion Approaches

A comparative analysis of our weighted fusion methodology with existing multi-source data fusion approaches reveals several distinctive contributions. While Yuan et al. [9] propose an edge intelligence-based framework for power distribution IoT that addresses data storage and computing performance through Box-Cox transform Z-score normalization and DS inference-based fusion, their approach primarily targets sensor-level data integration rather than forecast combination. Our methodology fundamentally differs by introducing a comprehensive evaluation framework specifically designed to assess forecast reliability in grid-level applications, where the integration unit is the forecasting result itself rather than raw measurement data.

Similarly, He et al. [36] focus on interoperability challenges of heterogeneous data from micro-synchrophasor measurement units, developing a unified information model and distributed processing technology for streaming data. In contrast, our approach advances beyond data standardization to address the quality assessment and optimal weighting of forecasts. The principal component analysis mechanism we employed for weight derivation represents a significant methodological advancement over both approaches, as it systematically captures the multidimensional reliability characteristics of regional forecasts and translates them into quantifiable contribution weights.

Furthermore, our incorporation of time-varying weights addresses a fundamental limitation in existing fusion approaches, which typically apply static weighting schemes regardless of temporal variations in forecast performance. This dynamic adaptability is particularly critical for grid-level load forecasting, where forecast reliability exhibits significant diurnal patterns. The empirical validation across six diverse regions demonstrating substantial performance improvements (24.67% for MAPE) provides compelling evidence for the effectiveness of our approach in addressing the specific challenges of regional grid-level load forecast integration.
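The step from multidimensional evaluation scores to contribution weights discussed in this subsection can be illustrated with a minimal sketch. It follows one common PCA scoring recipe and is not a reproduction of the procedure in Section 2.4: the indicator values are hypothetical, only the first principal component is used (found by power iteration), and the softmax that maps scores to positive weights is an assumption of this illustration.

```python
import math


def standardize(matrix):
    """Z-score each column (indicator) of a regions x indicators matrix."""
    n = len(matrix)
    out_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        out_cols.append([(x - mean) / std for x in col])
    return [list(row) for row in zip(*out_cols)]


def first_component(z, iters=500):
    """Leading eigenvector of the correlation matrix via power iteration."""
    n, m = len(z), len(z[0])
    corr = [[sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1)
             for j in range(m)] for i in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Eigenvector sign is arbitrary; orient it so loadings sum to a positive value.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def pca_weights(matrix):
    """Project regions on the first component and softmax the scores (assumed step)."""
    z = standardize(matrix)
    v = first_component(z)
    scores = [sum(zi * vi for zi, vi in zip(row, v)) for row in z]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical evaluation matrix: rows are regions A-D, columns are the three
# primary indicators, each already scaled so that larger means more trustworthy.
evaluation = [
    [0.9, 0.8, 0.7],  # region A
    [0.6, 0.5, 0.6],  # region B
    [0.3, 0.4, 0.2],  # region C
    [0.5, 0.6, 0.4],  # region D
]

weights = pca_weights(evaluation)
for name, w in zip("ABCD", weights):
    print(f"region {name}: weight {w:.3f}")
```

In this toy matrix region A scores highest on every indicator and region C lowest, so they receive the largest and smallest weights respectively. A fuller treatment would typically combine several components weighted by their explained variance.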

4.5. Limitations and Future Work

Despite the significant improvements demonstrated by our methodology, several limitations should be acknowledged. First, while our approach dynamically adjusts weights based on evaluation metrics, it does not explicitly model inter-regional dependencies that might affect load patterns across adjacent areas. Incorporating spatial correlation analysis could potentially enhance the weighting mechanism, particularly for regions with strong economic or climatic connections.

Second, the current implementation uses a predefined set of evaluation indicators; exploring adaptive indicator selection based on specific grid characteristics could yield further improvements. Different grid systems may benefit from customized evaluation frameworks that emphasize indicators most relevant to their particular operational challenges.

Third, our evaluation is based on data from the Eastern China power grid, and the applicability to other geographic areas with different characteristics requires further investigation. Testing the methodology across more diverse grid systems would provide valuable insights into its generalizability and potential areas for refinement.

These limitations present several promising avenues for future research:

1. Integration of extreme weather event prediction models with load forecasting to enhance grid resilience: Combining advanced meteorological forecasting with our weighted fusion methodology could significantly improve prediction accuracy during critical weather events when grid stability is most challenged.

2. Application of explainable AI techniques to increase transparency in the fusion process: Enhancing the interpretability of weight determination would build operator trust and provide deeper insights into the factors driving regional forecast reliability variations.

3. Development of adaptive weighting schemes that can continuously learn and evolve based on performance feedback: Implementing reinforcement learning approaches could enable the system to autonomously refine weights as regional forecasting capabilities change over time.

4. Extension of the methodology to incorporate emerging data sources such as electric vehicle charging patterns and behind-the-meter generation: As distributed energy resources continue to grow, integrating these new data streams will become increasingly important for comprehensive load forecasting.

5. Integration with emerging edge computing architectures to enable more distributed fusion approaches: Leveraging edge intelligence for local processing could reduce communication latency and enhance system resilience, particularly relevant for large-scale implementation across extensive grid networks.

These future directions demonstrate the potential long-term impact of our methodology beyond its immediate application, providing a roadmap for continued advancement in grid-level forecast integration in increasingly complex and distributed power systems.

5. Conclusions

This paper addresses a critical challenge in grid-level load forecasting by introducing a comprehensive weighted fusion methodology that significantly outperforms traditional direct aggregation approaches. Our research makes several substantive contributions to the field of multi-regional power load forecasting integration, demonstrating both theoretical innovation and practical utility for grid operations.

5.1. Methodological Contributions

Our primary methodological contribution lies in the development of a theoretically grounded evaluation index system that quantifies regional forecast reliability through a hierarchical structure, encompassing forecast reliability, provincial impact, and forecasting complexity. This multidimensional framework effectively captures the heterogeneity across different forecasting systems and provides a mathematical foundation for optimal weight determination that previous approaches lacked.

The PCA-based weighted fusion algorithm introduces a novel approach to integrating multi-source forecasting data, systematically determining optimal weights based on the evaluated trustworthiness of each provincial forecast. Our time-varying weighting mechanism further enhances this approach by adapting to temporal load patterns and changing regional forecasting capabilities throughout the day. This dynamic adaptability represents a significant advancement over existing approaches in the literature, which typically employ static weighting schemes.

5.2. Empirical Findings

Our comprehensive experimental evaluation across three representative seasonal periods demonstrates the substantial performance advantages of weighted fusion over direct aggregation. The methodology achieved a 24.67% overall improvement in MAPE (from 3.09% to 2.33%), with particularly significant enhancements during critical peak periods and extreme weather conditions. Under high-temperature conditions in August, weighted fusion reduced peak-hour MAPE from 16.3% to 6.1%.

The time-varying weight analysis revealed important seasonal patterns in provincial contributions, with weight distributions shifting from Jiangsu dominance in summer months to increased Anhui contribution in winter. These weight redistributions aligned with the changing load characteristics and forecasting challenges across seasons, confirming the adaptive capability of our methodology to recognize and leverage regional specialization.

The multi-temporal scale analysis demonstrated the robust performance of weighted fusion across forecasting horizons ranging from ultra-short-term (1 h) to medium-term (168 h). This consistent outperformance across all temporal scales validates the methodology’s adaptability to diverse operational requirements, from real-time dispatch decisions to day-ahead market operations.

5.3. Practical Implications

From a practical implementation perspective, our methodology offers a balanced integration approach that leverages the strengths of individual regional forecasts without requiring substantial modifications to existing forecasting systems. The computational efficiency of our approach—with processing times averaging 15.37 s—further enhances its viability for operational deployment in dispatch centers, where rapid decision-making is essential.

The performance improvements demonstrated in this study have significant implications for grid reliability, economic operation, and renewable energy integration. More accurate grid-level forecasting enables better generation scheduling, reduced reserve requirements, and more confident accommodation of variable renewable resources. These benefits directly translate into operational cost savings and enhanced grid stability.

5.4. Future Research Directions

While this research has focused on short-term load forecasting integration, several promising avenues for future work emerge from our findings: (1) integration with extreme-weather-event prediction models; (2) application of explainable AI techniques to increase transparency; (3) development of adaptive weighting schemes with continuous learning capabilities; (4) extension to incorporate emerging distributed energy resource data; and (5) integration with edge computing architectures for more resilient implementation.

In conclusion, our weighted fusion methodology addresses a critical gap in the load forecasting literature by focusing on the integration challenge rather than individual forecasting techniques. By systematically evaluating and dynamically combining regional forecasts based on their demonstrated reliability and significance, our approach provides a practical solution for enhancing grid-level load forecasting accuracy in modern power systems characterized by increasing complexity and renewable energy penetration.

Author Contributions

Conceptualization, H.Y. and K.Z.; methodology, X.T. and K.Z.; software, B.S.; validation, M.Z.; formal analysis, G.H.; investigation, H.Y.; resources, X.T.; data curation, B.S.; writing—original draft preparation, K.Z.; writing—review and editing, M.Z.; visualization, G.H.; supervision, H.Y.; project administration, K.Z.; funding acquisition, H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of East China Branch of State Grid under Grant No. 529924240008.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are subject to national security protection requirements.

Acknowledgments

We would like to extend our sincere gratitude to all project collaborators and contributors for their invaluable insights and support throughout the research.

Conflicts of Interest

Authors Hai Ye, Xiaobi Teng, and Bingbing Song were employed by the company East China Branch of State Grid Corporation of China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Liu, M.-Y.; Xiao, Y.; Zhang, K.; Zhang, H.; Li, L. Collaborative optimization operation of high proportion renewable energy system considering source-load uncertainty. Electr. Power Constr. 2018, 39, 55–62.
2. Li, L.; Fan, S.; Xiao, J.; Zhou, H.; Shen, Y.; He, G. Fair trading strategy in multi-energy systems considering design optimization and demand response based on consumer psychology. Energy 2024, 306, 132993.
3. Zheng, X.; Yang, M.; Yu, Y.; Wang, C. Short-term net load forecasting for regions with distributed photovoltaic systems based on feature reconstruction. Appl. Sci. 2023, 13, 9064.
4. Liao, Q.-S.; Hu, W.-H.; Cao, D. Distributed photovoltaic net load forecasting in new energy power systems. J. Shanghai Jiao Tong Univ. 2021, 55, 1520–1531.
5. Zhang, Y.; Fan, S.; Meng, Y.; He, G. Payment and incentive allocation method in demand response programs based on Causer Pays principle. In Proceedings of the 2023 IEEE/IAS Industrial and Commercial Power Systems Asia, Chongqing, China, 7–9 July 2023; IEEE: New York, NY, USA, 2023; pp. 2212–2220.
6. Shao, Y.; Fan, S.; Meng, Y.; Jia, K.; He, G. Personalized demand response based on sub-CDL considering energy. Appl. Energy 2024, 374, 123964.
7. Meng, Y.; Fan, S.; Shen, Y.; Xiao, J.; He, G.; Li, Z. Transmission and distribution network-constrained large-scale demand response based on locational customer directrix load for accommodating renewable energy. Appl. Energy 2023, 350, 121681.
8. Peng, Y.; Wang, Y.; Lu, X.; Li, H.; Shi, D.; Wang, Z.; Li, J. Short-term load forecasting at different aggregation levels with predictability analysis. In Proceedings of the 2019 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Chengdu, China, 21–24 May 2019; IEEE: New York, NY, USA, 2019; pp. 3385–3390.
9. Yuan, Q.; Pi, Y.; Kou, L.; Zhang, F.; Li, Y.; Zhang, Z. Multi-source data processing and fusion method for power distribution internet of things based on edge intelligence. Front. Energy Res. 2022, 10, 891867.
10. Chodakowska, E.; Nazarko, J.; Nazarko, Ł. ARIMA models in electrical load forecasting and their robustness to noise. Energies 2021, 14, 7952.
11. Kang, C.-Q.; Xia, Q.; Liu, M. Electric Power System Load Forecasting, 2nd ed.; China Electric Power Press: Beijing, China, 2017; pp. 2–25.
12. Groß, A.; Lenders, A.; Schwenker, F.; Braun, D.A.; Fischer, D. Comparison of short-term electrical load forecasting methods for different building types. Energy Inform. 2021, 4 (Suppl. S3), 13.
13. Tai, N.-L.; Hou, Z.-J.; Li, T.; Jiang, C.; Song, J. Short-term load forecasting method for power system based on wavelet analysis. Proc. CSEE 2003, 23, 46–51.
14. Zhang, D.-H.; Sun, K.; He, J.-H. Short-term load forecasting based on similar days and multi-model fusion. Power Syst. Technol. 2023, 47, 1961–1970.
15. Huang, N.; Hu, Z.; Cai, G.; Yang, D. Short-term electrical load forecasting using mutual information-based feature selection with generalized minimum-redundancy and maximum-relevance criteria. Entropy 2016, 18, 330.
16. Zhang, P.-X.; Yin, X.-H.; Li, S.-Y.; Wang, L. Short-term electricity load forecasting for agricultural greenhouse park based on VMD-CNN-LSTM. Inf. Control 2023, 53, 238–249.
17. Zhu, M.-J.; Wang, C.-B.; Wang, T.-Z. An adaptive power load forecasting method based on multi-source data. Water Resour. Power 2017, 35, 200–203.
18. Hu, L.; Wang, J.; Guo, Z.; Zheng, T. Load forecasting based on LVMD-DBFCM load curve clustering and the CNN-IVIA-BLSTM model. Appl. Sci. 2023, 13, 7332.
19. Wang, H.-X.; Wang, B.; Chen, H.-K.; Liu, C.; Ma, F. Power data fusion: Basic concepts, abstract structure, key technologies and application scenarios. Power Supply Consum. 2020, 37, 24–32.
20. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M.A. Multi-sequence LSTM-RNN deep learning and metaheuristics for electric load forecasting. Energies 2020, 13, 391.
21. Chen, Z.-Y.; Liu, J.-B.; Li, C.; Ji, X.; Li, D.; Huang, Y.; Di, F.; Gao, X.; Xu, L. Ultra-short-term power load forecasting based on LSTM and XGBoost combined model. Power Syst. Technol. 2020, 44, 614–620.
22. Semmelmann, L.; Henni, S.; Weinhardt, C. Load forecasting for energy communities: A novel LSTM-XGBoost hybrid model based on smart meter data. Energy Inform. 2022, 5 (Suppl. S1), 24.
23. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
24. Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; Long, M. iTransformer: Inverted transformers are effective for time series forecasting. arXiv 2023, arXiv:2310.06625.
25. Wang, W.; Feng, B.; Huang, G.; Liu, Z.; Ji, W. Short-term net load forecasting based on self-attention encoder and deep neural network. Proc. CSEE 2023, 43, 9072–9084.
26. Zhang, S.; Liu, W.-X.; Tang, H.-Y.; Ma, Y.-J.; Wan, H.-Y.; Lu, Y. A short-term load forecasting method based on Transformer multi-feature fusion. J. North China Electr. Power Univ. (Nat. Sci. Ed.) 2023, 49, 1–9.
27. Meng, W.; Yu, B.; Bai, L.; Xu, J.; Gu, J.-H.; Guo, F. Short-term power net load forecasting based on STGCN-Transformer. China Meas. Test 2024, 49, 1–9.
28. Wang, M.; Xu, X.; Yan, Z. Online fault diagnosis of PV array considering label errors based on distributionally robust logistic regression. Renew. Energy 2023, 203, 68–80.
29. Wang, H.-L.; Zhang, Y.-J.; Mao, H.-P. Electric vehicle charging load forecasting method based on instantaneous charging probability. Electr. Power Autom. Equip. 2019, 39, 207–213.
30. Fan, S.-X.; Liu, X.-W.; Yu, Y.-J.; Zhang, W. Ultra-short term bus load forecasting method based on multi-source data and model fusion. Power Syst. Technol. 2021, 45, 243–250.
31. Singh, D.; Pal, N.; Sinha, S.K. Technical investigation on operational challenges of large-scale PV integration and opportunities with market restructuring, storages, green corridors, and AI. Microsyst. Technol. 2024, 30, 1109–1122.
32. Lu, D.-L.; Guo, J.-Y.; Wu, Y. Power system load forecasting method based on multi-source data driven under the influence of COVID-19. Power Supply Consum. 2022, 39, 74–80.
33. Long, Y.; Ruan, W.-J.; Liu, M.; Zhou, Y. Research on medium and long-term probabilistic load forecasting method based on data fusion. Power Demand Side Manag. 2024, 26, 9–15.
34. Yang, M.; Chang, H.; Li, W.; Wang, H.; Lin, J.; Tong, Z.; Zhang, W. High-Temperature Deformation and Skid Resistance of Steel Slag Asphalt Mixture Under Heavy Traffic Conditions. Buildings 2024, 14, 3990.
35. Zulfiqar, M.; Kamran, M.; Rasheed, M.B.; Alquthami, T.; Milyani, A.H. A short-term load forecasting model based on self-adaptive momentum factor and wavelet neural network in smart grid. IEEE Access 2022, 10, 77587–77602.
36. He, X.; Dong, H.; Yang, W.; Li, W. Multi-source information fusion technology and its application in smart distribution power system. Sustainability 2023, 15, 6170.
37. Zakaria, J. Principal Component Analysis (PCA) Explained; Built In: Chicago, IL, USA, 2023.

Figure 1. Steps for constructing an evaluation indicator system.

Figure 2. Multi-regional data preprocessing flow chart.

Figure 3. Shanghai load forecast results for 1 July 2024.

Figure 4. Shanghai load forecast results for July 2024.

Figure 5. Model total comparison.

Figure 6. June 2024 forecast comparison (The yellow shadow indicates the peak area).

Figure 7. June 2024 hourly error (The yellow shadow indicates the peak area).

Figure 8. June 2024 scatter comparison.

Figure 9. August 2024 forecast comparison (The yellow shadow indicates the peak area).

Figure 10. August 2024 hourly error (The yellow shadow indicates the peak area).

Figure 11. August 2024 scatter comparison.

Figure 12. December 2024 forecast comparison (The yellow shadow indicates the peak area).

Figure 13. December 2024 hourly error (The yellow shadow indicates the peak area).

Figure 14. December 2024 scatter comparison.

Figure 15. Visual comparisons of provincial evaluation radar.

Figure 16. Comparisons of time scale accuracy.

Figure 17. Comparison of peak and non-peak metrics.

Figure 18. Comparison of two regions’ metrics.

Figure 19. Time-varying weights between peak and non-peak.

Figure 20. Two regions’ time-varying weight divergence.

Figure 21. Heatmap of June weights.

Figure 22. Heatmap of August weights.

Figure 23. Heatmap of December weights.

Table 1. Load forecasting evaluation index system.

Primary Indicators	Secondary Indicators	Tertiary Indicators
Forecast Reliability	Historical Forecast Performance	Day-ahead Forecast Accuracy, Real-time Forecast Accuracy, Extreme Weather Forecast Accuracy
	Forecasting System Stability	System Update Frequency, Forecast Result Continuity, Abnormal Forecast Proportion
	Data Quality Level	Data Completeness Rate, Data Timeliness, Data Consistency
Provincial Load Impact	Load Scale Proportion	Maximum Load Proportion, Average Load Proportion, Peak Load Contribution Rate
	Regulation Capability	Peak Regulation Capacity Ratio, Renewable Energy Installation Ratio, Demand Response Capability
Forecasting Complexity	Load Fluctuation Characteristics	Daily Load Fluctuation Rate, Weekly Load Fluctuation Rate, Seasonal Fluctuation Intensity
	External Factor Sensitivity	Temperature Sensitivity, Humidity Sensitivity, Holiday Sensitivity
	Electricity Consumption Structure Complexity	Industrial Electricity Proportion, Number of Key Users, User Type Diversity

Table 2. Performance comparison between weighted fusion and direct aggregation.

Period	Weighted Fusion MAPE (%)	Direct Aggregation MAPE (%)	Overall Improvement (%)	Peak-Hour Improvement (%)	Non-Peak-Hour Improvement (%)
June 2024	1.92	5.36	64.2	69.3	57.8
August 2024	3.47	9.82	64.7	62.6	68.4
December 2024	1.41	4.23	66.7	59.1	71.6
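The overall improvement percentages in Table 2 follow from the two MAPE columns as a relative reduction, (direct − fused) / direct. The short sketch below recomputes them from the tabulated values and applies the same formula to the August peak-hour MAPE figures quoted in Section 5.2. The function and variable names are illustrative only.

```python
def relative_improvement(baseline_mape, fused_mape):
    """Percentage reduction in MAPE relative to the baseline."""
    return 100.0 * (baseline_mape - fused_mape) / baseline_mape


# (period, weighted fusion MAPE %, direct aggregation MAPE %) from Table 2.
table2 = [
    ("June 2024", 1.92, 5.36),
    ("August 2024", 3.47, 9.82),
    ("December 2024", 1.41, 4.23),
]

overall = {
    period: round(relative_improvement(direct, fused), 1)
    for period, fused, direct in table2
}

# August peak-hour MAPE values quoted in Section 5.2 (16.3% -> 6.1%).
august_peak = round(relative_improvement(16.3, 6.1), 1)

for period, value in overall.items():
    print(f"{period}: {value}% overall improvement")
print(f"August 2024 peak hours: {august_peak}% improvement")
```

The recomputed values agree with the Overall column of Table 2 and with the August peak-hour entry to one decimal place.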

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

