Abstract
Traditional single-prediction models often fall short of wind power prediction requirements in complex operational scenarios. Furthermore, the inherent “black-box” nature of deep learning models limits the interpretability of their predictions, which hinders effective support for grid dispatch planning. To address these issues, this study proposes a novel day-ahead wind power prediction method, referred to as SHapley Additive exPlanations (SHAP)–Mixture of Experts (MoE), which integrates SHAP into an MoE framework. SHAP serves two purposes here: it provides interpretability, and its attributions are transformed into prior knowledge that guides the decision-making of the MoE gating network. The study also proposes a two-layer dynamic interpretation mechanism based on the collaborative analysis of gating weights and SHAP values. This approach identifies the key meteorological factors and the scenarios in which each expert model holds an advantage, while quantifying the uncertainty among multiple expert decisions. The method proceeds in four steps. First, each expert model is pre-trained and its parameters are frozen to construct a candidate expert pool. Second, SHAP vectors are computed for each pre-trained expert over all sample features to characterize its decision-making logic under varying scenarios. Third, an augmented feature set is constructed by fusing the original meteorological features with the SHAP attribution matrices from all experts; this set is used to train the gating network within the MoE framework. Finally, for each new input sample, every frozen expert model generates a prediction along with its corresponding SHAP vector, and the gating network aggregates these predictions to produce the final forecast. The proposed method was validated using operational data from an offshore wind farm located in southeastern China. Compared with the best individual expert model and traditional ensemble forecasting models, it reduces the Root Mean Square Error (RMSE) by 0.23% to 4.92%. Furthermore, the method elucidates the influence of key features on each expert’s decisions, offering insights into how the gating network adaptively selects experts based on the input features and expert-specific characteristics across different scenarios.
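The four-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the synthetic data, the two linear experts, the softmax gating network trained by plain gradient descent, and all variable names. Linear experts are chosen only because their SHAP values have a closed form under feature independence, phi_j = w_j (x_j - E[x_j]); the experts and SHAP explainers used in the study may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 3 "meteorological" features, regime-dependent target.
n, d = 400, 3
X = rng.normal(size=(n, d))
y = np.where(X[:, 0] > 0, 2.0 * X[:, 0] + X[:, 1], -X[:, 0] + 0.5 * X[:, 2])

def fit_linear(X, y):
    """Least-squares linear expert; returns (weights, bias)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def softmax(a):
    """Row-wise softmax with max-subtraction for numerical stability."""
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Step 1: pre-train each expert (here, one per regime) and treat it as frozen.
regime = X[:, 0] > 0
experts = [fit_linear(X[regime], y[regime]), fit_linear(X[~regime], y[~regime])]
background_mean = X.mean(axis=0)  # reference point for the SHAP values

def expert_outputs(X):
    """Step 2: predictions and closed-form linear SHAP values of every expert."""
    preds = np.column_stack([X @ w + b for w, b in experts])            # (n, K)
    shap_vals = np.hstack([(X - background_mean) * w for w, _ in experts])  # (n, K*d)
    return preds, shap_vals

def augment(X, shap_vals):
    """Step 3: fuse original features with all experts' SHAP matrices (+ bias)."""
    return np.hstack([X, shap_vals, np.ones((len(X), 1))])

preds, shap_vals = expert_outputs(X)
Z = augment(X, shap_vals)

# Train only the gating parameters G by gradient descent on the mean squared error.
G = np.zeros((Z.shape[1], len(experts)))
for _ in range(300):
    g = softmax(Z @ G)                      # gating weights, rows sum to 1
    y_hat = (g * preds).sum(axis=1)         # weighted expert aggregation
    # d(y_hat)/d(logit_k) = g_k * (pred_k - y_hat)
    grad_logits = (2.0 * (y_hat - y) / n)[:, None] * g * (preds - y_hat[:, None])
    G -= 0.05 * Z.T @ grad_logits

# Step 4: for new samples, the frozen experts predict and the gate aggregates.
X_new = rng.normal(size=(5, d))
preds_new, shap_new = expert_outputs(X_new)
weights_new = softmax(augment(X_new, shap_new) @ G)
forecast = (weights_new * preds_new).sum(axis=1)
```

Note that with linear experts the SHAP columns are linear functions of the inputs, so in this toy setting they give the gate no information beyond the raw features; the sketch only shows the data flow. With nonlinear experts the attributions vary with the input in ways the raw features do not expose directly, which is the situation the augmented feature set is meant to exploit.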