Abstract
With the increasing penetration of renewable energy in China’s power system, wide-band oscillations with multiple modes have emerged, posing new challenges to the assessment of renewable energy oscillation hosting capacity. At present, input features for artificial intelligence-based assessment models are still selected largely according to researchers’ subjective experience, a practice that lacks theoretical justification. Moreover, as system scale grows, data dimensionality increases and with it the risk of model overfitting. To address these issues, this paper proposes a two-stage key feature selection method based on participation factors and XGBoost. First, participation factor theory is used to establish the functional mapping between system electrical quantities and oscillatory characteristics, enabling an initial identification of the electrical variables most relevant to renewable energy oscillation hosting capacity. Second, to mitigate the curse of dimensionality in large-scale systems, a variational autoencoder is employed to compress the initial feature set and extract its latent representations. Finally, XGBoost is applied to these latent representations to identify the most critical features, namely those that accurately reflect the oscillation hosting capacity of renewable energy. Experimental results on a wide-band oscillation dataset show that active power achieves the highest importance score among all features. Moreover, a model using only active-power data attains an accuracy of approximately 97%, indicating that active power alone is an effective key feature subset, the most strongly correlated with hosting capacity and the least redundant.
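For readers unfamiliar with the first stage, the textbook definition of a participation factor is recalled here as a brief sketch. The symbols below (state matrix $A$, right eigenvector $\phi_i$, left eigenvector $\psi_i$, participation factor $p_{ki}$) are notation assumed for illustration and do not come from the text above; the formulation actually used in this paper may differ.

\[
A\,\phi_i = \lambda_i\,\phi_i, \qquad \psi_i^{\top} A = \lambda_i\,\psi_i^{\top}, \qquad \psi_i^{\top}\phi_i = 1,
\]
\[
p_{ki} = \psi_{ik}\,\phi_{ki}, \qquad \sum_{k} p_{ki} = 1 .
\]

Under this normalization, $p_{ki}$ measures the relative participation of the $k$-th state variable in the $i$-th oscillation mode $\lambda_i$, so state variables with large $|p_{ki}|$ for a poorly damped mode are natural candidates for the initial feature set.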