Abstract
Data-driven evolutionary algorithms (DDEAs) are essential computational intelligent methods for solving expensive optimization problems (EOPs). The management of surrogate models for fitness predictions, particularly the selection and integration of multiple models, is key to their success. However, how to select and integrate models to obtain accurate predictions remains a challenging issue. This paper proposes a novel Gumbel-based selection DDEA named GBS-DDEA, which innovates in both aspects of model selection and integration. First, a Gumbel-based selection (GBS) strategy is proposed to probabilistically choose surrogate models. GBS employs the Gumbel-based distribution to strike a balance between exploiting high-accuracy models and exploring others, providing a more principled and robust selection strategy than conventional probability sampling. Second, a ranking-based weighting ensemble (RBWE) strategy is developed. Instead of relying on absolute error metrics that can be sensitive to outliers, RBWE assigns integration weights based on the models’ relative performance rankings, leading to a more stable and reliable ensemble prediction. Comprehensive experiments on various benchmark problems and a Chinese text-based cheating official accounts mining problem demonstrate that GBS-DDEA consistently outperforms several state-of-the-art DDEAs, confirming the effectiveness and superiority of the proposed dual-strategy approach.