Abstract
Human resources are among the most active and critical elements of corporate management, and frequent employee turnover can impose substantial losses on firms. Accurately predicting turnover and identifying turnover propensity in advance is therefore important for organizational development. To improve turnover prediction performance, this study proposes a model that integrates an improved ADASYN data rebalancing algorithm with a grid-search-optimized CatBoost classifier. In practice, turnover instances typically constitute the minority class, and severe class imbalance may lead to overfitting or underfitting and thus degrade predictive performance. To mitigate this imbalance, we employ ADASYN oversampling to reduce the skew of the class distribution. However, because ADASYN is designed primarily for continuous features, it may generate invalid or meaningless values when discrete variables are present. Accordingly, we improve ADASYN by introducing a new distance metric and an enhanced sample generation strategy, making it applicable to turnover data with mixed (continuous and discrete) features. Given CatBoost’s strong predictive capability in high-dimensional settings, we adopt it as the base learner. CatBoost performance is, however, highly sensitive to hyperparameter choices, and different parameter combinations can yield markedly different results. We therefore apply grid search (GS) to search a predefined set of CatBoost hyperparameter combinations systematically and select the best-performing configuration. Experimental results on three datasets show that the proposed improved-ADASYN GS-CatBoost model enhances turnover prediction performance and exhibits strong robustness and adaptability. Compared with existing models, our approach improves predictive accuracy by approximately 4.6112%.
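To make the idea of oversampling mixed-type data concrete, the following minimal Python sketch shows one way an ADASYN-style procedure could be adapted to discrete features. It is an illustration under stated assumptions, not the method proposed in this study: the abstract does not specify the distance metric or the generation strategy, so the sketch assumes a HEOM-style distance (range-normalized differences for continuous features, 0/1 mismatch for discrete ones), linear interpolation for continuous features, and a majority vote among minority neighbours for discrete features. The function names `mixed_distance` and `adasyn_mixed` are hypothetical, and the per-sample allocation is done by weighted random sampling rather than by the rounded counts of the original ADASYN.

```python
import random
from collections import Counter


def mixed_distance(a, b, cat_idx, ranges):
    """HEOM-style distance (an assumed metric, not the paper's):
    range-normalized difference for continuous features,
    0/1 mismatch for discrete features."""
    total = 0.0
    for f, (x, y) in enumerate(zip(a, b)):
        if f in cat_idx:
            d = 0.0 if x == y else 1.0
        else:
            d = abs(x - y) / ranges[f] if ranges[f] > 0 else 0.0
        total += d * d
    return total ** 0.5


def adasyn_mixed(X, y, minority, cat_idx, k=3, seed=0):
    """Return synthetic minority rows that balance the two classes."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # Value range of every continuous feature, used for normalization.
    ranges = {
        f: max(row[f] for row in X) - min(row[f] for row in X)
        for f in range(n_features) if f not in cat_idx
    }
    min_rows = [i for i, label in enumerate(y) if label == minority]
    n_needed = (len(y) - len(min_rows)) - len(min_rows)
    if n_needed <= 0 or len(min_rows) < 2:
        return []

    def nearest(i, candidates):
        # The k candidates closest to row i under the mixed distance.
        ordered = sorted(
            candidates,
            key=lambda j: mixed_distance(X[i], X[j], cat_idx, ranges),
        )
        return ordered[:k]

    # ADASYN difficulty weight: share of majority rows among the
    # k nearest neighbours of each minority row.
    weights = []
    for i in min_rows:
        nn = nearest(i, [j for j in range(len(X)) if j != i])
        weights.append(sum(y[j] != minority for j in nn) / len(nn))
    if sum(weights) == 0:
        weights = [1.0] * len(min_rows)

    synthetic = []
    for _ in range(n_needed):
        # Harder minority rows are picked more often as seeds.
        i = rng.choices(min_rows, weights=weights, k=1)[0]
        neighbours = nearest(i, [j for j in min_rows if j != i])
        j = rng.choice(neighbours)
        lam = rng.random()
        new_row = []
        for f in range(n_features):
            if f in cat_idx:
                # Discrete feature: take the most common value among the
                # seed and its minority neighbours, so only observed
                # categories can appear.
                votes = Counter(X[n][f] for n in [i] + neighbours)
                new_row.append(votes.most_common(1)[0][0])
            else:
                # Continuous feature: interpolate between seed and neighbour.
                new_row.append(X[i][f] + lam * (X[j][f] - X[i][f]))
        synthetic.append(new_row)
    return synthetic
```

Because discrete features are voted on rather than interpolated, every synthetic row carries a category that actually occurs among minority samples, which avoids the invalid intermediate values that plain ADASYN can produce. The rebalanced data could then be passed to any classifier whose hyperparameters are tuned by grid search.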