Predicting Employee Turnover Based on Improved ADASYN and GS-CatBoost

Shuigen Hu; Kai Dong

doi:10.3390/math14020313

School of Public Affairs, Zhejiang University, Hangzhou 310030, China

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(2), 313; https://doi.org/10.3390/math14020313

This article belongs to the Section E5: Financial Mathematics

Abstract

In corporate management practices, human resources are among the most active and critical elements, and frequent employee turnover can impose substantial losses on firms. Accurately predicting employee turnover dynamics and identifying turnover propensity in advance is therefore of significant importance for organizational development. To improve turnover prediction performance, this study proposes an employee turnover prediction model that integrates an improved ADASYN data rebalancing algorithm with a grid-search-optimized CatBoost classifier. In practice, turnover instances typically constitute a minority class; severe class imbalance may lead to overfitting or underfitting and thus degrade predictive performance. To mitigate imbalance, we employ ADASYN oversampling to reduce skewness in the dataset. However, because ADASYN is primarily designed for continuous features, it may generate invalid or meaningless values when discrete variables are present. Accordingly, we improve ADASYN by introducing a new distance metric and an enhanced sample generation strategy, making it applicable to turnover data with mixed (continuous and discrete) features. Given CatBoost’s strong predictive capability in high-dimensional settings, we adopt CatBoost as the base learner. Nonetheless, CatBoost performance is highly sensitive to hyperparameter choices, and different parameter combinations can yield markedly different results. Therefore, we apply grid search (GS) to efficiently optimize CatBoost hyperparameters and obtain the best-performing configuration. Experimental results on three datasets demonstrate that the proposed improved-ADASYN GS-CatBoost model effectively enhances turnover prediction performance, exhibiting strong robustness and adaptability. Compared with existing models, our approach improves predictive accuracy by approximately 4.6112%.

Keywords:

employee turnover prediction; CatBoost; ADASYN; imbalanced data; grid search
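
To make the idea of oversampling mixed (continuous and discrete) features concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not specify the new distance metric or the enhanced generation strategy, so this sketch substitutes a Gower-style distance (range-scaled differences for continuous features, 0/1 mismatch for discrete ones) and copies each discrete value from one of the two parent samples instead of interpolating it. The function names `mixed_distance` and `adasyn_mixed`, the parameter `cat_idx`, and the weighted random draw used in place of ADASYN's per-sample quota are all assumptions made for illustration.

```python
import random


def mixed_distance(a, b, cat_idx, ranges):
    """Gower-style distance (an assumption, not the paper's metric):
    range-scaled absolute difference for continuous features,
    0/1 mismatch for discrete features, averaged over all features."""
    total = 0.0
    for f, (x, y) in enumerate(zip(a, b)):
        if f in cat_idx:
            total += 0.0 if x == y else 1.0
        elif ranges[f]:
            total += abs(x - y) / ranges[f]
    return total / len(a)


def adasyn_mixed(X, y, cat_idx, minority=1, k=3, seed=0):
    """Return synthetic minority rows that balance the two classes."""
    rng = random.Random(seed)
    cat_idx = set(cat_idx)
    n_feat = len(X[0])
    # Value range of each continuous feature, used to scale distances.
    ranges = [
        0.0 if f in cat_idx
        else max(row[f] for row in X) - min(row[f] for row in X)
        for f in range(n_feat)
    ]
    minority_rows = [i for i, label in enumerate(y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    if n_needed <= 0 or len(minority_rows) < 2:
        return []

    def nearest(i, candidates):
        # The k candidates closest to row i under the mixed distance.
        ordered = sorted(
            (j for j in candidates if j != i),
            key=lambda j: mixed_distance(X[i], X[j], cat_idx, ranges),
        )
        return ordered[:k]

    # ADASYN idea: minority points surrounded by more majority
    # neighbours are harder to learn and receive a larger weight.
    weights = [
        sum(y[j] != minority for j in nearest(i, range(len(X)))) / k
        for i in minority_rows
    ]
    if sum(weights) == 0:
        weights = [1.0] * len(minority_rows)

    synthetic = []
    for i in rng.choices(minority_rows, weights=weights, k=n_needed):
        j = rng.choice(nearest(i, minority_rows))
        gap = rng.random()
        row = []
        for f in range(n_feat):
            if f in cat_idx:
                # Discrete feature: copy a value that exists in a parent.
                row.append(rng.choice([X[i][f], X[j][f]]))
            else:
                # Continuous feature: interpolate between the parents.
                row.append(X[i][f] + gap * (X[j][f] - X[i][f]))
        synthetic.append(row)
    return synthetic
```

Because discrete values are only ever copied from existing minority samples, the generated rows cannot contain fractional or otherwise invalid category codes, which is the failure mode the abstract attributes to standard ADASYN. The balanced data could then be passed to a classifier whose hyperparameters are tuned by grid search, as the abstract describes.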
