This section evaluates the data balancing performance of the improved ADASYN algorithm (IADASYN) and the predictive performance of the IADASYN-GS-CatBoost model. All experiments used 10-fold cross-validation: the results of the 10 iterations were recorded, and their average was taken as the final value of each evaluation metric.
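The evaluation protocol can be sketched as follows. This is a minimal illustration only: it assumes stratified fold assignment (the text does not say how folds were drawn) and uses a placeholder classifier, and the helper names `stratified_folds`, `cross_validate`, and `majority_classifier` are hypothetical.

```python
import random
from statistics import mean


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, class by class (assumed stratification)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate(X, y, fit_predict, metric, k=10, seed=0):
    """Run k iterations, holding out one fold each time, and average the metric."""
    scores = []
    for held_out in stratified_folds(y, k, seed):
        test_idx = set(held_out)
        train_idx = [i for i in range(len(y)) if i not in test_idx]
        y_pred = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in held_out],
        )
        scores.append(metric([y[i] for i in held_out], y_pred))
    return mean(scores), scores


def majority_classifier(X_train, y_train, X_test):
    """Placeholder model: always predicts the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Toy imbalanced data: 80 retained (0) and 20 departed (1) employees.
X = [[i] for i in range(100)]
y = [0] * 80 + [1] * 20
avg_acc, fold_scores = cross_validate(X, y, majority_classifier, accuracy)
print(f"mean accuracy over {len(fold_scores)} iterations: {avg_acc:.2%}")
```

In a real run, `fit_predict` would wrap the rebalancing step and the classifier, so that any oversampling is fitted on the training folds only.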
4.4.1. Comparative Experiments
To evaluate the effectiveness of IADASYN-GS-CatBoost in employee turnover prediction, the proposed method was compared with several hybrid feature processing approaches, including Logistic Regression (LR) [47], Passive-Aggressive Stochastic Gradient Descent (PMSGD) [13], and Ensemble Random Forest (Ensemble RF) [48]. Table 3, Table 4, and Table 5 present the prediction results of the four methods on Dataset-I, Dataset-II, and Dataset-III, respectively.
As shown in Table 3, for Dataset-I, the proposed IADASYN-GS-CatBoost method achieved an average Accuracy, Precision, Recall, F1-Score, and AUC of 93.29%, 92.01%, 93.48%, 92.30%, and 93.24%, respectively. Across all five metrics, it consistently outperformed LR, PMSGD, and Ensemble RF. This superior performance is attributed to the improved ADASYN algorithm, which selectively generates samples in locally similar regions of the feature space, thus achieving localized balancing and avoiding the indiscriminate oversampling typical of traditional SMOTE algorithms. The improvement also addresses the challenges posed by the coexistence of continuous and discrete features in employee turnover data, thereby enhancing classification accuracy. In addition, the CatBoost model, built on the boosting framework, reduces residual errors during training, and the use of Grid Search (GS) for hyperparameter optimization helps prevent overfitting, which enhances both predictive performance and the model’s generalization capability.
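The idea of localized balancing can be illustrated with the allocation step of standard ADASYN combined with a mixed-type distance. This is a sketch under stated assumptions, not the IADASYN procedure itself: the Gower-style distance (range-scaled absolute difference for continuous features, 0/1 mismatch for discrete ones), the toy data, and the function names `mixed_distance` and `adasyn_allocation` are hypothetical choices made for illustration.

```python
def mixed_distance(a, b, is_discrete, spans):
    """Gower-style distance: 0/1 mismatch for discrete features,
    range-scaled absolute difference for continuous ones (assumed here)."""
    total = 0.0
    for x, y, disc, span in zip(a, b, is_discrete, spans):
        if disc:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y) / span if span else 0.0
    return total / len(a)


def adasyn_allocation(X, y, is_discrete, k=3, minority=1):
    """Return {minority sample index: number of synthetic samples to generate}."""
    spans = [
        0.0 if is_discrete[j] else max(r[j] for r in X) - min(r[j] for r in X)
        for j in range(len(X[0]))
    ]
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    # Number of synthetic samples needed to fully balance the two classes.
    n_needed = (len(y) - len(minority_idx)) - len(minority_idx)

    # Share of majority-class points among the k nearest neighbours.
    hardness = {}
    for i in minority_idx:
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: mixed_distance(X[i], X[j], is_discrete, spans),
        )[:k]
        hardness[i] = sum(y[j] != minority for j in neighbours) / k

    total = sum(hardness.values())
    if total == 0:  # no minority point has a majority-class neighbour
        return {i: 0 for i in minority_idx}
    # Harder neighbourhoods receive proportionally more synthetic samples.
    return {i: round(hardness[i] / total * n_needed) for i in minority_idx}


# Toy data: (age, department); label 1 marks employees who left.
X = [
    [30, "sales"], [32, "sales"], [34, "sales"], [36, "sales"],
    [45, "it"], [50, "it"], [55, "it"], [60, "it"],
    [33, "sales"], [20, "hr"], [21, "hr"], [22, "hr"],
]
y = [0] * 8 + [1] * 4
allocation = adasyn_allocation(X, y, is_discrete=[False, True], k=3)
print(allocation)
```

In this toy example the minority sample surrounded by majority-class neighbours (index 8) receives more synthetic samples than the three minority samples that sit in their own cluster. The generation step itself, which interpolates between a minority sample and its minority neighbours, is omitted.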
For Dataset-II, the proposed IADASYN-GS-CatBoost method achieved an Accuracy, Precision, Recall, F1-Score, and AUC of 96.81%, 94.01%, 98.99%, 96.91%, and 96.80%, respectively. Compared with LR, PMSGD, and Ensemble RF methods, the proposed method outperformed all others across the five metrics. Specifically, compared to the relatively strong-performing Ensemble RF method, the proposed method improved Accuracy, Precision, Recall, F1-Score, and AUC by approximately 6.28%, 1.29%, 8.28%, 4.67%, and 4.84%, respectively.
For Dataset-III, the proposed method also demonstrated the best performance across all five metrics. Compared to the PMSGD method, which balances data using the SMOTE algorithm and optimizes decision trees with a genetic algorithm, the proposed method achieved improvements of approximately 6.31%, 3.40%, 7.72%, 5.64%, and 6.20% in Accuracy, Precision, Recall, F1-Score, and AUC, respectively. Furthermore, compared to the ensemble-based Ensemble RF method, the proposed approach improved the five metrics by approximately 3.93%, 2.07%, 7.12%, 4.67%, and 3.99%, respectively.
The experimental results across all three datasets demonstrate that the IADASYN-GS-CatBoost method not only achieves superior predictive performance but also exhibits strong stability and generalization ability when applied to different employee turnover datasets.
4.4.2. Ablation Experiments
The ablation study consists of two parts: I) evaluation of the superiority of the GS-CatBoost model, and II) validation of the effectiveness of the improved ADASYN sample generation method.
This section evaluates the predictive performance of the GS-CatBoost model for employee turnover prediction.
Table 6, Table 7, and Table 8 present the experimental comparison results of GS-CatBoost against classical classification algorithms on Dataset-I, Dataset-II, and Dataset-III, respectively.
As shown in Table 6, Table 7, and Table 8, for Dataset-I, the proposed GS-CatBoost method achieved average Accuracy, Precision, Recall, F1-Score, and AUC values of 87.29%, 88.89%, 36.37%, 51.62%, and 67.66%, respectively. For Dataset-II, the GS-CatBoost method achieved Accuracy, Precision, Recall, F1-Score, and AUC values of 94.41%, 92.47%, 96.62%, 94.50%, and 94.46%, respectively. For Dataset-III, the GS-CatBoost method achieved Accuracy, Precision, Recall, F1-Score, and AUC values of 96.95%, 92.78%, 94.12%, 93.44%, and 95.96%, respectively. The experimental results indicate that, on all three datasets, GS-CatBoost consistently achieved the best performance on all five evaluation metrics.
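As a consistency check on the figures above, the F1-Score can be recomputed as the harmonic mean of Precision and Recall. The short sketch below uses only the values quoted in this paragraph; the function name `f1_score` is an illustrative helper, not part of the study's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)


# (Precision, Recall, reported F1-Score) for GS-CatBoost, as quoted in the text.
reported = {
    "Dataset-I": (88.89, 36.37, 51.62),
    "Dataset-II": (92.47, 96.62, 94.50),
    "Dataset-III": (92.78, 94.12, 93.44),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.2f}%, reported = {f1:.2f}%")
```

The recomputed values agree with the reported ones to within about 0.01 percentage points. Small deviations are expected, because an F1-Score averaged over folds need not equal the F1-Score computed from averaged Precision and Recall. The Dataset-I row also shows the typical imbalance pattern: high Precision with low Recall pulls the F1-Score down sharply.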
By comparing Accuracy and Precision, it can be observed that the GS-CatBoost method outperforms the five classical prediction algorithms. Compared with the CatBoost algorithm, which had the highest performance among the comparison models, the proposed method achieved an average improvement of 8.90% in Accuracy and 3.91% in Precision. This indicates that the proposed GS-CatBoost method provides the highest prediction accuracy for both departing and retained employees.
For Dataset-I, the Recall rates of the comparison algorithms (RF, LSSVM, NB, BPNN, CatBoost) were all below 32%, and their F1-Scores were all below 47%. For Dataset-II, the Recall rates remained below 95% and the F1-Scores below 93%. For Dataset-III, the Recall rates were below 93%, and the F1-Scores were below 92%. In contrast, GS-CatBoost achieved Recall rates of 36.37%, 96.62%, and 94.12%, and F1-Scores of 51.62%, 94.50%, and 93.44% across the three datasets, significantly outperforming the other methods. Moreover, the AUC values obtained by the five comparison algorithms on the three datasets were all below 90%, noticeably lower than those achieved by GS-CatBoost.
These results demonstrate that, through optimized parameter tuning, GS-CatBoost not only improves the prediction of employees who leave but also substantially enhances the prediction performance for retained employees. However, because of data imbalance, the prediction accuracy for employees who leave remains comparatively low for all algorithms.
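The parameter tuning referred to here is an exhaustive grid search. A minimal sketch of the idea is given below; the grid values and the scoring function `toy_score` are hypothetical stand-ins, since the parameters and ranges actually searched are not listed in this section. The parameter names mirror common CatBoost hyperparameters.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; the values searched in the study are not given here.
param_grid = {
    "depth": [4, 6, 8],
    "learning_rate": [0.03, 0.1],
    "iterations": [200, 500],
}


def toy_score(params):
    """Stand-in for a cross-validated score; replace with real model evaluation."""
    return (
        -abs(params["depth"] - 6)
        - abs(params["learning_rate"] - 0.1)
        - abs(params["iterations"] - 500) / 1000
    )


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In practice `score_fn` would train the model with the given parameters and return a cross-validated metric, so the cost grows with the product of the grid sizes (here 3 × 2 × 2 = 12 evaluations).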
To validate the effectiveness of the proposed improved ADASYN (IADASYN), the training data were rebalanced with both standard ADASYN and IADASYN. For each rebalanced dataset, turnover prediction was then conducted with five classifiers (RF, LSSVM, NB, BPNN, and GS-CatBoost), and the results were compared under the same evaluation protocol. Five metrics were used (Accuracy, Precision, Recall, F1-Score, and AUC), and the average performance over the three datasets is reported as the final result. The experimental results are shown in Table 9.
As shown in Table 9, models trained on the datasets generated by IADASYN consistently outperform those trained on the datasets generated by standard ADASYN, to varying degrees. Specifically, (i) Accuracy increases across all five models, with an average improvement of approximately 1.342%; (ii) for Precision, IADASYN-BPNN and IADASYN-GS-CatBoost improve over ADASYN-BPNN and ADASYN-GS-CatBoost by 2.05% and 0.83%, respectively; (iii) for Recall, IADASYN-BPNN and IADASYN-GS-CatBoost improve by 1.55% and 1.03%, respectively; (iv) for F1-Score, IADASYN-BPNN and IADASYN-GS-CatBoost improve by 1.06% and 1.14%, respectively; and (v) for AUC, improvements of 1.32% and 1.18% are observed for IADASYN-BPNN and IADASYN-GS-CatBoost, respectively. These results indicate that, across different datasets and predictive models, IADASYN yields more balanced and stable training data, thereby improving predictive performance.
4.4.3. Comparison with Current Research
We compare the method proposed in this study against sixteen recently introduced approaches on the public IBM human-resources dataset (Dataset-III); the prediction accuracies of all methods are reported in Table 10. As Table 10 shows, references [44,49] employed Decision Tree models for attrition prediction and achieved an accuracy of 83.44%. Logistic Regression (LR), owing to its solid predictive performance, has been adopted in multiple studies: references [9,50] reported prediction accuracies of 87.50% and 87.00%, respectively, while reference [8] further improved accuracy to 88.43% by ensembling LR models. Support Vector Machines (SVMs) have likewise been widely used: reference [51] attained 88.44% accuracy with SVM, reference [13] reported 84% accuracy using SVM, and reference [52] achieved 92.50% using linear combinations of SVMs. Random Forest (RF) was applied in references [14,53,54], yielding accuracies of 80%, 85.11%, and 87.298%, respectively.
Boosting-based methods have demonstrated high predictive accuracy in employee attrition prediction. Reference [13] obtained 86.02% accuracy with XGBoost, and reference [13] used CatBoost to increase accuracy to 89.45%. Deep learning has also attracted considerable attention: reference [55] employed deep neural networks (DNNs) to reach 89.11% accuracy, reference [56] combined a deep autoencoder optimized by a genetic algorithm with KNN (GA-Deep-Autoencoder-KNN) to reach 90.95% accuracy, and reference [57] used an ensemble bidirectional temporal convolutional network (Ensemble Bi-TCN) to achieve 92.17%. Our IADASYN-GS-CatBoost model likewise attains high predictive accuracy, with an average accuracy of 97.48%.
Compared with the Linear-SVM with feature fusion and Ensemble Bi-TCN + GAN methods, the accuracy of our method is higher by 4.98 and 5.31 percentage points, respectively; compared with the remaining models, it exhibits an even more pronounced advantage in attrition-prediction accuracy.
Table 10. Comparison of the proposed model with current research on Dataset-III.
| Reference | Year | Model | Accuracy |
|---|---|---|---|
| Reference [49] | 2019 | Decision Tree (DT) | 83.44% |
| Reference [9] | 2020 | Logistic Regression (LR) | 87.50% |
| Reference [50] | 2020 | Logistic Regression (LR) | 87.00% |
| Reference [8] | 2021 | Ensemble LR (ELR) | 88.43% |
| Reference [51] | 2020 | Support Vector Machine (SVM) | 88.44% |
| Reference [14] | 2021 | Random Forest (RF) | 87.30% |
| Reference [41] | 2021 | LR with feature selection | 81.00% |
| Reference [55] | 2021 | Deep Neural Network (DNN) | 89.11% |
| Reference [58] | 2021 | Artificial Neural Network (ANN) | 84.00% |
| Reference [54] | 2021 | Random Forest (RF) | 85.11% |
| Reference [53] | 2022 | k-Nearest Neighbor (KNN) | 84.00% |
| Reference [13] | 2022 | CatBoost | 89.45% |
| Reference [13] | 2023 | XGBoost | 86.02% |
| Reference [52] | 2024 | Linear-SVM with feature fusion | 92.50% |
| Reference [56] | 2024 | GA-DeepAutoencoder-KNN | 90.95% |
| Reference [57] | 2025 | Ensemble Bi-TCN + GAN | 92.17% |
| Proposed method | 2025 | IADASYN-GS-CatBoost | 97.48% |
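
The accuracy margins discussed in this subsection can be reproduced directly from Table 10. The sketch below simply copies the table's accuracy values and reports the gap, in percentage points, between the proposed method and each baseline; the dictionary keys are shorthand labels introduced here for readability.

```python
# Accuracy values (%) copied from Table 10; keys are shorthand labels.
baselines = {
    "Decision Tree [49]": 83.44,
    "LR [9]": 87.50,
    "LR [50]": 87.00,
    "Ensemble LR [8]": 88.43,
    "SVM [51]": 88.44,
    "RF [14]": 87.30,
    "LR with feature selection [41]": 81.00,
    "DNN [55]": 89.11,
    "ANN [58]": 84.00,
    "RF [54]": 85.11,
    "KNN [53]": 84.00,
    "CatBoost [13]": 89.45,
    "XGBoost [13]": 86.02,
    "Linear-SVM with feature fusion [52]": 92.50,
    "GA-DeepAutoencoder-KNN [56]": 90.95,
    "Ensemble Bi-TCN + GAN [57]": 92.17,
}
proposed = 97.48  # IADASYN-GS-CatBoost

# Gap in percentage points between the proposed method and each baseline.
margins = {name: round(proposed - acc, 2) for name, acc in baselines.items()}
closest = min(margins, key=margins.get)

for name, gap in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name:38s} +{gap:.2f} pp")
print(f"closest competitor: {closest}")
```

The smallest gap is to the Linear-SVM with feature fusion method, which matches the 4.98-point margin quoted in the text.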