The Counterfactual–Dialectical Optimization Framework: A Prescriptive Approach to Employee Attrition Management with Empirical Validation
Abstract
1. Introduction
- We formalize a multi-layered methodology that combines predictive risk modeling, robust causal inference, and budget-constrained optimization to generate actionable, ROI-driven retention plans.
- To directly address the limitations of purely synthetic analyses, we first illustrate the framework’s mechanics on a synthetic dataset, then conduct a full proof-of-concept on the empirical ‘Saudi Employee Attrition Dataset,’ grounding our validation in a real-world context.
- Our framework moves beyond simple risk scores by explicitly incorporating real-world constraints, including limited budgets and a novel Employee Importance Score, ensuring resources are strategically allocated to critical, at-risk employees.
2. Background and Literature Review
2.1. The Evolving Landscape of AI in HR and the Challenge of Fairness
2.2. Predictive Modelling of Employee Attrition
2.3. Causal Inference and Uplift Modelling in Business
2.4. Optimization and Resource Allocation in HR
2.5. The Present Contribution: A Synthesis
3. The Proposed Framework
3.1. Overall Framework Architecture
- Input Data: The framework ingests historical employee data, including demographic, role-based, and behavioral features, alongside the historical attrition outcome.
- Predictive & Causal Layers: This data is processed in parallel by the Predictive Layer, which estimates attrition risk, and the Causal Layer, which estimates the effect of potential interventions.
- Optimization Layer: The outputs of the first two layers, (1) the risk scores and (2) the uplift scores (ATEs), are fed into the Optimization Layer. This layer applies a novel Counterfactual–Dialectical optimization process to generate the final retention plan under a budget constraint [29].
- Outputs: The final outputs are a tactical retention plan, specifying which employee should receive which intervention, and strategic insights into the consistent drivers of both risk and intervention effectiveness.
3.2. Algorithmic Summary
3.3. Detailed Layer Descriptions
- Thesis: Identify the intervention with the maximum raw attrition reduction (most negative ATE).
- Constraint Check: Determine if this thesis intervention is affordable within the remaining budget.
- Antithesis & Synthesis: If the thesis is unaffordable, an antithesis is generated by identifying all other affordable interventions. A synthesis is then formed by selecting the affordable intervention that offers the highest ROI, calculated as ROI_k = |ATE_k| / c_k, where c_k is the cost of intervention k.
- Allocation: The synthesized (or original thesis) plan is assigned, and the budget is updated.
- One explaining attrition risk (from the predictive Random Forest model).
- Another explaining the likelihood of receiving a treatment (from the L1-regularized propensity score model).
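The reconciliation idea behind CFCA can be sketched as intersecting the top-ranked features of the two explanation vectors. This is a minimal illustration, not the paper's implementation; the feature names and importance values below are invented placeholders, not SHAP values from either model.

```python
# Toy sketch of CFCA-style reconciliation: keep only the features that rank
# in the top k of BOTH explanation vectors (risk model and propensity model).
def consistent_features(phi_risk, phi_propensity, top_k=3):
    """Return the set of features ranked in the top_k of both vectors."""
    def top(phi):
        return set(sorted(phi, key=lambda f: abs(phi[f]), reverse=True)[:top_k])
    return top(phi_risk) & top(phi_propensity)

# Illustrative importance magnitudes (hypothetical, not from the paper).
phi_risk = {"salary": 0.30, "tenure": 0.25, "overtime": 0.20, "age": 0.05}
phi_prop = {"salary": 0.40, "overtime": 0.22, "age": 0.18, "tenure": 0.02}

shared = consistent_features(phi_risk, phi_prop)  # features important in both
```

Here `shared` contains only `salary` and `overtime`: each is top-3 in both vectors, whereas `tenure` and `age` rank highly in only one model.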
4. Methodology and Experimental Setup
4.1. The CDO Framework: Formal Definitions
4.1.1. Predictive Layer: Attrition Risk Stratification
| Algorithm 1: Generating a Key Prioritization Input: Attrition Risk Score |
| Input: Dataset D = {(x_e, y_e)} for e = 1 to n. Output: Vector of risk scores R = {r_e} for e ∈ E. 1: procedure GenerateRiskScores(D) 2: M_risk ← TrainClassifier(D) // e.g., RandomForest 3: for each employee e in D do 4: r_e ← M_risk.predict_proba(x_e) 5: end for 6: return R 7: end procedure |
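A minimal Python sketch of Algorithm 1, assuming scikit-learn's `RandomForestClassifier` as the classifier; the toy features and labels are invented and do not reproduce the paper's datasets.

```python
# Sketch of Algorithm 1: train M_risk, then score every employee with
# r_e = P(attrition = 1 | x_e) via predict_proba.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def generate_risk_scores(X, y, n_estimators=100, random_state=0):
    """Train the risk model and return attrition probabilities for all rows."""
    model = RandomForestClassifier(n_estimators=n_estimators,
                                   random_state=random_state)
    model.fit(X, y)
    return model.predict_proba(X)[:, 1]  # column 1 = positive (attrition) class

# Tiny synthetic example: 8 employees, 2 illustrative features.
rng = np.random.default_rng(42)
X = rng.random((8, 2))
y = np.array([0, 1, 0, 1, 0, 0, 1, 0])  # historical attrition outcomes
risk = generate_risk_scores(X, y)        # one probability per employee
```

In practice the scores would be produced on held-out data; here the model scores its own training rows purely to illustrate the interface.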
4.1.2. Causal Layer: Intervention Effect Estimation
| Algorithm 2: ATE Estimation with PSM |
| Input: Dataset D = {(X_e, y_e, T_e,k)} for e = 1 to n. Output: ATE_k, the effect estimate for intervention k. 1: procedure GenerateATEviaPSM(D, k) 2: // Stage 1: estimate propensity scores using a regularized model 3: M_propensity ← TrainClassifier(X, T_k) // e.g., L1-regularized logistic regression 4: for each employee e in D do 5: p_e,k ← M_propensity.predict_proba(X_e) // P(T_k = 1 | X_e) 6: end for 7: // Stage 2: matching and effect estimation 8: MatchedPairs ← FindNearestNeighborMatches({p_e,k : T_e,k = 1}, {p_e,k : T_e,k = 0}) 9: TreatedOutcome ← Average(y_e for treated employees in MatchedPairs) 10: ControlOutcome ← Average(y_e for control employees in MatchedPairs) 11: ATE_k ← TreatedOutcome − ControlOutcome 12: return ATE_k 13: end procedure |
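A compact sketch of Algorithm 2, with an L1-regularized logistic regression for the propensity model and a simple 1-nearest-neighbour match on the propensity scores. The data are synthetic and the matching is a deliberate simplification of a full PSM workflow (no calipers, matching with replacement).

```python
# Sketch of Algorithm 2: two-stage ATE estimation via propensity score matching.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ate_via_psm(X, y, t):
    """Stage 1: L1-regularised propensity model. Stage 2: 1-NN matching."""
    prop = LogisticRegression(penalty="l1", solver="liblinear").fit(X, t)
    p = prop.predict_proba(X)[:, 1]  # p_e = P(T = 1 | X_e)
    treated = np.where(t == 1)[0]
    control = np.where(t == 0)[0]
    # Match each treated employee to the control with the closest score.
    matches = [control[np.argmin(np.abs(p[control] - p[i]))] for i in treated]
    return y[treated].mean() - y[matches].mean()  # ATE estimate

# Synthetic data with a built-in attrition-reducing treatment effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
t = rng.integers(0, 2, size=60)                      # treatment indicator
y = (rng.random(60) < 0.5 - 0.2 * t).astype(int)      # attrition outcome
ate = ate_via_psm(X, y, t)  # negative values indicate reduced attrition
```

A negative estimate corresponds to the sign convention used throughout the paper (a more negative ATE means a larger attrition reduction).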
4.1.3. Optimization Layer: Dialectical Resource Allocation
| Algorithm 3: Intervention Allocation Heuristic |
| Input: Risk scores R, Employee Importance Scores W, ATEs, Costs C, Budget B_total. Output: Allocation plan Π = {(e, k)}. 1: procedure AllocateInterventions(R, W, ATEs, C, B_total) 2: E_priority ← IdentifyHighPriorityEmployees(R) 3: E_sorted ← sort E_priority by priority_score_e in descending order 4: B_used ← 0, Π ← empty map 5: for each employee e ∈ E_sorted do 6: if B_used ≥ B_total then break end if 7: k_best ← argmax_{k ∈ K} |ATE_k| / c_k such that ATE_k < 0 8: if k_best exists and (B_used + c_{k_best}) ≤ B_total then 9: Π[e] ← k_best 10: B_used ← B_used + c_{k_best} 11: end if 12: end for 13: return Π 14: end procedure |
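The allocation heuristic can be sketched in plain Python. The employee IDs, importance scores, ATEs, costs, and the 0.5 risk threshold below are illustrative assumptions, not values from the paper; the selection rule mirrors the ROI criterion |ATE_k| / c_k in line 7 of Algorithm 3.

```python
# Sketch of Algorithm 3: greedy, budget-constrained intervention allocation.
def allocate_interventions(risk, importance, ates, costs, budget,
                           risk_threshold=0.5):
    """Assign the highest-ROI affordable intervention to priority employees."""
    # Priority score = attrition risk weighted by the Employee Importance Score.
    priority = {e: risk[e] * importance[e]
                for e in risk if risk[e] >= risk_threshold}
    plan, used = {}, 0.0
    for e in sorted(priority, key=priority.get, reverse=True):
        if used >= budget:
            break
        # Synthesis step: among attrition-reducing (ATE < 0), affordable
        # interventions, pick the one with the best ROI = |ATE| / cost.
        affordable = [k for k in ates
                      if ates[k] < 0 and used + costs[k] <= budget]
        if affordable:
            k_best = max(affordable, key=lambda k: abs(ates[k]) / costs[k])
            plan[e] = k_best
            used += costs[k_best]
    return plan, used

# Hypothetical inputs for four employees and two interventions.
risk = {"e1": 0.9, "e2": 0.8, "e3": 0.2, "e4": 0.7}
importance = {"e1": 1.0, "e2": 0.5, "e3": 1.0, "e4": 0.9}
ates = {"promotion": -0.24, "bonus": -0.06}
costs = {"promotion": 10000.0, "bonus": 2000.0}
plan, used = allocate_interventions(risk, importance, ates, costs,
                                    budget=15000.0)
```

With these numbers the bonus wins on ROI (0.06/2000 > 0.24/10000), so the three employees above the risk threshold each receive the bonus and the low-risk employee `e3` is skipped.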
4.1.4. Explainability Layer: CFCA
- The risk model (Irisk), to explain the drivers of attrition risk.
- The propensity score model (Ipropensity), to explain the drivers of receiving an intervention.
4.2. Experimental Design and Data
4.2.1. A Two-Stage Validation Approach
- Stage 1 (Methodological Illustration): To provide a clear, step-by-step demonstration of the CDO framework’s components, we first utilize the well-known, publicly available HR Analytics dataset. As a standard synthetic benchmark for attrition modeling, it provides a controlled environment to explain the functionality of the predictive, causal, and optimization layers. The full results of this illustrative analysis are presented in Section 5.
- Stage 2 (Empirical Validation): To test the framework’s utility in a more realistic and challenging context, we then conduct our main proof-of-concept on the Saudi Employee Attrition Dataset. This dataset, based on an empirical employee survey, provides the foundation for our main results, which are presented in Section 6.
4.2.2. Data Pre-Processing Pipeline
- Data Cleaning and Feature Removal: Unique identifier columns (e.g., EmployeeNumber) and columns with zero variance (e.g., EmployeeCount, Over18) that provide no predictive information were removed to create a clean and relevant feature set.
- Target Variable Encoding: The categorical Attrition column (with values ‘Yes’ and ‘No’) was converted into a binary integer format, where 1 represents attrition and 0 represents retention. This column serves as the outcome variable in all subsequent mathematical formulations.
- Missing Value Imputation: To preserve statistical power, missing values in numerical columns were imputed using the median, which is robust to outliers. Missing values in categorical columns were imputed using the mode, which is the most frequently occurring value.
- Feature Engineering and Encoding: For the empirical validation stage, an Employee Importance Score was engineered from salary and job title data to serve as a proxy for an employee’s strategic value. All remaining non-numerical features (e.g., Gender, MaritalStatus) were converted into a numerical format suitable for model training using appropriate mapping or encoding techniques.
- Feature Scaling: For distance-based algorithms like k-Nearest Neighbors (k-NN), which are sensitive to the scale of input data, the final numerical feature set was scaled using a StandardScaler. This technique transforms each feature to have a mean of 0 and a standard deviation of 1.
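The five steps above can be sketched as a single pandas routine. The toy frame uses column names mentioned in the text with invented values, and the scaling is a hand-rolled equivalent of a StandardScaler fit-transform; this is an illustration of the pipeline's logic, not the authors' code.

```python
# Sketch of the pre-processing pipeline: clean, encode, impute, map, scale.
import pandas as pd

def preprocess(df):
    df = df.drop(columns=["EmployeeNumber"])                    # identifier
    df = df.loc[:, df.nunique(dropna=False) > 1]                # zero-variance cols
    df["Attrition"] = df["Attrition"].map({"Yes": 1, "No": 0})  # target encoding
    for col in df.columns:
        if df[col].dtype == object:
            df[col] = df[col].fillna(df[col].mode()[0])         # mode for categoricals
        else:
            df[col] = df[col].fillna(df[col].median())          # median for numericals
    df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})   # simple mapping
    num = df.drop(columns=["Attrition"])
    df[num.columns] = (num - num.mean()) / num.std(ddof=0)      # standardize features
    return df

raw = pd.DataFrame({
    "EmployeeNumber": [1, 2, 3, 4],
    "EmployeeCount": [1, 1, 1, 1],          # zero variance -> dropped
    "Attrition": ["Yes", "No", "No", "Yes"],
    "MonthlyIncome": [3000.0, None, 5000.0, 7000.0],
    "Gender": ["Male", "Female", None, "Male"],
})
clean = preprocess(raw)
```

After the run, the identifier and constant columns are gone, the target is binary, the missing income is replaced by the median, and each feature column has mean 0 and unit standard deviation.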
4.2.3. Intervention Set Definition and Optimization Parameters
- For the illustrative stage (HR dataset), interventions for ‘Bonus’, ‘Promotion’, and ‘Training’ were synthetically generated via random assignment.
- For the empirical validation stage (Saudi dataset), interventions were defined using real survey data as proxies: a Promotion Intervention (from the Get_Deserved_Promotion column) and a Compensation Intervention (from the Bonus column). The Training intervention was excluded from this stage due to insufficient data for a robust causal analysis.
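The synthetic random assignment used in the illustrative stage can be sketched as below. The row count assumes the 1,470-record public HR Analytics benchmark; the seed and the independent Bernoulli(0.5) assignment are illustrative assumptions.

```python
# Sketch of the illustrative-stage intervention generation: each of the three
# intervention flags is drawn independently at random per employee.
import numpy as np

rng = np.random.default_rng(7)
n = 1470  # assumed size of the public HR Analytics benchmark
interventions = {k: rng.integers(0, 2, size=n)
                 for k in ("Bonus", "Promotion", "Training")}
```

Random assignment makes the synthetic treatments unconfounded by construction, which is exactly why this stage can only illustrate the framework's mechanics rather than validate its causal estimates.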
5. Methodological Illustration on a Synthetic Dataset
5.1. Predictive Analysis: Attrition Risk Stratification
5.2. Illustrative Causal Effect Estimation
5.3. Illustrating the Optimization Layer: Allocation Results
5.4. Illustrating the Explainability Layer: CFCA Results
6. Empirical Proof-of-Concept on the Saudi Workforce Dataset
6.1. Predictive Analysis and Risk Stratification
6.2. Empirical Causal Effect Estimation
6.3. Optimization and Allocation Results
6.4. Integrated Analysis Dashboard
6.5. Framework Validation Analyses
6.5.1. Fairness Evaluation
6.5.2. Ablation Study: Validating the Framework Components
7. Discussion and Conclusions
7.1. Interpretation of Principal Findings
7.2. Managerial and Policy Implications
- Shift to Personalized, Causal-Driven Interventions: The framework demonstrates the feasibility of moving from broad, one-size-fits-all retention policies to a targeted strategy where interventions are personalized based on robust causal evidence and individual employee value.
- The Imperative of Causal Validation: The conflicting results for the ‘Promotion’ intervention between our illustrative and empirical analyses serve as a crucial cautionary tale. It highlights the danger of implementing policies based on intuition or generic benchmark studies and underscores the need for rigorous causal validation using an organization’s own data.
- Proactive Fairness and Algorithmic Governance: As established in the literature, there is an urgent need for more comprehensive ethical frameworks to govern AI in HR [14]. Our empirical analysis speaks directly to this challenge. The significant gender bias revealed by our Fairness Evaluation (Table 6) is not a failure of the framework, but rather a demonstration of its essential function as the type of diagnostic and governance tool called for by current research. It proves that a purely ROI-driven optimization can perpetuate biases, confirming the risks highlighted by [13]. This finding underscores that proactive fairness monitoring, as enabled by the CDO framework, is critical for the responsible deployment of AI.
7.3. Limitations and Avenues for Future Research
7.4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| ATE | Average Treatment Effect |
| AUC | Area Under the Curve |
| CATE | Conditional Average Treatment Effect |
| CDO | Counterfactual–Dialectical Optimization |
| CFCA | Consistent Feature Contribution Analysis |
| DML | Double Machine Learning |
| GBM | Gradient Boosting Machines |
| HR | Human Resources |
| k-NN | k-Nearest Neighbors |
| MCKP | Multiple-Choice Knapsack Problem |
| PSM | Propensity Score Matching |
| ROI | Return on Investment |
| SHAP | SHapley Additive exPlanations |
| VIF | Variance Inflation Factor |
References
- Urme, U.N. The Impact of Talent Management Strategies on Employee Retention. Int. J. Sci. Bus. 2023, 28, 127–146. [Google Scholar] [CrossRef]
- Al-Suraihi, W.A.; Samikon, S.A.; Al-Suraihi, A.A.; Ibrahim, I. Employee Turnover: Causes, Importance and Retention Strategies. Eur. J. Bus. Manag. Res. 2021, 6, 10. [Google Scholar] [CrossRef]
- Yashu; Sharma, R.; Jain, A.; Manwal, M. Enhancing Human Resource Management through Deep Learning: A Predictive Analytics Approach to Employee Retention Success. In Proceedings of the 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS), Bangalore, India, 28–29 June 2024; pp. 1–4. [Google Scholar]
- Di Prima, C.; Cepel, M.; Kotaskova, A.; Ferraris, A. Help me help you: How HR analytics forecasts foster organizational creativity. Technol. Forecast. Soc. Change 2024, 206, 123540. [Google Scholar] [CrossRef]
- Jain, P.K.; Jain, M.; Pamula, R. Explaining and predicting employees’ attrition: A machine learning approach. SN Appl. Sci. 2020, 2, 757. [Google Scholar] [CrossRef]
- Pan, Y.; Zhan, P. The Impact of Sample Attrition on Longitudinal Learning Diagnosis: A Prolog. Front. Psychol. 2020, 11, 1051. [Google Scholar] [CrossRef]
- Weiss, M.; Zacher, H. Still Waters Run Deep: How Employee Silence Affects Instigated Workplace Incivility over Time. J. Bus. Ethics 2025, 20, 587–604. [Google Scholar] [CrossRef]
- Veloso, E.F.R.; Da Silva, R.C.; Dutra, J.S.; Fischer, A.L.; Trevisan, L.N. Talent Retention Strategies in Different Organizational Contexts and Intention of Talents to Remain in the Company. RISUS—Rev. Inovação Sustentabilidade 2014, 5, 49. [Google Scholar] [CrossRef]
- Salas-Vallina, A.; Alegre, J.; López-Cabrales, Á. The challenge of increasing employees’ well-being and performance: How human resource management practices and engaging leadership work together toward reaching this goal. Hum. Resour. Manag. 2021, 60, 333–347. [Google Scholar] [CrossRef]
- Geerts, J.M. Maximizing the Impact and ROI of Leadership Development: A Theory- and Evidence-Informed Framework. Behav. Sci. 2024, 14, 955. [Google Scholar] [CrossRef]
- Hubbart, J.A. Organizational change: The challenge of change aversion. Adm. Sci. 2023, 13, 162. [Google Scholar] [CrossRef]
- D’amicantonio, S.; Kulangara, M.K.; Darshan Mehta, H.; Pal, S.; Levantesi, M.; Polignano, M.; Purificato, E.; De Luca, E.W. A Comprehensive Strategy to Bias and Mitigation in Human Resource Decision Systems. In Proceedings of the 5th Italian Workshop on Explainable Artificial Intelligence, Bolzano, Italy, 26–27 November 2024; pp. 11–27. [Google Scholar]
- Naoum, R. A Framework for Integrating AI-Powered Systems to Mitigate Bias Risk in HRMFunctions. Mark. Menedzsment 2025, 59, 52–61. [Google Scholar] [CrossRef]
- Bar-Gil, O.; Ron, T.; Czerniak, O. AI for the people? Embedding AI ethics in HR and people analytics projects. Technol. Soc. 2024, 77, 102527. [Google Scholar] [CrossRef]
- Ali, A.; Jayaraman, R.; Azar, E.; Maalouf, M. A comparative analysis of machine learning and statistical methods for evaluating building performance: A systematic review and future benchmarking framework. Build. Environ. 2024, 252, 111268. [Google Scholar] [CrossRef]
- Quinteros, D.M. Predictive Modelling of Employee Attrition Using Deep Learning. Acadlore Trans. AI Mach. Learn. 2023, 2, 212–225. [Google Scholar] [CrossRef]
- Nandal, M.; Grover, V.; Sahu, D.; Dogra, M. Employee Attrition: Analysis of Data Driven Models. EAI Endorsed Trans. Internet Things 2024, 10, 1–10. [Google Scholar] [CrossRef]
- Chung, D.; Yun, J.; Lee, J.; Jeon, Y. Predictive Model of Employee Attrition Based on Stacking Ensemble Learning. SSRN Electron. J. 2022. [Google Scholar] [CrossRef]
- Athey, S.; Imbens, G.W. The State of Applied Econometrics: Causality and Policy Evaluation. J. Econ. Perspect. 2017, 31, 3–32. [Google Scholar] [CrossRef]
- Moraes, F.; Manuel Proença, H.; Kornilova, A.; Albert, J.; Goldenberg, D. Uplift Modeling: From Causal Inference to Personalization. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 5212–5215. [Google Scholar] [CrossRef]
- De Caigny, A.; Coussement, K.; Verbeke, W.; Idbenjra, K.; Phan, M. Uplift modeling and its implications for B2B customer churn prediction: A segmentation-based modeling approach. Ind. Mark. Manag. 2021, 99, 28–39. [Google Scholar] [CrossRef]
- Singh, S.S.K.; Kumar Sinha, A.; Pandey, T.N.; Acharya, B.M. A Machine Learning Approach to Compare Causal Inference Modelling Strategies in the Digital Advertising Industry. In Proceedings of the 2023 2nd International Conference on Ambient Intelligence in Health Care (ICAIHC), Bhubaneswar, India, 17–18 November 2023; pp. 1–7. [Google Scholar] [CrossRef]
- Wager, S.; Athey, S. Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. J. Am. Stat. Assoc. 2018, 113, 1228–1242. [Google Scholar] [CrossRef]
- Künzel, S.R.; Sekhon, J.S.; Bickel, P.J.; Yu, B. Metalearners for estimating heterogeneous treatment effects using machine learning. Proc. Natl. Acad. Sci. USA 2019, 116, 4156–4165. [Google Scholar] [CrossRef]
- Chernozhukov, V.; Chetverikov, D.; Demirer, M.; Duflo, E.; Hansen, C.; Newey, W.; Robins, J. Double/debiased machine learning for treatment and structural parameters. Econom. J. 2018, 21, C1–C68. [Google Scholar] [CrossRef]
- Bibi, N.; Ahsan, A.; Anwar, Z. Project resource allocation optimization using search based software engineering—A framework. In Proceedings of the Ninth International Conference on Digital Information Management (ICDIM 2014), Phitsanulok, Thailand, 29 September–1 October 2014; pp. 226–229. [Google Scholar] [CrossRef]
- Yoshimura, M.; Fujimi, Y.; Izui, K.; Nishiwaki, S. Decision-making support system for human resource allocation in product development projects. Int. J. Prod. Res. 2006, 44, 831–848. [Google Scholar] [CrossRef]
- Certa, A.; Enea, M.; Galante, G.; Manuela La Fata, C. Multi-objective human resources allocation in R&D projects planning. Int. J. Prod. Res. 2009, 47, 3503–3523. [Google Scholar] [CrossRef]
- Hasan, R.; Dattana, V.; Mahmood, S. Dialectical search: A cognitively inspired framework for balancing solution quality and computational cost in global optimization. J. Umm Al-Qura Univ. Eng. Archit. 2025, 1–15. [Google Scholar] [CrossRef]
- Alsheref, F.K.; Fattoh, I.E.; Ead, W.M. Automated Prediction of Employee Attrition Using Ensemble Model Based on Machine Learning Algorithms. Comput. Intell. Neurosci. 2022, 2022, 7728668. [Google Scholar] [CrossRef] [PubMed]
- Jung, Y.; Tian, J.; Bareinboim, E. Estimating Identifiable Causal Effects through Double Machine Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 19–21 May 2021; Volume 35, pp. 12113–12122. [Google Scholar] [CrossRef]
| Step | Phase | Function | Explanation |
|---|---|---|---|
| 1 | Attrition Risk Stratification | M_risk(x_e) → r_e | Train a predictive model such as Random Forest to predict the probability of attrition (r_e) for each employee (e). |
| 2 | Causal Effect Estimation | PSM(Y, T, X) → ATE_k | Use PSM to estimate the ATE for each potential intervention (k). |
| 3 | Dialectical Optimization | Optimize(r_e, ATE_k, w_e, B) → π(e) | Apply a dialectical search heuristic to assign the best affordable interventions (π) to high-priority employees under a budget (B), weighting by the Employee Importance Score (w_e). |
| 4 | Consistent Feature Analysis | CFCA(φ_risk, φ_causal) → Φ | Reconcile SHapley Additive exPlanations (SHAP) values (φ) from the risk model and the propensity score model to identify a set of consistently important features (Φ). |
| Intervention | Mean Effect (ATE) | Lower CI | Upper CI | Significance |
|---|---|---|---|---|
| Bonus | −0.012666 | −0.053916 | 0.028585 | Not Significant |
| Promotion | 0.108416 | 0.048126 | 0.168705 | Significant |
| Training | −0.040247 | −0.082938 | 0.002443 | Not Significant |
| Model | AUC Score |
|---|---|
| k-NN | 0.6008 |
| Random Forest (Primary) | 0.5653 |
| Intervention | ATE |
|---|---|
| Promotion | −0.2393 |
| Compensation (Bonus) | −0.0555 |
| Metric | Value |
|---|---|
| Total Budget Allocated | $100,000.00 |
| Total Employees Targeted | 10 |
| Primary Intervention Assigned | Promotion |
| Gender | Population Share (%) | Budget Share (%) |
|---|---|---|
| Female | 56.93 | 20.00 |
| Male | 43.07 | 80.00 |
| Framework Version | Allocation Strategy | Budget Used | Employees Treated | Total Attrition Reduction | Risk-Weighted Attrition Reduction |
|---|---|---|---|---|---|
| Scenario A | Risk-Prioritised + Cheapest | $97,500 | 13 | 0.7211 | 0.7108 |
| Full CDO Framework | Risk-Prioritised + Causal ROI | $100,000 | 10 | 2.3929 | 2.3402 |
| Scenario B | ROI-Only (No Risk Priority) | $100,000 | 10 | 2.3929 | 1.7270 |
| Scenario C | Risk-Prioritised + Raw Uplift | $100,000 | 10 | 2.3929 | 2.3402 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alyousef, M.I.; Sattar, M.U.; Hasan, R.; Usman, S.; Hassan, A. The Counterfactual–Dialectical Optimization Framework: A Prescriptive Approach to Employee Attrition Management with Empirical Validation. Information 2025, 16, 1053. https://doi.org/10.3390/info16121053