Analyzing Strategic Parental Leave Decisions Using Two-Player Multi-Agent Reinforcement Learning
Abstract
1. Introduction
- In a competitive environment, does the presence of an additional career penalty mechanism discourage employees from taking parental leave?
- How does the magnitude of the career penalty affect the employees’ optimal strategies?
- Is an employee’s leave decision influenced by the other employee’s decision? For example, if one employee does not take parental leave, is the other more likely to forgo leave to maintain a competitive advantage?
- It introduces a game-theoretic framework that models the strategic decision-making process of two employees regarding parental leave, accounting for both individual career goals and inter-agent competition;
- It identifies optimal strategies for employees under varying conditions of income replacement and career penalties using a stochastic game (SG) model and the Nash Q-learning algorithm;
- It explores how changes in income replacement and career penalties affect parental leave decisions, with implications for organizational policy design.
2. Literature Review
3. Mathematical Model
3.1. Preliminary: Stochastic Game (SG) Model
3.2. Problem Description
3.3. Formulation
- State. The state in a given year consists of each employee’s individual state, which has four components: the employee’s current job position; the number of years spent in the current position; a binary indicator of parental leave eligibility, where 1 means that the employee has not yet used parental leave and remains eligible, and 0 means that the employee has already used parental leave and is no longer eligible; and the child’s age. Once the child reaches the maximum eligible age, the employee is no longer eligible to take parental leave, even if it has not been used previously.
- Joint action. The joint action is the pair of both employees’ choices. An employee who is eligible for parental leave chooses either to continue working (W) or to take one year of parental leave (L). If an employee has already used parental leave or the child has reached the maximum eligible age, the employee must work (W).
- Individual reward function. The reward received by an employee in each year depends on the employee’s current position and the chosen action. The reward function involves two leave-related terms: the government subsidy provided to an employee who takes parental leave, which is based on the annual salary of the current position, and the employee’s perceived utility from taking one year of parental leave. Although the perceived utility captures non-monetary benefits of taking parental leave, we assume that it is defined on a cardinal scale commensurate with income, which allows it to be aggregated algebraically with monetary amounts. Accordingly, this value should be interpreted in monetary-equivalent terms rather than as a direct psychological measure.
- Transition probability. If the employee takes parental leave (L), the next individual state is determined with probability one: the job position remains unchanged; the number of years in the current position increases by 1, up to its maximum; the eligibility indicator becomes 0; and the child’s age increases by 1, up to its maximum. Otherwise, if the employee works (W), the employee may be promoted to the next position with the promotion probability, provided that the years of service in the current position meet the promotion eligibility condition. If the employee is not promoted or is outside the promotion-eligible years, the state transition follows the same logic as in the parental leave case, except that the eligibility indicator remains unchanged.
- Discount factor. In this model, the discount factor reflects an annual interest rate.
- Time horizon T. T denotes the maximum number of service years of an employee.
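The formulation above can be sketched in Python for a single employee. This is a minimal illustration, not the authors' implementation. The salary ladder, minimum service years, and base promotion probabilities come from the position table in this article. Everything else is an assumption made for the example: the zero-based position index, the counter cap `MAX_YEARS_TRACKED`, the age limit `MAX_CHILD_AGE`, the proportional subsidy `SUBSIDY_RATE`, a working reward equal to the salary, the reset of the years counter on promotion, and the link between the discount factor and the interest rate. The career penalty and any coupling between the two employees are deliberately left out.

```python
import random
from dataclasses import dataclass

# From the position table in this article (salary in KRW millions).
SALARY = [30, 36, 47, 53, 66]
MIN_YEARS = [3, 4, 4, 5, 5]
PROMO_PROB = [1.0, 0.65, 0.45, 0.35, 0.0]  # Director cannot be promoted further

# Hypothetical constants, NOT taken from the article.
MAX_YEARS_TRACKED = 10  # cap on the years-in-position counter
MAX_CHILD_AGE = 8       # child age at which leave eligibility expires
SUBSIDY_RATE = 0.8      # subsidy as a share of the annual salary


@dataclass(frozen=True)
class EmployeeState:
    position: int   # index into SALARY (zero-based by assumption)
    years: int      # years spent in the current position
    eligible: int   # 1 = leave not yet used, 0 = leave already used
    child_age: int


def feasible_actions(s):
    """Leave (L) is available only while eligible and the child is young enough."""
    if s.eligible == 1 and s.child_age < MAX_CHILD_AGE:
        return ["W", "L"]
    return ["W"]


def reward(s, action, utility):
    """Assumed form: salary when working, subsidy plus perceived utility on leave."""
    if action == "L":
        return SUBSIDY_RATE * SALARY[s.position] + utility
    return SALARY[s.position]


def step(s, action, rng):
    """Sample the next individual state for one year."""
    years = min(s.years + 1, MAX_YEARS_TRACKED)
    child_age = min(s.child_age + 1, MAX_CHILD_AGE)
    if action == "L":
        # Deterministic: same position, eligibility is used up.
        return EmployeeState(s.position, years, 0, child_age)
    can_promote = (s.position < len(SALARY) - 1
                   and s.years >= MIN_YEARS[s.position])  # assumed eligibility test
    if can_promote and rng.random() < PROMO_PROB[s.position]:
        # Resetting the years counter on promotion is an assumption.
        return EmployeeState(s.position + 1, 0, s.eligible, child_age)
    return EmployeeState(s.position, years, s.eligible, child_age)


def discounted_return(rewards, interest_rate):
    """Assumes the discount factor equals 1 / (1 + annual interest rate)."""
    gamma = 1.0 / (1.0 + interest_rate)
    return sum(gamma ** t * r for t, r in enumerate(rewards))
```

Calling `step(EmployeeState(0, 0, 1, 0), "L", random.Random(0))` returns a state with the same position, one more year of service, eligibility set to 0, and a child one year older, which mirrors the deterministic leave transition described in the list.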
4. Algorithm
4.1. Preliminary: Nash Q-Learning
4.2. Accelerating Convergence via Optimistic Initialization and Backward Iteration
Algorithm 1. Nash Q-Learning with optimistic initialization and backward iteration.
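One plausible reading of optimistic initialization in a finite-horizon setting is sketched below. Every Q-value for year t starts at an upper bound on the discounted reward that can still be collected from year t onward, and that bound is computed by a backward sweep from the final year. The function names, the `(year, state)` table keys, and the use of a single `max_reward` bound are assumptions made for this example, and the backward iteration used in Algorithm 1 may differ in its details.

```python
def optimistic_bounds(horizon, max_reward, gamma):
    """Upper bound on the discounted return obtainable from each year t."""
    bounds = [0.0] * (horizon + 1)      # nothing can be earned after the horizon
    for t in reversed(range(horizon)):  # backward sweep: horizon-1, ..., 0
        bounds[t] = max_reward + gamma * bounds[t + 1]
    return bounds


def init_q_tables(states_by_year, n_actions, bounds):
    """Create one optimistic Q-table per agent, keyed by (year, state).

    Each entry is an n_actions x n_actions matrix indexed by the joint action.
    """
    q1, q2 = {}, {}
    for t, states in enumerate(states_by_year):
        for s in states:
            q1[(t, s)] = [[bounds[t]] * n_actions for _ in range(n_actions)]
            q2[(t, s)] = [[bounds[t]] * n_actions for _ in range(n_actions)]
    return q1, q2


# Toy usage with hypothetical numbers: three years, rewards of at most 10.
bounds = optimistic_bounds(horizon=3, max_reward=10.0, gamma=0.5)
q1, q2 = init_q_tables([["a"], ["a", "b"], ["b"]], n_actions=2, bounds=bounds)
```

Because no learned value can exceed its initial bound, untried joint actions keep looking attractive until they are actually sampled, which is the usual motivation for optimistic starts.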
5. Numerical Experiment
5.1. Experimental Settings
5.2. Results and Discussion
5.2.1. Results Under Equal Utility Values
5.2.2. Results Under Different Utility Values
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Single-Agent MDP Model

Appendix A.2. Full Results of Experiments
| Parameters (rows) / Perceived utility level (columns) | 0 M | 33 M | 41.5 M | 50 M | 100 M | 200 M |
|---|---|---|---|---|---|---|
| | 0.00 ± 0.00 / 0.00 ± 0.00 | 78.34 ± 0.79 / 78.12 ± 0.71 | 78.76 ± 0.79 / 78.52 ± 0.71 | 84.08 ± 1.08 / 84.15 ± 1.02 | 93.22 ± 0.71 / 93.25 ± 0.66 | 98.61 ± 0.41 / 98.62 ± 0.38 |
| | 0.00 ± 0.00 / 0.01 ± 0.02 | 45.93 ± 1.19 / 45.41 ± 1.07 | 57.30 ± 1.22 / 57.00 ± 1.07 | 78.85 ± 1.68 / 79.19 ± 1.57 | 93.15 ± 0.73 / 93.25 ± 0.66 | 98.34 ± 0.51 / 98.36 ± 0.46 |
| | 0.00 ± 0.01 / 0.01 ± 0.02 | 31.79 ± 1.31 / 31.10 ± 1.23 | 40.74 ± 1.40 / 40.55 ± 1.36 | 60.94 ± 1.61 / 61.40 ± 1.61 | 91.64 ± 1.17 / 91.85 ± 1.05 | 98.37 ± 0.49 / 98.44 ± 0.44 |
| | 0.00 ± 0.01 / 0.01 ± 0.02 | 76.31 ± 0.97 / 76.00 ± 0.88 | 78.56 ± 0.80 / 78.34 ± 0.71 | 84.08 ± 1.08 / 84.15 ± 1.02 | 93.22 ± 0.71 / 93.25 ± 0.66 | 98.54 ± 0.43 / 98.55 ± 0.40 |
| | 0.02 ± 0.02 / 0.02 ± 0.02 | 75.09 ± 1.08 / 74.89 ± 0.99 | 76.58 ± 1.02 / 76.37 ± 0.91 | 83.54 ± 1.18 / 83.70 ± 1.09 | 93.22 ± 0.71 / 93.25 ± 0.66 | 99.15 ± 0.24 / 99.13 ± 0.24 |
| | 0.50 ± 0.16 / 0.58 ± 0.16 | 44.34 ± 1.31 / 43.97 ± 1.23 | 56.12 ± 1.30 / 55.85 ± 1.14 | 78.24 ± 1.73 / 78.58 ± 1.58 | 92.95 ± 0.80 / 93.07 ± 0.71 | 98.84 ± 0.36 / 98.88 ± 0.33 |
| | 0.50 ± 0.16 / 0.56 ± 0.16 | 45.08 ± 1.28 / 45.04 ± 1.23 | 55.60 ± 1.31 / 55.35 ± 1.15 | 77.71 ± 1.78 / 78.00 ± 1.63 | 92.91 ± 0.80 / 93.03 ± 0.71 | 98.14 ± 0.56 / 98.17 ± 0.51 |
| Parameters (rows) / Perceived utility combination (columns) | 0 M / 33 M | 0 M / 50 M | 0 M / 100 M | 33 M / 50 M | 33 M / 100 M | 100 M / 50 M |
|---|---|---|---|---|---|---|
| | 0.26 ± 0.07 / 78.00 ± 0.69 | 0.17 ± 0.06 / 84.15 ± 1.02 | 0.06 ± 0.04 / 93.55 ± 0.59 | 78.21 ± 0.78 / 84.15 ± 1.02 | 78.36 ± 0.80 / 93.45 ± 0.62 | 93.22 ± 0.71 / 84.12 ± 0.88 |
| | 0.47 ± 0.11 / 60.78 ± 1.10 | 0.01 ± 0.01 / 84.15 ± 1.02 | 0.01 ± 0.01 / 93.65 ± 0.58 | 78.31 ± 0.80 / 84.15 ± 1.02 | 78.20 ± 0.78 / 93.25 ± 0.66 | 93.22 ± 0.71 / 84.14 ± 1.00 |
| | 0.01 ± 0.01 / 53.09 ± 1.30 | 0.06 ± 0.03 / 83.20 ± 1.14 | 0.14 ± 0.05 / 93.37 ± 0.63 | 77.27 ± 0.89 / 83.42 ± 1.10 | 78.18 ± 0.77 / 93.31 ± 0.65 | 93.27 ± 0.70 / 84.15 ± 0.99 |
| | 0.06 ± 0.04 / 43.99 ± 1.22 | 0.03 ± 0.02 / 79.42 ± 1.56 | 0.15 ± 0.06 / 93.08 ± 0.71 | 44.47 ± 1.28 / 79.51 ± 1.49 | 43.22 ± 1.18 / 93.29 ± 0.65 | 93.47 ± 0.65 / 78.83 ± 1.48 |
| | 0.06 ± 0.04 / 40.06 ± 1.22 | 0.40 ± 1.14 / 70.80 ± 1.65 | 0.63 ± 0.20 / 92.89 ± 0.77 | 43.61 ± 1.25 / 70.66 ± 1.70 | 44.12 ± 1.14 / 93.25 ± 0.67 | 92.51 ± 0.92 / 79.83 ± 1.49 |
| | 0.72 ± 0.16 / 35.32 ± 1.20 | 0.12 ± 0.05 / 67.39 ± 1.76 | 0.18 ± 0.06 / 92.18 ± 0.95 | 45.27 ± 1.28 / 68.71 ± 1.67 | 43.29 ± 1.17 / 92.44 ± 0.89 | 92.40 ± 0.94 / 79.16 ± 1.52 |
Appendix A.3. Pseudo-Code of Nash Q-Learning
Algorithm A1. Nash Q-Learning.
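As a rough illustration of the update at the core of Nash Q-learning, the sketch below replaces the usual maximum over next-state actions with each agent's payoff at a Nash equilibrium of the stage game formed by the two agents' Q-values in the next state. It handles stage games with at most two actions per agent (such as W and L), picks the first pure equilibrium it finds, and falls back to the fully mixed equilibrium of a 2x2 game when no pure one exists. This selection rule, the table layout, and all function names are assumptions made for the example rather than details taken from the article.

```python
def pure_nash(q1, q2):
    """Return all pure-strategy Nash equilibria (i, j) of the bimatrix game."""
    rows, cols = len(q1), len(q1[0])
    eqs = []
    for i in range(rows):
        for j in range(cols):
            row_best = all(q1[i][j] >= q1[k][j] for k in range(rows))
            col_best = all(q2[i][j] >= q2[i][l] for l in range(cols))
            if row_best and col_best:
                eqs.append((i, j))
    return eqs


def nash_values(q1, q2):
    """Payoffs of both agents at one Nash equilibrium of the stage game."""
    eqs = pure_nash(q1, q2)
    if eqs:
        i, j = eqs[0]  # assumed selection rule: first pure equilibrium found
        return q1[i][j], q2[i][j]
    if len(q1) != 2 or len(q1[0]) != 2:
        raise ValueError("mixed equilibria are only handled for 2x2 games")
    # No pure equilibrium in a 2x2 game: each agent mixes so that the
    # other agent is indifferent between its two actions.
    p = (q2[1][1] - q2[1][0]) / (q2[0][0] - q2[1][0] - q2[0][1] + q2[1][1])
    q = (q1[1][1] - q1[0][1]) / (q1[0][0] - q1[0][1] - q1[1][0] + q1[1][1])
    row, col = [p, 1.0 - p], [q, 1.0 - q]
    v1 = sum(row[i] * col[j] * q1[i][j] for i in range(2) for j in range(2))
    v2 = sum(row[i] * col[j] * q2[i][j] for i in range(2) for j in range(2))
    return v1, v2


def nash_q_update(q1, q2, s, joint_action, rewards, s_next, alpha, gamma):
    """Apply one Nash Q-learning update to both agents' tables in place."""
    v1, v2 = nash_values(q1[s_next], q2[s_next])
    i, j = joint_action
    q1[s][i][j] = (1 - alpha) * q1[s][i][j] + alpha * (rewards[0] + gamma * v1)
    q2[s][i][j] = (1 - alpha) * q2[s][i][j] + alpha * (rewards[1] + gamma * v2)


# Toy usage: state "t" holds a prisoner's-dilemma-like stage game.
q1 = {"s": [[0.0, 0.0], [0.0, 0.0]], "t": [[3.0, 0.0], [5.0, 1.0]]}
q2 = {"s": [[0.0, 0.0], [0.0, 0.0]], "t": [[3.0, 5.0], [0.0, 1.0]]}
nash_q_update(q1, q2, "s", (0, 1), (2.0, 4.0), "t", alpha=0.5, gamma=0.9)
```

In the toy game the only pure equilibrium of state "t" is the joint action (1, 1) with payoffs (1, 1), so both updates bootstrap from a next-state value of 1 rather than from each agent's individual maximum.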
Appendix A.4. Ablation Results
| Setting | Convergence Speed |
|---|---|
| vanilla Nash Q-learning | 3668 |
| Nash Q-learning + backward iteration | 3452 |
| Nash Q-learning + optimistic init | 1116 |
| Nash Q-learning + optimistic init + backward iteration | 1013 |
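The ablation figures can be turned into relative speedups with a few lines of arithmetic. The sketch reads the "Convergence Speed" column as a count in which smaller means faster, which is an assumption because the unit is not stated in the table; the dictionary and variable names are illustrative.

```python
# Values copied from the ablation table above.
convergence = {
    "vanilla Nash Q-learning": 3668,
    "Nash Q-learning + backward iteration": 3452,
    "Nash Q-learning + optimistic init": 1116,
    "Nash Q-learning + optimistic init + backward iteration": 1013,
}

baseline = convergence["vanilla Nash Q-learning"]

# Speedup factor relative to the vanilla algorithm (assumes lower is faster).
speedup = {name: baseline / value for name, value in convergence.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

Under that reading, optimistic initialization accounts for most of the gain (roughly 3.3 times faster than the vanilla algorithm), while adding backward iteration on top brings the combined variant to roughly 3.6 times faster.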
Appendix B. Code Availability





| Position x | Staff | Assistant Manager | Manager | Senior Manager | Director |
|---|---|---|---|---|---|
| Annual salary (KRW) | 30 M | 36 M | 47 M | 53 M | 66 M |
| Minimum service years | 3 | 4 | 4 | 5 | 5 |
| Base promotion probability | 1.0 | 0.65 | 0.45 | 0.35 | – |
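To give a feel for the base promotion probabilities in the table, the sketch below computes how many promotion-eligible years an employee would need on average at each position. It assumes that every eligible year is an independent trial with the base probability, so the count follows a geometric distribution with mean 1/p. Any career penalty, the minimum service years that precede eligibility, and the finite time horizon are ignored, and the names `LADDER` and `expected_eligible_years` are hypothetical.

```python
# (title, annual salary in KRW millions, minimum service years, base promotion probability)
# Values copied from the table above; Director has no further promotion.
LADDER = [
    ("Staff", 30, 3, 1.0),
    ("Assistant Manager", 36, 4, 0.65),
    ("Manager", 47, 4, 0.45),
    ("Senior Manager", 53, 5, 0.35),
    ("Director", 66, 5, None),
]


def expected_eligible_years(p):
    """Mean of a geometric distribution: expected eligible years until promotion."""
    return 1.0 / p


expected = {
    title: expected_eligible_years(p)
    for title, _salary, _min_years, p in LADDER
    if p is not None
}

for title, years in expected.items():
    print(f"{title}: {years:.2f} eligible years on average")
```

A Staff member is promoted in the first eligible year with certainty, whereas a Senior Manager needs close to three eligible years on average under these assumptions.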
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhao, L.; Lee, H.-R. Analyzing Strategic Parental Leave Decisions Using Two-Player Multi-Agent Reinforcement Learning. Systems 2026, 14, 217. https://doi.org/10.3390/systems14020217

