Reinforcement Learning in a New Keynesian Model^{ †}

## Abstract

## 1. Introduction

## 2. The Standard Behavioural NK Model

#### 2.1. The Workhorse NK Model

#### 2.2. The Brock–Hommes Behavioural NK Model

## 3. The Non-Linear NK Model

#### 3.1. Households

#### 3.2. Firms, Government Expenditures and Monetary Policy

#### 3.3. Recovering the NK Workhorse Model

## 4. AU Learning and Market-Consistent Information

## 5. Heterogeneous Expectations across Agents

#### 5.1. Exogenous Proportions of RE and BR Agents

#### 5.2. Endogenous Proportions of RE and BR Agents with Reinforcement Learning

#### 5.3. The Possibility of Bifurcation and Chaotic Dynamics

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Appendix A. Stability Analysis

**Figure A1.** Comparison of stability properties of RE, EL and AU models in $({\rho}_{r},{\theta}_{\pi})$ space; ${\rho}_{r}>0$, ${\lambda}_{x}=1$; red: determinacy; black: indeterminacy; green: instability. (**a**) RE: ${\theta}_{y}=0.3$; (**b**) EL: ${\theta}_{y}=0.3$; (**c**) AU: ${\theta}_{y}=0.3$.

**Figure A2.** Comparison of stability properties of EL and AU models in $({\theta}_{y},{\theta}_{\pi})$ space; ${\rho}_{r}=1$, ${\lambda}_{x}=1$; red: determinacy; black: indeterminacy; green: instability. (**a**) EL: ${\rho}_{r}=1$; (**b**) AU: ${\rho}_{r}=1$.

## Appendix B. Summary of Composite RE-BR Model

**RE Households:**

**BR Households:**

**Wholesale Firms:**

**RE Retail Firms:**

**BR Retail Firms:**

**One-Period Ahead Adaptive Expectations:**

**Wealth Distribution:**

**Closure of Model:**

**Endogenous Proportions of RE and BR Agents:**

**Welfare and Consumption Equivalence:**

## Appendix C. Balanced Growth Steady State

## Appendix D. Lemma

**Lemma A1.**

**Proof.**

## Appendix E

**Proof of Equation (28).**

## Appendix F. Additional Simulated IRFs for RE-BR Composite Models

**Figure A3.** RE versus RE-BR composite expectations with ${n}_{h}={n}_{f}=0.5$; ${\lambda}_{x}=0.25,1.0$; Taylor rule with ${\rho}_{r}=0.7$, ${\theta}_{\pi}=1.5$ and ${\theta}_{y}=0.3$; technology shock.

**Figure A4.** RE versus RE-BR composite expectations with ${n}_{h}={n}_{f}=0.5$; ${\lambda}_{x}=0.25,1.0$; Taylor rule with ${\rho}_{r}=0.7$, ${\theta}_{\pi}=1.5$ and ${\theta}_{y}=0.3$; mark-up shock.

## Appendix G. Robustness

**Table A1.** Third-order solution of the estimated NK RE-BR model; ${\mu}_{h}^{RE}={\mu}_{h}^{BR}={\mu}_{f}^{RE}={\mu}_{f}^{BR}=0.5$; $\gamma =1,100,1000$.

| Variable | Stochastic Mean | Standard Deviation (%) | Skewness | Kurtosis |
|---|---|---|---|---|
| $\frac{{C}_{t}}{C}$ | 0.999544 | 0.042057 | 0.323304 | 0.093034 |
| $\frac{{H}_{t}}{H}$ | 1.000273 | 0.005111 | 0.038002 | −0.020743 |
| $\frac{{W}_{t}}{W}$ | 0.999810 | 0.038145 | 0.318586 | 0.073488 |
| $\frac{{\Pi}_{t}}{\Pi}$ | 0.999898 | 0.004235 | −0.045800 | 0.030136 |
| $\frac{{R}_{n,t}}{{R}_{n}}$ | 0.999887 | 0.004440 | −0.046254 | 0.044145 |
| ${\Phi}_{h,t}^{RE}-{C}_{h}$ | −0.000443 | 0.000257 | −1.504159 | 3.793195 |
| ${\Phi}_{h,t}^{AE}$ | −0.000526 | 0.000303 | −1.592581 | 4.581412 |
| ${\Phi}_{f,t}^{RE}-{C}_{f}$ | −0.000199 | 0.000116 | −1.672777 | 5.558457 |
| ${\Phi}_{f,t}^{AE}$ | −0.000349 | 0.000226 | −1.897335 | 7.457836 |
| ${n}_{h,t}\;(\gamma =1;\sigma =1)$ | 0.100008 | 0.000013 | 0.488774 | 3.275592 |
| ${n}_{f,t}\;(\gamma =1;\sigma =1)$ | 0.100014 | 0.000016 | 1.680492 | 6.480563 |
| ${n}_{h,t}\;(\gamma =100;\sigma =1)$ | 0.100750 | 0.001295 | 0.488774 | 3.275592 |
| ${n}_{f,t}\;(\gamma =100;\sigma =1)$ | 0.101352 | 0.001568 | 1.680492 | 6.480563 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =1)$ | 0.107502 | 0.012952 | 0.488774 | 3.275592 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =1)$ | 0.113519 | 0.015679 | 1.680492 | 6.480563 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =2)$ | 0.130010 | 0.052873 | 0.535046 | 3.638229 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =2)$ | 0.154185 | 0.063624 | 1.779321 | 7.399916 |

**Table A2.** Third-order solution of the estimated NK RE-BR model; ${\mu}_{h}^{RE}={\mu}_{h}^{BR}={\mu}_{f}^{RE}={\mu}_{f}^{BR}=0.75$; $\gamma =1,100,1000$.

| Variable | Stochastic Mean | Standard Deviation (%) | Skewness | Kurtosis |
|---|---|---|---|---|
| $\frac{{C}_{t}}{C}$ | 0.999544 | 0.042057 | 0.323304 | 0.093034 |
| $\frac{{H}_{t}}{H}$ | 1.000273 | 0.005111 | 0.038002 | −0.020743 |
| $\frac{{W}_{t}}{W}$ | 0.999810 | 0.038145 | 0.318586 | 0.073488 |
| $\frac{{\Pi}_{t}}{\Pi}$ | 0.999898 | 0.004235 | −0.045797 | 0.030137 |
| $\frac{{R}_{n,t}}{{R}_{n}}$ | 0.999887 | 0.004440 | −0.046251 | 0.044145 |
| ${\Phi}_{h,t}^{RE}-{C}_{h}$ | −0.000443 | 0.000170 | −0.978598 | 1.538134 |
| ${\Phi}_{h,t}^{AE}$ | −0.000526 | 0.000204 | −1.088202 | 2.164231 |
| ${\Phi}_{f,t}^{RE}-{C}_{f}$ | −0.000199 | 0.000077 | −1.063312 | 2.243911 |
| ${\Phi}_{f,t}^{AE}$ | −0.000349 | 0.000159 | −1.414287 | 4.290569 |
| ${n}_{h,t}\;(\gamma =1;\sigma =1)$ | 0.100008 | 0.000008 | 0.350716 | 2.281134 |
| ${n}_{f,t}\;(\gamma =1;\sigma =1)$ | 0.100014 | 0.000011 | 1.385635 | 4.243151 |
| ${n}_{h,t}\;(\gamma =100;\sigma =1)$ | 0.100750 | 0.000821 | 0.350716 | 2.281134 |
| ${n}_{f,t}\;(\gamma =100;\sigma =1)$ | 0.101352 | 0.001081 | 1.385635 | 4.243151 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =1)$ | 0.107503 | 0.008211 | 0.350716 | 2.281134 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =1)$ | 0.113521 | 0.010812 | 1.385635 | 4.243151 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =2)$ | 0.130012 | 0.033699 | 0.406619 | 2.557592 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =2)$ | 0.154191 | 0.044060 | 1.491071 | 4.993491 |

**Figure 1.** RE versus RE-BR composite expectations with ${n}_{h}={n}_{f}=0.5$, ${\lambda}_{x}=0.25,1.0$; Taylor rule with ${\rho}_{r}=0.7$, ${\theta}_{\pi}=1.5$ and ${\theta}_{y}=0.3$, ${\theta}_{dy}=0$; monetary policy shock.

**Table 1.** Third-order solution of the estimated NK RE-BR model; ${\mu}_{h}^{RE}={\mu}_{h}^{BR}={\mu}_{f}^{RE}={\mu}_{f}^{BR}=0$; $\gamma =1,100,1000$.

| Variable | Stochastic Mean | Standard Deviation (%) | Skewness | Kurtosis |
|---|---|---|---|---|
| $\frac{{C}_{t}}{C}$ | 0.999544 | 0.042057 | 0.323304 | 0.093034 |
| $\frac{{H}_{t}}{H}$ | 1.000273 | 0.005111 | 0.038002 | −0.020743 |
| $\frac{{W}_{t}}{W}$ | 0.999810 | 0.038145 | 0.318586 | 0.073488 |
| $\frac{{\Pi}_{t}}{\Pi}$ | 0.999898 | 0.004235 | −0.045800 | 0.030136 |
| $\frac{{R}_{n,t}}{{R}_{n}}$ | 0.999887 | 0.004440 | −0.046254 | 0.044145 |
| ${\Phi}_{h,t}^{RE}-{C}_{h}$ | −0.000443 | 0.000446 | −2.078809 | 6.635580 |
| ${\Phi}_{h,t}^{AE}$ | −0.000526 | 0.000516 | −2.168947 | 8.000489 |
| ${\Phi}_{f,t}^{RE}-{C}_{f}$ | −0.000199 | 0.000203 | −2.279557 | 9.082031 |
| ${\Phi}_{f,t}^{AE}$ | −0.000349 | 0.000342 | −2.269953 | 9.937975 |
| ${n}_{h,t}\;(\gamma =1;\sigma =1)$ | 0.100008 | 0.000023 | 0.857638 | 4.454288 |
| ${n}_{f,t}\;(\gamma =1;\sigma =1)$ | 0.100014 | 0.000025 | 1.586194 | 6.015115 |
| ${n}_{h,t}\;(\gamma =100;\sigma =1)$ | 0.100750 | 0.002297 | 0.857638 | 4.454288 |
| ${n}_{f,t}\;(\gamma =100;\sigma =1)$ | 0.101352 | 0.002479 | 1.586194 | 6.015115 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =1)$ | 0.107501 | 0.022973 | 0.857638 | 4.454288 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =1)$ | 0.113518 | 0.024787 | 1.586194 | 6.015115 |
| ${n}_{h,t}\;(\gamma =1000;\sigma =2)$ | 0.130007 | 0.093482 | 0.888592 | 4.857691 |
| ${n}_{f,t}\;(\gamma =1000;\sigma =2)$ | 0.154182 | 0.100265 | 1.683430 | 6.867599 |
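The moment columns in the tables (stochastic mean, standard deviation, skewness, kurtosis) can be illustrated with a minimal sketch of how such statistics might be computed from a simulated path. Everything here is an assumption made for illustration: the helper `summary_moments`, the use of population (not sample) moments, the excess-kurtosis convention (normal distribution = 0), and the gamma-distributed placeholder series, which is not model output. The sketch also shows why the ${n}_{h,t}$ and ${n}_{f,t}$ rows for $\gamma = 1, 100, 1000$ at $\sigma = 1$ can share the same skewness and kurtosis while their standard deviations differ: both statistics are unchanged when a series is shifted and rescaled.

```python
import numpy as np

def summary_moments(x):
    """Return (mean, std, skewness, excess kurtosis) of a 1-D sample.

    Uses population moments; the excess-kurtosis convention is an
    assumption about how the tables are reported.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    dev = x - mean
    std = np.sqrt(np.mean(dev**2))
    skew = np.mean(dev**3) / std**3
    excess_kurt = np.mean(dev**4) / std**4 - 3.0
    return mean, std, skew, excess_kurt

# Hypothetical skewed series standing in for a simulated path (not model output).
rng = np.random.default_rng(0)
base = rng.gamma(shape=2.0, scale=1.0, size=10_000)
centred = base - base.mean()

# Two affine rescalings of the same deviations around 0.1.
small = 0.1 + 1e-5 * centred
large = 0.1 + 1e-2 * centred

m_small = summary_moments(small)
m_large = summary_moments(large)
```

Under these assumptions, `m_large[1] / m_small[1]` equals the ratio of the two scale factors, while the skewness and kurtosis entries of `m_small` and `m_large` coincide up to floating-point error.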

