# Deep Learning and Mean-Field Games: A Stochastic Optimal Control Perspective


## Abstract


## 1. Introduction

## 2. Problem Formulation and Preliminaries

#### 2.1. Wasserstein Metrics

#### 2.2. Stochastic Optimal Control Problem

- a controlled state variable ${\left({X}_{t}^{\alpha}\right)}_{t\in [0,T]}$, whose initial condition ${X}_{0}$ is an ${\mathbb{R}}^{n}$-valued, ${\mathcal{F}}_{0}$-measurable random variable;
- a sequence ${\left\{{W}_{t}^{i}\right\}}_{i\ge 1}$ of independent and ${\mathcal{F}}_{t}$-adapted Brownian motions.
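To make the setup concrete, the controlled diffusion $dX_t = b(X_t,\alpha_t)\,dt + \sigma\,dW_t$ can be simulated with an Euler–Maruyama scheme. The following sketch is ours, not the paper's: the drift `b`, the feedback control `alpha`, and the constant diffusion coefficient `sigma` are illustrative placeholders.

```python
import numpy as np

def simulate(alpha, x0, b, sigma, T=1.0, n=100, rng=None):
    """Euler-Maruyama discretization of dX_t = b(X_t, a_t) dt + sigma dW_t,
    with feedback control a_t = alpha(t, X_t)."""
    rng = np.random.default_rng(rng)
    dt = T / n
    x = np.asarray(x0, dtype=float).copy()
    path = [x.copy()]
    for k in range(n):
        a = alpha(k * dt, x)                          # evaluate the feedback control
        dW = rng.normal(0.0, np.sqrt(dt), size=x.shape)  # Brownian increment
        x = x + b(x, a) * dt + sigma * dW
        path.append(x.copy())
    return np.array(path)
```

With `sigma = 0` and constant drift this reduces to the exact Euler flow, which gives a quick sanity check on the discretization.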

#### 2.3. Mean-Field Games

- The PDE approach through HJB Equation (21) and the Kolmogorov equation;
- The Backward Stochastic Differential Equation (BSDE) approach based on the PMP.
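Both approaches share the same fixed-point structure: freeze the population's distribution (here, its mean trajectory), solve the resulting individual control problem, update the distribution induced by the optimal feedback, and repeat. The toy linear–quadratic model below is our own hedged illustration, not the paper's: the individual problem is solved via the HJB route, which a quadratic ansatz reduces to a Riccati system.

```python
import numpy as np

# Toy LQ mean-field game (illustrative assumptions, not from the paper):
#   dX_t = alpha_t dt + sigma dW_t,  running cost 0.5*alpha^2 + 0.5*(X - m_t)^2,
# where m_t is the population mean.  The quadratic ansatz
#   V(t, x) = 0.5*a(t)*x^2 + b(t)*x + c(t)
# reduces the HJB equation to  a' = a^2 - 1 (a(T)=0),  b' = a*b + m (b(T)=0),
# with optimal feedback alpha*(t, x) = -(a(t)*x + b(t)).
T, N, x0_mean = 0.5, 200, 1.0
dt = T / N
m = np.zeros(N + 1)                  # initial guess for the mean trajectory

for _ in range(60):                  # best-response (Picard) iteration
    a = np.zeros(N + 1)
    b = np.zeros(N + 1)
    for k in range(N - 1, -1, -1):   # backward Euler sweep for a and b
        a[k] = a[k + 1] - dt * (a[k + 1] ** 2 - 1.0)
        b[k] = b[k + 1] - dt * (a[k + 1] * b[k + 1] + m[k + 1])
    m_new = np.empty(N + 1)          # mean under the optimal feedback:
    m_new[0] = x0_mean               #   m' = E[alpha*] = -(a*m + b)
    for k in range(N):
        m_new[k + 1] = m_new[k] - dt * (a[k] * m_new[k] + b[k])
    m = m_new
```

By symmetry the equilibrium mean is constant, $m_t = \mathbb{E}[X_0]$: one checks directly that $b = -a\,\mathbb{E}[X_0]$ solves the offset equation, so the drift of the mean vanishes. The small horizon $T$ makes the iteration a contraction, foreshadowing the small-time uniqueness result in Section 3.5.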

## 3. Main Result

#### 3.1. Neural Network as a Dynamical System

- the feed-forward dynamics $f:{\mathbb{R}}^{d}\times \Theta \to {\mathbb{R}}^{d}$;
- the terminal loss function $\Phi :{\mathbb{R}}^{d}\times {\mathbb{R}}^{l}\to \mathbb{R}$;
- the regularization term $L:{\mathbb{R}}^{d}\times \Theta \to \mathbb{R}$.
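These three ingredients can be sketched as a residual-network forward pass: the state obeys $x_{k+1} = x_k + \Delta t\, f(x_k,\theta_k)$, and the objective sums the terminal loss $\Phi$ and the accumulated regularizer $L$. The tanh layer map, quadratic loss, and weight-decay regularizer below are illustrative choices of ours, not the paper's.

```python
import numpy as np

def f(x, theta):
    """Layer dynamics f(x, theta): a tanh residual map (illustrative choice)."""
    W, b = theta
    return np.tanh(W @ x + b)

def forward(x0, thetas, dt=1.0):
    """ResNet-style forward Euler update: x_{k+1} = x_k + dt * f(x_k, theta_k)."""
    xs = [np.asarray(x0, dtype=float)]
    for th in thetas:
        xs.append(xs[-1] + dt * f(xs[-1], th))
    return xs

def total_cost(x0, y, thetas, lam=1e-3):
    """Terminal loss Phi(x_T, y) plus the regularizer L summed over layers."""
    xs = forward(x0, thetas)
    Phi = 0.5 * np.sum((xs[-1] - np.asarray(y)) ** 2)   # terminal loss
    L = lam * sum(np.sum(W ** 2) for W, _ in thetas)    # weight-decay regularizer
    return Phi + L
```

With zero weights the map $f$ vanishes, the state stays at $x_0$, and the cost reduces to the bare terminal loss, a useful check of the dynamics.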

#### 3.2. HJB Equation

- f, L and $\Phi $ are bounded;
- f, L and $\Phi $ are Lipschitz w.r.t. x, and the Lipschitz constants of f and L are independent of $\theta $;
- ${\mu}_{0}\in {\mathcal{P}}_{2}\left({\mathbb{R}}^{(d+l)}\right)$.
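For orientation, the dynamic-programming equation behind Theorem 1 can be stated on the lifted space ${L}^{2}(\Omega ,{\mathbb{R}}^{d+l})$. The form below is a sketch reconstructed from the viscosity inequalities used in the proof, with the same Hamiltonian $\mathcal{H}$ and terminal data $\overline{\Phi}$:

```latex
% HJB equation for the value function U on the lifted space
\begin{cases}
\partial_t U(t,\xi) + \mathcal{H}\bigl(\xi,\, D U(t,\xi)\bigr) = 0,
  & (t,\xi) \in [0,T) \times L^{2}(\Omega,\mathbb{R}^{d+l}),\\[2pt]
U(T,\xi) = \mathbb{E}\bigl[\,\overline{\Phi}(\xi)\,\bigr].
\end{cases}
```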

**Theorem 1.**

**Proof of Theorem 1.**

- $U(T,\xi )\le \mathbb{E}\left[\overline{\Phi}\left(\xi \right)\right]$ and, for any test function $\psi \in {\mathcal{C}}^{1,1}([0,T]\times {L}^{2}(\Omega ,{\mathbb{R}}^{d+l}))$ such that $(U-\psi )$ has a local maximum at $({t}_{0},{\xi}_{0})\in [0,T)\times {L}^{2}(\Omega ,{\mathbb{R}}^{d+l})$, $\psi $ satisfies:$${\partial}_{t}\psi ({t}_{0},{\xi}_{0})+\mathcal{H}({\xi}_{0},D\psi ({t}_{0},{\xi}_{0}))\ge 0\phantom{\rule{0.166667em}{0ex}}.$$
- $U(T,\xi )\ge \mathbb{E}\left[\overline{\Phi}\left(\xi \right)\right]$ and, for any test function $\psi \in {\mathcal{C}}^{1,1}([0,T]\times {L}^{2}(\Omega ,{\mathbb{R}}^{d+l}))$ such that $(U-\psi )$ has a local minimum at $({t}_{0},{\xi}_{0})\in [0,T)\times {L}^{2}(\Omega ,{\mathbb{R}}^{d+l})$, $\psi $ satisfies:$${\partial}_{t}\psi ({t}_{0},{\xi}_{0})+\mathcal{H}({\xi}_{0},D\psi ({t}_{0},{\xi}_{0}))\le 0\phantom{\rule{0.166667em}{0ex}}.$$

#### 3.3. Mean-Field Pontryagin Maximum Principle

- f is bounded and f, L are continuous w.r.t. $\theta $;
- f, L and $\Phi $ are continuously differentiable w.r.t. x;
- the distribution ${\mu}_{0}$ has bounded support in ${\mathbb{R}}^{d}\times {\mathbb{R}}^{l}$; that is, there exists $M>0$ such that ${\mu}_{0}\left(\left\{(x,y)\in {\mathbb{R}}^{d}\times {\mathbb{R}}^{l}:\|x\|+\|y\|\le M\right\}\right)=1$.
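As a sketch of the optimality system, writing the Hamiltonian in the standard form $H(x,p,\theta )=p\cdot f(x,\theta )-L(x,\theta )$ (an assumed convention here, not a quotation of the paper's notation), the mean-field PMP couples a forward state equation, a backward adjoint equation, and a pointwise maximization averaged over the population $(X_0,Y)\sim \mu_0$:

```latex
\begin{aligned}
\dot{X}_t &= f(X_t,\theta_t^{*}), & X_0 &\sim \mu_0,\\
\dot{P}_t &= -\nabla_x H(X_t,P_t,\theta_t^{*}), & P_T &= -\nabla_x \Phi(X_T,Y),\\
\theta_t^{*} &\in \operatorname*{arg\,max}_{\theta \in \Theta}\;
  \mathbb{E}\bigl[\,H(X_t,P_t,\theta)\,\bigr], & &\text{for a.e. } t \in [0,T].
\end{aligned}
```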

**Theorem 2 (Mean-Field Pontryagin Maximum Principle).**

**Proof of Theorem 2.**
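The PMP suggests a concrete training loop, the method of successive approximations: a forward pass, a backward adjoint pass, and a layerwise Hamiltonian maximization. The scalar model below is entirely our own illustrative choice (dynamics, regularizer, and loss), selected so that the maximization step has a closed form; it is a sketch of the idea, not the paper's algorithm.

```python
import numpy as np

# Method of successive approximations (MSA), PMP-style training sketch.
# Illustrative scalar model (our assumptions):
#   layer dynamics f(x, theta) = theta * tanh(x),
#   regularizer    L(x, theta) = lam * theta^2,
#   terminal loss  Phi(x)      = 0.5 * (x - y)^2.
# The Hamiltonian H(x, p, theta) = p*theta*tanh(x) - lam*theta^2 is strongly
# concave in theta, with closed-form maximizer theta* = p*tanh(x)/(2*lam).
# (In the mean-field setting the maximization averages over mu_0; a single
# trajectory is used here for brevity.)
T, N, lam, y, x0 = 1.0, 20, 0.5, 1.0, 0.5
dt = T / N
theta = np.zeros(N)

def cost(theta):
    x, reg = x0, 0.0
    for k in range(N):
        reg += dt * lam * theta[k] ** 2
        x = x + dt * theta[k] * np.tanh(x)
    return 0.5 * (x - y) ** 2 + reg

baseline = cost(theta)
for _ in range(5):                        # MSA sweeps
    xs = np.empty(N + 1); xs[0] = x0      # forward: x_{k+1} = x_k + dt*f(x_k, th_k)
    for k in range(N):
        xs[k + 1] = xs[k] + dt * theta[k] * np.tanh(xs[k])
    p = np.empty(N + 1)                   # backward adjoint: p' = -dH/dx
    p[N] = -(xs[N] - y)                   # terminal condition p_T = -Phi'(x_T)
    for k in range(N - 1, -1, -1):
        p[k] = p[k + 1] + dt * p[k + 1] * theta[k] * (1.0 - np.tanh(xs[k]) ** 2)
    theta = p[1:] * np.tanh(xs[:-1]) / (2.0 * lam)  # Hamiltonian maximization
```

The strong concavity assumed in Section 3.5 is exactly what makes the maximization step well posed and stabilizes sweeps like these.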

#### 3.4. Connection between the HJB Equation and the PMP

#### 3.5. Small-Time Uniqueness

**Theorem 3.**

- f is bounded;
- f, L and Φ are continuously differentiable w.r.t. both x and θ with bounded and Lipschitz partial derivatives;
- the distribution ${\mu}_{0}$ has bounded support in ${\mathbb{R}}^{d}\times {\mathbb{R}}^{l}$; that is, there exists $M>0$ such that ${\mu}_{0}\left(\left\{(x,y)\in {\mathbb{R}}^{d}\times {\mathbb{R}}^{l}:\|x\|+\|y\|\le M\right\}\right)=1$;
- $H(x,p,\theta )$ is strongly concave in $\theta $, uniformly in $x,p\in {\mathbb{R}}^{d}$, i.e., ${\nabla}_{\theta \theta}^{2}H(x,p,\theta )+{\lambda}_{0}I\preceq 0$ for some ${\lambda}_{0}>0$.

**Lemma 1.**

**Proof of Lemma 1.**

**Proof of Theorem 3.**

## 4. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Di Persio, L.; Garbelli, M.
Deep Learning and Mean-Field Games: A Stochastic Optimal Control Perspective. *Symmetry* **2021**, *13*, 14.
https://doi.org/10.3390/sym13010014
