Forward-Backward Sweep Method for the System of HJB-FP Equations in Memory-Limited Partially Observable Stochastic Control
Abstract
1. Introduction
2. Memory-Limited Partially Observable Stochastic Control
2.1. Problem Formulation
2.2. Problem Reformulation
3. Pontryagin’s Minimum Principle
3.1. Preliminary
3.2. Necessary Condition
3.3. Sufficient Condition
3.4. Relationship with Bellman’s Dynamic Programming Principle
3.5. Relationship with Completely Observable Stochastic Control
4. Forward-Backward Sweep Method
4.1. Forward-Backward Sweep Method
Algorithm 1: Forward-Backward Sweep Method (FBSM)

4.2. Preliminary
4.3. Monotonicity
4.4. Convergence to Pontryagin’s Minimum Principle
5. Linear-Quadratic-Gaussian Problem
5.1. Problem Formulation
5.2. Pontryagin’s Minimum Principle
5.3. Forward-Backward Sweep Method
Algorithm 2: Forward-Backward Sweep Method (FBSM) in the LQG problem

6. Numerical Experiments
6.1. LQG Problem
6.2. Non-LQG Problem
7. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
COSC  Completely Observable Stochastic Control 
POSC  Partially Observable Stochastic Control 
ML-POSC  Memory-Limited Partially Observable Stochastic Control
MFSC  Mean-Field Stochastic Control
FBSM  Forward-Backward Sweep Method
HJB  Hamilton-Jacobi-Bellman
FP  Fokker-Planck
SDE  Stochastic Differential Equation 
ODE  Ordinary Differential Equation 
LQG  Linear-Quadratic-Gaussian
Appendix A. Deterministic Control
Appendix A.1. Problem Formulation
Appendix A.2. Preliminary
Appendix A.3. Necessary Condition
Appendix A.4. Sufficient Condition
Appendix A.5. Relationship with Bellman’s Dynamic Programming Principle
Appendix B. Mean-Field Stochastic Control
Appendix B.1. Problem Formulation
Appendix B.2. Preliminary
Appendix B.3. Necessary Condition
Appendix B.4. Sufficient Condition
Appendix B.5. Relationship with Bellman’s Dynamic Programming Principle
Appendix C. Derivation of Main Results
Appendix C.1. Derivation of Result in Section 3.1
Appendix C.2. Derivation of Result in Section 3.2
Appendix C.3. Derivation of Result in Section 3.3
Appendix C.4. Derivation of Result in Section 3.5
Appendix C.5. Derivation of Result in Section 4.2 in a Similar Way to Pontryagin’s Minimum Principle
Appendix C.6. Derivation of Result in Section 4.2 by the Time-Discretized Method
Appendix C.7. Derivation of Result in Section 4.3
Appendix C.8. Derivation of Result in Section 4.4
Appendix C.9. Derivation of Result in Section 5.3
References
