# Information Design for Multiple Interdependent Defenders: Work Less, Pay Off More


## Abstract


## 1. Introduction

How can one achieve both defense effectiveness and computational efficiency in multi-defender security games?

#### Comparison with Previous Works

**Defense and Attack Models for Security:** There is extensive literature studying defense and attack models for security. Our discussion here cannot do justice to this large body of literature; we thus refer interested readers to an excellent survey paper [14], which classifies previously studied models according to three dimensions: system structure (eight types), defense measures (six types), and attack tactics and circumstances (seven types). In the language of that survey, our work falls within the multiple-elements system structure, the protection defense measure, and the attack-against-single-element attack tactic. Prior to the writing of the survey article, not much previous work fell into this particular category. The most relevant publications for us are [15,16], which provide a systematic equilibrium analysis for the strategic game between a multi-resource defender and a single-element attacker. However, these works differ from ours in two key aspects: (1) their study is analytical, whereas our work is computational and tries to find the optimal defense strategy; (2) they consider simultaneous-move games, whereas our game is sequential and falls into the Stackelberg game framework. Specifically, we adopt the widely used class of games termed Stackelberg Security games, which capture strategic interactions between defenders and attackers in security domains [17]. This line of research emerged slightly after the survey article [14] but has nevertheless had significant impact, with deployed real-world applications, e.g., for airport security [18], ferry protection [19], and wildlife conservation [20].

**Strategic Information Disclosure for Security:** Information design, also known as signaling, has attracted much interest in various domains, such as public safety [7,8], wildlife conservation [26], traffic routing [13,27], and auctions [28,29]. Most closely related to our work is [30], which studies signaling in Bayesian Stackelberg games. All previous work assumes a single defender, whereas our paper tackles the more complex multiple-defender setup. This requires us to work with exponentially large representations of signaling schemes and necessitates novel algorithmic techniques with compact representations.

**Other Learning-based Solutions:** Recent research in multi-agent reinforcement learning (MARL) has studied factors that influence agents’ behavior in a shared environment. For example, [31] studies how to convey private information through actions in cooperative environments, and [32] uses monetary reward (which they call “causal inference reward”) to influence opponents’ actions. Unlike the tools studied in the previous MARL literature, our model takes advantage of information asymmetry between the principal and the various stakeholders (including both the defending agencies and the attacker) to influence their actions. Therefore, both our setup and our approach differ from these learning-based methods.

## 2. Preliminary

## 3. Optimal Private Signaling

**Example 1.**

#### 3.1. An Exponential-Size LP Formulation

#### 3.2. A Polynomial-Time Algorithm

**Theorem 1.**

**Step 1: Restricting to simplified pure strategy space.** One challenge in designing the signaling scheme arises when multiple defenders are recommended the same target, which significantly complicates the computation of marginal target protection. Therefore, our first step is to simplify the pure strategy space to include only those strategies in which all defenders cover different targets. To do so, we create $|\mathbf{D}|$ dummy targets whose rewards, penalties, and costs are zero for both the defenders and the attacker (see Note 5). When a player chooses one of these dummy targets, it means that the player chooses to do nothing. As a result, we have $(|\mathbf{T}|+|\mathbf{D}|)$ targets in total, including the dummy targets. The creation of these dummy targets does not influence the actual outcome of any signaling scheme but introduces a useful characteristic of the optimal signaling scheme (Lemma 1). This characteristic, namely at most one defender at each target, allows us to provide more efficient algorithms for finding an optimal signaling scheme.
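The padding idea in Step 1 can be illustrated with a minimal Python sketch. It is not the construction used in the proof of Lemma 1; the function names `pad_with_dummy_targets` and `make_distinct`, the target labels, and the rule of keeping the first defender on each real target are all assumptions made for illustration. The sketch only shows that a profile with collisions can be rewritten so that every defender holds a distinct target while the set of covered real targets stays the same.

```python
def pad_with_dummy_targets(real_payoffs, num_defenders):
    """Append one dummy target per defender.

    real_payoffs maps a target name to a (reward, penalty, cost) tuple.
    Dummy targets get zero reward, penalty, and cost, as in Step 1.
    """
    padded = dict(real_payoffs)
    for i in range(num_defenders):
        padded[f"dummy_{i}"] = (0.0, 0.0, 0.0)
    return padded


def make_distinct(assignment):
    """Rewrite a profile so that all defenders hold different targets.

    assignment[d] is the real target recommended to defender d.
    The first defender on each target stays (an arbitrary choice made
    for this sketch); every later duplicate is sent to an unused dummy.
    """
    seen = set()
    next_dummy = 0
    result = []
    for target in assignment:
        if target in seen:
            result.append(f"dummy_{next_dummy}")
            next_dummy += 1
        else:
            seen.add(target)
            result.append(target)
    return result


# Hypothetical instance: three real targets and three defenders.
payoffs = pad_with_dummy_targets(
    {"A": (5.0, -3.0, 1.0), "B": (2.0, -1.0, 0.5), "C": (4.0, -2.0, 2.0)},
    num_defenders=3,
)
profile = make_distinct(["A", "A", "C"])
```

In this hypothetical instance the second defender, who duplicated target `A`, is moved to a dummy target, so the real targets `A` and `C` remain covered and no target holds more than one defender.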

**Lemma 1.**

**Proof.**

**Step 2: Working in the dual space.** Since LP (1)–(4) has exponentially many variables, we first reduce it to the following dual linear program (1)–(4) via the standard linear duality theory [17], which turns out to be more tractable to work with:

**Step 3: Establishing an efficient separation oracle.** We now solve (8) for any given $(\lambda ,{t}_{0})$. We further divide this problem into two sub-problems, each of which can be solved via bipartite matching and hence in polynomial time. More specifically, we divide the signal set $\{\mathbf{s}:s\left(a\right)={t}_{0}\}$ into two different subsets, as elaborated in the following.

**Case 1 of Step 3:** Attacked target is not covered. The first subset consists of all signals such that ${t}_{0}\notin s(-a)$; that is, none of the defenders is assigned to ${t}_{0}$. In this case, the attacker receives a reward ${R}^{\lambda}\left({t}_{0}\right)$ for attacking ${t}_{0}$, while every defender $d$ receives a penalty ${P}^{d}\left({t}_{0}\right)$. Thus, each of the following elements in (8) is straightforward to compute:

**Lemma 2.**

**Case 2 of Step 3:** Attacked target is covered. The second subset consists of all signals such that ${t}_{0}$ is assigned to one of the defenders. In this case, we further divide this sub-problem into multiple smaller problems by fixing the defender who covers ${t}_{0}$, denoted by ${d}_{0}$. Similar to Sub-problem P1, we introduce the following weights: $\forall t$

**Lemma 3.**
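The case split in Step 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's oracle: the weight matrices `w_uncovered` and `w_covered` are hypothetical stand-ins for the weights derived from (8), which are not reproduced here, and the function names are made up for this sketch. The matching routine is a standard Hungarian-method implementation for maximum-weight assignment with no more rows than columns.

```python
def max_weight_matching(weight):
    """Maximum-weight assignment of rows to distinct columns.

    weight[i][j] is the value of giving row i column j; the number of
    rows must not exceed the number of columns. Returns (total, cols)
    where cols[i] is the column matched to row i.
    """
    n = len(weight)
    if n == 0:
        return 0, []
    m = len(weight[0])
    inf = float("inf")
    cost = [[-w for w in row] for row in weight]  # maximize by minimizing
    u = [0] * (n + 1)      # row potentials (1-based)
    v = [0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)      # p[j]: row matched to column j, 0 if free
    way = [0] * (m + 1)    # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the new row
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    cols = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            cols[p[j] - 1] = j - 1
    return sum(weight[i][cols[i]] for i in range(n)), cols


def best_signal_for_attack(t0, w_uncovered, w_covered):
    """Best defender assignment when the attacker is sent to target t0.

    Case 1: no defender at t0, so match all defenders to other targets.
    Case 2: fix a defender d0 at t0 and match the remaining defenders.
    Returns (value, assignment) with assignment[d] = target of defender d.
    """
    n = len(w_uncovered)
    others = [t for t in range(len(w_uncovered[0])) if t != t0]
    value, cols = max_weight_matching(
        [[w_uncovered[d][t] for t in others] for d in range(n)])
    best = (value, {d: others[c] for d, c in enumerate(cols)})
    for d0 in range(n):
        rest = [d for d in range(n) if d != d0]
        value, cols = max_weight_matching(
            [[w_covered[d][t] for t in others] for d in rest])
        value += w_covered[d0][t0]
        if value > best[0]:
            assignment = {d: others[c] for d, c in zip(rest, cols)}
            assignment[d0] = t0
            best = (value, assignment)
    return best
```

Each call to `max_weight_matching` runs in polynomial time, and Case 2 invokes it once per candidate defender, so the whole search stays polynomial in the number of defenders and targets. The sketch assumes the target list already includes the dummy targets from Step 1, which guarantees that there are at least as many remaining targets as defenders.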

## 4. Optimal Ex Ante Private Signaling

#### 4.1. An Exponential-Size LP Formulation

**Theorem 2.**

#### 4.2. Compact Signaling Representation

**Theorem 3.**

**Proof.**

**Lemma 4.**

- $\omega ({d}_{i},{t}_{i})>0$ for all $i\in \{1,\dots ,\left|\mathbf{D}\right|\}$.
- Every maximally covered target $t$, i.e., ${\sum}_{d}\omega (d,t)=r$, is assigned to a defender; that is, $t\in \{{t}_{1},\dots ,{t}_{\left|\mathbf{D}\right|}\}$.
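The two conditions above can be checked mechanically on a small instance. The following Python sketch searches exhaustively for an assignment that satisfies both; it is only an illustration, since the search is exponential in the number of defenders. The name `find_assignment`, the example matrix, and the treatment of $r$ as a given input are assumptions made for this sketch.

```python
from itertools import permutations
from math import isclose


def find_assignment(omega, r):
    """Search for targets (t_1, ..., t_|D|), one per defender, such that
    (i) omega[d][t_d] > 0 for every defender d, and
    (ii) every target whose total coverage equals r is among the chosen ones.

    omega[d][t] is the marginal weight of defender d on target t.
    Returns a tuple of target indices, or None if no assignment exists.
    """
    num_defenders = len(omega)
    num_targets = len(omega[0])
    maximal = {
        t for t in range(num_targets)
        if isclose(sum(omega[d][t] for d in range(num_defenders)), r)
    }
    for targets in permutations(range(num_targets), num_defenders):
        positive = all(omega[d][t] > 0 for d, t in enumerate(targets))
        if positive and maximal <= set(targets):
            return targets
    return None


# Hypothetical marginals: two defenders, three targets, r = 1.
omega = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
]
chosen = find_assignment(omega, r=1.0)
```

In this hypothetical instance only target 0 is maximally covered, and the search returns an assignment in which the first defender takes target 0 and the second defender takes target 2, both with positive weight.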

**Proof.**

**Step 1:** Inclusion of high-coverage target group ${\mathbf{T}}^{\mathrm{high}}$. We first prove that there is a partial allocation from defenders to targets in ${\mathbf{T}}^{\mathrm{high}}$, denoted by $({d}_{1},\dots ,{d}_{H})$, such that ${d}_{i}\in \mathbf{D}\left({t}_{i}\right)$ for all ${t}_{i}\in {\mathbf{T}}^{\mathrm{high}}$, and the defenders are pairwise different, i.e., ${d}_{i}\ne {d}_{j}$ for all ${t}_{i}\ne {t}_{j}\in {\mathbf{T}}^{\mathrm{high}}$. We use induction with respect to $t$.

**Observation 1.**

- $\exists {d}_{0}\notin \{{d}_{1},\dots ,{d}_{t}\}$ s.t. $\omega ({d}_{0}\to {t}^{\prime}+1)>0$
- $\exists {d}_{00}\notin \{{d}_{1},\dots ,{d}_{t}\}\backslash \left\{{d}_{{t}^{\prime}+1}\right\}$ s.t. $\omega ({d}_{00}\to {t}^{\prime}+2)>0$
- …
- $\exists {d}_{\mathrm{final}}\notin \{{d}_{1},\dots ,{d}_{{t}^{\prime}}\}$ and $\exists {t}_{\mathrm{final}}\in \{1,\dots ,{t}^{\prime}\}$ such that $\omega ({d}_{\mathrm{final}}\to {t}_{\mathrm{final}})>0$ where $\mathrm{final}={\left[0\right]}^{t-{t}^{\prime}}$.
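The chain of reassignments listed above resembles an augmenting path in the bipartite graph whose edges are the pairs with $\omega (d\to t)>0$. As an algorithmic counterpart to the inductive argument (an assumption of this sketch, not the proof itself), the following Python code assigns pairwise different defenders to a list of targets with the classical augmenting-path method. The function name `match_targets` and the example weights are hypothetical.

```python
def match_targets(omega, targets):
    """Give each listed target its own defender d with omega[d][t] > 0.

    Targets are processed one at a time. If every defender usable for
    the new target is already taken, the target currently holding one
    of them is asked to switch to another defender, recursively.
    Returns {target: defender}, or None if no such allocation exists.
    """
    num_defenders = len(omega)
    holder = {}  # defender -> target that currently holds it

    def try_assign(t, visited):
        for d in range(num_defenders):
            if omega[d][t] > 0 and d not in visited:
                visited.add(d)
                # d is free, or its current target can move elsewhere
                if d not in holder or try_assign(holder[d], visited):
                    holder[d] = t
                    return True
        return False

    for t in targets:
        if not try_assign(t, set()):
            return None
    return {t: d for d, t in holder.items()}


# Hypothetical weights: three defenders (rows), three targets (columns).
omega = [
    [0.6, 0.4, 0.0],
    [0.4, 0.0, 0.6],
    [0.0, 0.6, 0.4],
]
allocation = match_targets(omega, targets=[0, 1])
```

In this hypothetical instance target 0 first takes defender 0. When target 1 arrives, it claims defender 0 and pushes target 0 to defender 1, which mirrors one link of the reassignment chain.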

**Step 2:** Extension to include target group ${\mathbf{T}}^{\mathrm{low}}$. We now prove that there is an assignment from defenders $\mathbf{D}$ to $\left|\mathbf{D}\right|$ targets that includes all targets in ${\mathbf{T}}^{\mathrm{high}}$. We apply induction with respect to the defender $d$. Note that we cannot apply induction with respect to the targets $t$, since we include the target group ${\mathbf{T}}^{\mathrm{low}}$ in this analysis and, as a result, the equality on the LHS of (19) no longer holds.

## 5. Experiments

## 6. Summary

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Definition |
|---|---|
| BCE | Bayes correlated equilibrium |
| CPU | Central processing unit |
| LHS | Left-hand side |
| LP | Linear program |
| MARL | Multi-agent reinforcement learning |
| NSE | Nash Stackelberg equilibrium |
| RHS | Right-hand side |

## Appendix A. Proof of Lemma 4

#### Extension to Include Target Group ${\mathbf{T}}^{\mathrm{low}}$

- $\exists {t}_{0}\notin \{{t}_{1},\dots ,{t}_{d}\}$ s.t. $\omega ({d}^{\prime}+1\to {t}_{0})>0$
- $\exists {t}_{00}\notin \{{t}_{1},\dots ,{t}_{d}\}\backslash \left\{{t}_{{d}^{\prime}+1}\right\}$ s.t. $\omega ({d}^{\prime}+2\to {t}_{00})>0$
- …
- $\exists {t}_{\mathrm{final}}\notin \{{t}_{1},\dots ,{t}_{{d}^{\prime}}\}$ and $\exists {d}_{\mathrm{final}}\in \{1,\dots ,{d}^{\prime}\}$ such that $\omega ({d}_{\mathrm{final}}\to {t}_{\mathrm{final}})>0$ where $\mathrm{final}={\left[0\right]}^{d-{d}^{\prime}}$.

## Appendix B. Additional Experiments

**Figure A3.** The principal optimizes her own utility when $\left|\mathbf{T}\right|=12$ and the defenders’ cost range ${C}^{d}\left(t\right)\in [0,10]$.

## Notes

1 | Notably, the defender can signal to the attacker as well, to either deter him from attacking or induce him to attack a specific target in order to catch him. Previous works have shown that this can benefit the defender [7,8] even though the attacker is fully aware of the strategic nature of the signal and will best respond to the revealed information. |

2 | This is without loss of generality, since any defender who can cover multiple targets can be “split” into multiple defenders with the same utilities. |

3 | The term “suggested” here should only be interpreted mathematically—i.e., given all the attacker’s available information, $s\left(a\right)$ is identified as the most profitable target for the attacker to attack—and should not be interpreted as a real practice that the defender suggests the attacker to attack some target. Such a formulation, analogous to the revelation principle, is used for the convenience of formulating the optimization problem. |

4 | Such information can often be learned from informants such as local villagers [34]. |

5 | In reality, such a dummy target could be unimportant infrastructure (e.g., a nearby rest area at the border of a national park with no animals around, as in wildlife conservation), which matters neither to any defender nor to the attacker. |

6 | Since we have $(\left|\mathbf{T}\right|+\left|\mathbf{D}\right|)$ targets in total while there are only $\left|\mathbf{D}\right|$ defenders, some targets will not be assigned to any defender. |

## References

1. Jiang, A.X.; Procaccia, A.D.; Qian, Y.; Shah, N.; Tambe, M. Defender (mis)coordination in security games. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, Beijing, China, 3–9 August 2013.
2. Ministry of WWF-Pakistan. National Plan of Action for Combating Illegal Wildlife Trade in Pakistan; Ministry of WWF-Pakistan: Islamabad, Pakistan, 2015.
3. Klein, N. Can International Litigation Solve the India-Sri Lanka Fishing Dispute? 2017. Available online: https://researchers.mq.edu.au/en/publications/can-international-litigation-solve-the-india-sri-lanka-fishing-di (accessed on 10 October 2022).
4. Lou, J.; Smith, A.M.; Vorobeychik, Y. Multidefender security games. IEEE Intell. Syst. **2017**, 32, 50–60.
5. Gan, J.; Elkind, E.; Wooldridge, M. Stackelberg Security Games with Multiple Uncoordinated Defenders. In Proceedings of the AAMAS ’18, Stockholm, Sweden, 10–15 July 2018; International Foundation for Autonomous Agents and Multiagent Systems: Richland, SC, USA, 2018; pp. 703–711.
6. Dughmi, S. Algorithmic Information Structure Design: A Survey. ACM SIGecom Exch. **2017**, 15, 2–24.
7. Xu, H.; Rabinovich, Z.; Dughmi, S.; Tambe, M. Exploring Information Asymmetry in Two-Stage Security Games. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Austin, TX, USA, 25–30 January 2015.
8. Rabinovich, Z.; Jiang, A.X.; Jain, M.; Xu, H. Information Disclosure as a Means to Security. In Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, Istanbul, Turkey, 4–8 May 2015.
9. Viollaz, J.; Gore, M. Piloting Community-Based Conservation Crime Prevention in the Annamite Mountains; Michigan State University: East Lansing, MI, USA, 2019.
10. Dughmi, S.; Xu, H. Algorithmic persuasion with no externalities. In Proceedings of the 2017 ACM Conference on Economics and Computation, Cambridge, MA, USA, 26–30 June 2017; pp. 351–368.
11. Bergemann, D.; Morris, S. Bayes correlated equilibrium and the comparison of information structures in games. Theor. Econ. **2016**, 11, 487–522.
12. Papadimitriou, C.H.; Roughgarden, T. Computing correlated equilibria in multi-player games. JACM **2008**, 55, 1–29.
13. Castiglioni, M.; Celli, A.; Marchesi, A.; Gatti, N. Signaling in Bayesian Network Congestion Games: The Subtle Power of Symmetry. Proc. AAAI Conf. Artif. Intell. **2021**, 35, 5252–5259.
14. Hausken, K.; Levitin, G. Review of systems defense and attack models. Int. J. Perform. Eng. **2012**, 8, 355.
15. Bier, V.; Oliveros, S.; Samuelson, L. Choosing what to protect: Strategic defensive allocation against an unknown attacker. J. Public Econ. Theory **2007**, 9, 563–587.
16. Powell, R. Defending against terrorist attacks with limited resources. Am. Political Sci. Rev. **2007**, 101, 527–541.
17. Tambe, M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned; Cambridge University Press: Cambridge, UK, 2011.
18. Pita, J.; Jain, M.; Marecki, J.; Ordóñez, F.; Portway, C.; Tambe, M.; Western, C.; Paruchuri, P.; Kraus, S. Deployed ARMOR protection: The application of a game theoretic model for security at the Los Angeles International Airport. In Proceedings of the AAMAS: Industrial Track, Estoril, Portugal, 12–16 May 2008; pp. 125–132.
19. Shieh, E.; An, B.; Yang, R.; Tambe, M.; Baldwin, C.; DiRenzo, J.; Maule, B.; Meyer, G. PROTECT: A deployed game theoretic system to protect the ports of the United States. In Proceedings of the AAMAS, Valencia, Spain, 4–8 June 2012; pp. 13–20.
20. Fang, F.; Nguyen, T.H.; Pickles, R.; Lam, W.Y.; Clements, G.R.; An, B.; Singh, A.; Tambe, M.; Lemieux, A. Deploying PAWS: Field optimization of the protection assistant for wildlife security. In Proceedings of the Twenty-Eighth IAAI Conference, Phoenix, AZ, USA, 12–17 February 2016.
21. Basilico, N.; Celli, A.; De Nittis, G.; Gatti, N. Computing the team–maxmin equilibrium in single–team single–adversary team games. Intell. Artif. **2017**, 11, 67–79.
22. Laszka, A.; Lou, J.; Vorobeychik, Y. Multi-defender strategic filtering against spear-phishing attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
23. Lou, J.; Vorobeychik, Y. Decentralization and security in dynamic traffic light control. In Proceedings of the Symposium and Bootcamp on the Science of Security, Pittsburgh, PA, USA, 19–21 April 2016; pp. 90–92.
24. Lou, J.; Vorobeychik, Y. Equilibrium analysis of multi-defender security games. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
25. Smith, A.; Vorobeychik, Y.; Letchford, J. Multidefender security games on networks. ACM Sigmetrics Perform. Eval. Rev. **2014**, 41, 4–7.
26. Bondi, E.; Oh, H.; Xu, H.; Fang, F.; Dilkina, B.; Tambe, M. Broken signals in security games: Coordinating patrollers and sensors in the real world. In Proceedings of the AAMAS, Montreal, QC, Canada, 13–17 May 2019; pp. 1838–1840.
27. Vasserman, S.; Feldman, M.; Hassidim, A. Implementing the wisdom of Waze. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
28. Li, Z.; Das, S. Revenue enhancement via asymmetric signaling in interdependent-value auctions. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 2093–2100.
29. Emek, Y.; Feldman, M.; Gamzu, I.; Paes Leme, R.; Tennenholtz, M. Signaling Schemes for Revenue Maximization. In Proceedings of the 13th ACM Conference on Electronic Commerce, EC ’12, Valencia, Spain, 4–8 June 2012; pp. 514–531.
30. Xu, H.; Freeman, R.; Conitzer, V.; Dughmi, S.; Tambe, M. Signaling in Bayesian Stackelberg Games. In Proceedings of the AAMAS, Singapore, 9–13 May 2016; pp. 150–158.
31. Tian, Z.; Zou, S.; Davies, I.; Warr, T.; Wu, L.; Ammar, H.B.; Wang, J. Learning to Communicate Implicitly by Actions. Proc. AAAI Conf. Artif. Intell. **2020**, 34, 7261–7268.
32. Jaques, N.; Lazaridou, A.; Hughes, E.; Gulcehre, C.; Ortega, P.; Strouse, D.; Leibo, J.Z.; De Freitas, N. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 3040–3049.
33. Kamenica, E.; Gentzkow, M. Bayesian persuasion. Am. Econ. Rev. **2011**, 101, 2590–2615.
34. Shen, W.; Chen, W.; Huang, T.; Singh, R.; Fang, F. When to follow the tip: Security games with strategic informants. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 7–15 January 2020.
35. Grötschel, M.; Lovász, L.; Schrijver, A. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica **1981**, 1, 169–197.
36. Xu, H. On the tractability of public persuasion with no externalities. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms; SIAM: Philadelphia, PA, USA, 2020; pp. 2708–2727.
37. Yin, Z.; Tambe, M. A unified method for handling discrete and continuous uncertainty in Bayesian Stackelberg games. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 2, Auckland, New Zealand, 9–13 May 2012; pp. 855–862.
38. Nguyen, T.H.; Jiang, A.X.; Tambe, M. Stop the compartmentalization: Unified robust algorithms for handling uncertainties in security games. In Proceedings of the AAMAS, Paris, France, 5–9 May 2014; pp. 317–324.

**Figure 1.** Average defender social welfare: the defenders’ cost range is fixed to $[0,10]$ in sub-figures (**b**–**d**).

**Figure 2.** Average attacker utility: the defenders’ cost range is fixed to $[0,10]$ in sub-figures (**b**–**d**).

**Figure 4.** The principal optimizes her own utility when $\left|\mathbf{D}\right|=|\Lambda |=4$ and the defenders’ cost range ${C}^{d}\left(t\right)\in [0,10]$.

**Figure 6.** Scalability in the number of targets or attacker types in the ex ante setting when $\left|\mathbf{D}\right|=4$ and the defenders’ cost range ${C}^{d}\left(t\right)\in [0,10]$.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhou, C.; Spivey, A.; Xu, H.; Nguyen, T.H.
Information Design for Multiple Interdependent Defenders: Work Less, Pay Off More. *Games* **2023**, *14*, 12.
https://doi.org/10.3390/g14010012
