Stackelberg Game-Theoretic Low Probability of Intercept Performance Optimization for Multistatic Radar System

Abstract: This paper investigates Stackelberg game-theoretic low probability of intercept (LPI) performance optimization in a multistatic radar system. The goal of the proposed LPI optimization strategy is to minimize the transmitted power of each radar while satisfying a predetermined signal-to-interference-plus-noise ratio (SINR) requirement for target detection. First, a single-leader multi-follower Stackelberg game is adopted to formulate the LPI optimization problem of the multistatic radar system. In the considered game model, the hostile intercept receiver plays the role of the leader, which first decides the prices of the power resource by maximizing its own utility function. The multiple radars are the followers, which subsequently compete with each other in a non-cooperative game under the prices imposed by the intercept receiver. Then, the Nash equilibrium (NE) for the considered game model is derived, and the existence and uniqueness of the NE are analytically proved. Furthermore, a pricing-based distributed iterative power control algorithm is proposed. Finally, simulation examples demonstrate that the proposed scheme can enhance the LPI performance of the multistatic radar system.


Introduction
With the recent development of advanced passive intercept systems, it has become highly important to enhance the low probability of intercept (LPI) performance of radar systems [1]. In theory, low radiated power [2][3][4], short dwell time [5,6], large revisit interval [7], adaptive beamforming [8][9][10], and waveform optimization [11] all improve LPI performance. In recent years, researchers have developed a great number of techniques to satisfy the LPI performance requirement. For example, in [2], the revisit interval, transmit power, and waveform parameters are jointly optimized to improve the LPI performance of radar networks. In [4], an LPI-based joint transmitter selection and resource management scheme is proposed for single-target tracking in a radar network, where the LPI performance criterion of the radar network is minimized by optimizing the revisit interval, dwell time, transmitter selection, and transmit power while maintaining a specified target tracking accuracy. In [6], Wang X.L. et al. propose a joint revisit and dwell time management approach for target tracking in a phased array radar system, in which the time resource consumption is minimized for a given target tracking performance. Lawrence D.E. in [8] introduces a transmit array beamforming algorithm that offers better LPI performance for surveillance radars exploiting phased array antennas. Reference [10] presents a cognitive LPI-based transmit beamforming algorithm utilizing a hybrid frequency diverse array (FDA) and multiple-input multiple-output (MIMO) antenna array, which minimizes the beam power at the target location while maximizing the power at the radar receiver without degrading the target detection performance. The work in [11] addresses the problem of power minimization-based robust orthogonal frequency division multiplexing (OFDM) radar waveform design for radar and communication systems in spectral coexistence. It is shown that the LPI performance of the radar system can be efficiently strengthened by exploiting the communication waveforms scattered off the target at the radar receiver.
Game theory provides a natural and efficient tool for modeling the interactions among independent players [12][13][14]. A considerable body of game-theoretic work has been developed for radar systems and has made significant progress [15][16][17][18][19][20]. In [16], the authors model the interaction between a smart target and a smart MIMO radar as a two-person zero-sum game. A non-cooperative game-based code design approach for radar networks is proposed in [17], whose purpose is to maximize the signal-to-interference-plus-noise ratio (SINR) of each radar. The work in [18] studies game-theoretic power allocation for a distributed MIMO radar network. Shi C.G. et al. in [19] propose a non-cooperative game-theoretic power allocation strategy for multistatic radar in a spectrum sharing environment. As an extension, a cooperative game-theoretic framework is proposed for power control in multistatic radar underlaying a communication system [20]. However, the above game-theoretic resource management schemes ignore the presence of a hostile intercept receiver. The Stackelberg game-theoretic model has been exploited in several studies [21][22][23][24] to analyze hierarchical competition with different optimization purposes. In [25], the authors formulate a MIMO radar and target Stackelberg game model in the presence of clutter, and various optimization criteria at the Stackelberg equilibrium (SE) are obtained for the target-dominant and radar-dominant cases, respectively. However, in view of the above studies, there are still no published works that investigate the hierarchical interactions between a hostile intercept receiver and a multistatic radar system. Therefore, the problem of Stackelberg game-theoretic LPI performance optimization for a multistatic radar system should be addressed.
Specifically, the major contributions of this study are as follows:
(1) The problem of Stackelberg game-theoretic LPI performance optimization for a multistatic radar system is investigated. Mathematically, the LPI optimization strategy can be formulated as the problem of minimizing the radiated power of each radar for a specified target detection performance. In earlier literature, although both non-cooperative games [19] and cooperative games [20] have been utilized to control the transmit power of a multistatic radar system, the hostile intercept receiver is not considered in such scenarios. Therefore, we take the hierarchical interactions between the intercept receiver and the multiple radars into consideration and formulate the LPI performance optimization between them as a single-leader multi-follower Stackelberg game. In the underlying game model, the hostile intercept receiver plays the role of the leader, which first decides the prices of the unit power resource by maximizing its own utility. The multiple radars are the followers, which subsequently compete with each other in a non-cooperative game under the prices imposed by the interceptor.
(2) We incorporate the total received power at the intercept receiver, the unit power prices, the specified SINR requirement, and the transmit power of each radar to define novel utility functions for the single leader and the multiple followers. Then, we analyze the followers' non-cooperative game model under the prices released by the leader, and the Nash equilibrium (NE) solution for the considered game model is derived. Additionally, the existence and uniqueness of the NE solution are rigorously proved.
(3) A pricing-based distributed iterative power control method is presented to solve the resulting optimization problem, which guarantees convergence to the Stackelberg equilibrium (SE) points.
(4) Numerical examples are provided to confirm the convergence of the approach to the unique SE solution and to verify the effectiveness of the proposed strategy in terms of LPI performance enhancement.
The rest of this paper is structured as follows: Section 2 provides the system model and assumptions. In Section 3, the Stackelberg game model for the LPI performance optimization is formulated. Section 4 provides the numerical results and analyses to demonstrate the proposed strategy. Finally, Section 5 concludes this paper.

System Model and Assumptions
In this work, we consider a multistatic radar system composed of $Q_R$ radars, whose purpose is to minimize the radiated power of each radar while guaranteeing a predetermined SINR requirement for target detection. As illustrated in Figure 1, the $i$th radar receives the echoes from the target due to its own emitted waveform as well as the waveforms from the other radars, both scattered off the target and through a direct path. The signals transmitted from different radars might not be orthogonal for various reasons, including the absence of radar transmission synchronization [18], which could induce significant mutual interference. It is supposed that the successive interference cancellation (SIC) technique is employed at each radar receiver to remove both direct and target-scattered communication signals from the observed signal [26]. In the considered multistatic system, each radar performs target detection autonomously and sends its local decision to the fusion center, which makes a global decision once the data from all the radars are collected. It is also assumed that each radar can determine the presence of a target by applying binary hypothesis testing to the received signal based on the generalized likelihood ratio test (GLRT) [18,20,26]. Thus, the $M$ time-domain samples of the received signals for the $i$th radar, with $H_0$ corresponding to the target absence hypothesis and $H_1$ corresponding to the target presence hypothesis, can be expressed by: denotes the Doppler steering vector of radar $i$ with respect to the target, $f_{D,i}$ is the Doppler shift associated with radar $i$, $M$ is the number of received pulses during the dwell time, and $\varphi_i$ is the predesigned waveform emitted from radar $i$. i represents the channel gain at the direction of the target, $P_i$ is the transmit power of radar $i$, $\xi_{i,j}$ stands for the cross gain between radars $i$ and $j$, and $\mathbf{n}_i$ denotes zero-mean white Gaussian noise with variance $\sigma_n^2$. It is assumed where $h^{T}_{i,i}$ represents the variance of the channel gain for the radar $i$-target-radar $i$ path, $c_{i,j} h^{T}_{i,j}$ represents the variance of the channel gain for the radar $i$-target-radar $j$ path, $c_{i,j} h^{D}_{i,j}$ represents the variance of the channel gain for the direct radar $i$-radar $j$ path, and $c_{i,j}$ denotes the cross-correlation coefficient between the $i$th radar and the $j$th radar.
The propagation gains of the corresponding paths are defined as follows:
$$h^{T}_{i,i} = \frac{G_T G_R \sigma^{RCS}_{i,i} \lambda^2}{(4\pi)^3 R_i^4}, \quad h^{T}_{i,j} = \frac{G_T G_R \sigma^{RCS}_{i,j} \lambda^2}{(4\pi)^3 R_i^2 R_j^2}, \quad h^{D}_{i,j} = \frac{G'_T G'_R \lambda^2}{(4\pi)^2 d_{i,j}^2}, \quad g^{D}_{i} = \frac{G'_T G_I \lambda^2}{(4\pi)^2 d_i^2},$$
where $h^{T}_{i,i}$ represents the propagation gain for the radar $i$-target-radar $i$ path, $h^{T}_{i,j}$ represents the propagation gain for the radar $i$-target-radar $j$ path, $h^{D}_{i,j}$ represents the propagation gain for the direct radar $i$-radar $j$ path, and $g^{D}_{i}$ represents the propagation gain for the direct radar $i$-intercept receiver path. $G_T$ is the radar main-lobe transmitting antenna gain, $G_R$ is the radar main-lobe receiving antenna gain, $G'_T$ is the radar side-lobe transmitting antenna gain, $G'_R$ is the radar side-lobe receiving antenna gain, and $G_I$ is the interceptor receiving antenna gain. $\sigma^{RCS}_{i,i}$ is the RCS of the target with respect to the $i$th radar, $\sigma^{RCS}_{i,j}$ is the RCS of the target from radar $i$ to radar $j$, $\lambda$ denotes the wavelength, $R_i$ denotes the distance from radar $i$ to the target, $R_j$ denotes the distance from radar $j$ to the target, $d_{i,j}$ denotes the distance between radar $i$ and radar $j$, and $d_i$ denotes the distance between radar $i$ and the intercept receiver. It is supposed that all the path propagation gains are fixed during the observation period.
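As a numerical illustration, the following Python sketch evaluates path gains of this kind. It assumes the standard bistatic radar range equation for the target-scattered paths and the one-way Friis form for the direct paths; the function names and all parameter values are hypothetical and are not taken from Table 1.

```python
import math


def target_path_gain(g_tx, g_rx, rcs, wavelength, r_tx, r_rx):
    """Gain of a radar-target-radar path (assumed bistatic radar range equation).

    Assumed form: G_T * G_R * sigma * lambda^2 / ((4*pi)^3 * R_tx^2 * R_rx^2).
    For the monostatic path of radar i, pass r_tx == r_rx == R_i.
    """
    return g_tx * g_rx * rcs * wavelength**2 / ((4 * math.pi) ** 3 * r_tx**2 * r_rx**2)


def direct_path_gain(g_tx, g_rx, wavelength, distance):
    """Gain of a direct one-way path (assumed Friis form)."""
    return g_tx * g_rx * wavelength**2 / ((4 * math.pi) ** 2 * distance**2)


def db_to_linear(value_db):
    """Convert a gain in dB to a linear power ratio."""
    return 10.0 ** (value_db / 10.0)


# Hypothetical parameter values, for illustration only.
wavelength = 0.03                 # metres
g_main = db_to_linear(30.0)       # main-lobe antenna gain
g_side = db_to_linear(-30.0)      # side-lobe antenna gain
g_intercept = db_to_linear(0.0)   # interceptor receiving antenna gain

h_t_ii = target_path_gain(g_main, g_main, 1.0, wavelength, 50e3, 50e3)
h_t_ij = target_path_gain(g_main, g_main, 1.0, wavelength, 50e3, 70e3)
h_d_ij = direct_path_gain(g_side, g_side, wavelength, 40e3)
g_d_i = direct_path_gain(g_side, g_intercept, wavelength, 60e3)

print(f"h_T_ii={h_t_ii:.3e}, h_T_ij={h_t_ij:.3e}, h_D_ij={h_d_ij:.3e}, g_D_i={g_d_i:.3e}")
```

Under these assumed forms, the target-path gain falls with the fourth power of range in the monostatic case, whereas the direct-path gain falls only with the square of distance, which is why even side-lobe emissions can matter at the interceptor.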

Stackelberg Game Formulation
A Stackelberg game is a strategic game composed of a single leader and multiple followers that compete with each other [21,22]. In this study, the hostile intercept receiver plays the role of the leader, which first decides the price per unit of received power from each radar by maximizing its own profit. The radars are the followers, which move subsequently and compete selfishly in a non-cooperative Nash game given the assigned prices. In the considered LPI optimization problem, the radars in the multistatic system are selfish and act solely according to their own strategies. From the hostile interceptor's point of view, those selfish moves lead to inefficient power resource utilization and degradation of the LPI performance of the multistatic radar system. In the sequel, our purpose is to formulate a Stackelberg game-theoretic power control strategy among the different radars subject to a specified SINR requirement for target detection.

Leader-Level Game
Under the Stackelberg game model, the principal aim of the intercept receiver is to maximize its own utility from selling the received power to the different radars. Mathematically, the utility of the intercept receiver can be written as: where $\boldsymbol{\phi}$ is the price vector with $\boldsymbol{\phi} = [\phi_1, \cdots, \phi_{Q_R}]^T$, $\phi_i$ is the unit power price for the $i$th radar, $u(\cdot)$ denotes the step function, and $S_{\min}$ denotes the sensitivity of the interceptor. It should be pointed out that the transmit power $P_i$ of the $i$th radar depends on the imposed price $\phi_i$ in the Stackelberg game formulation. To this end, the intercept receiver needs to find the best price $\phi_i$ to maximize its own revenue. Hence, the optimization problem of the intercept receiver can be expressed as:

Follower-Level Game
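At the multistatic radar system's side, the GLRT is adopted as the detector [26]. The probabilities of detection $p_{D,i}(\alpha_i, \gamma_i)$ and false alarm $p_{FA,i}(\alpha_i)$ can be written as:
$$p_{D,i}(\alpha_i, \gamma_i) = \left(1 + \frac{\alpha_i}{1-\alpha_i} \cdot \frac{1}{1 + M\gamma_i}\right)^{1-M}, \quad p_{FA,i}(\alpha_i) = (1-\alpha_i)^{M-1},$$
where $\alpha_i$ is the detection threshold and $M$ is the total number of received pulses during the dwell time.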
$\gamma_i$ denotes the SINR received at the $i$th radar, expressed by:
$$\gamma_i = \frac{h^{T}_{i,i} P_i}{\sum_{j=1, j\neq i}^{Q_R} c_{i,j}\left(h^{T}_{i,j} P_j + h^{D}_{i,j} P_j\right) + \sigma_n^2} = \frac{h^{T}_{i,i} P_i}{I_{-i}},$$
where $\sigma_n^2$ represents the background noise power at the $i$th radar, and $I_{-i}$ denotes the total interference and noise received at the $i$th radar, that is,
$$I_{-i} = \sum_{j=1, j\neq i}^{Q_R} c_{i,j}\left(h^{T}_{i,j} P_j + h^{D}_{i,j} P_j\right) + \sigma_n^2.$$
As mentioned before, the multiple radars act as followers that maximize their individual utilities through power control in a non-cooperative Nash game. Thus, the utility of each radar can be defined as: where $\mathbf{P}_{-i} = [P_1, \cdots, P_{i-1}, P_{i+1}, \cdots, P_{Q_R}]^T \in \mathbb{R}^{(Q_R-1)\times 1}$ denotes the vector of power allocations for all radars apart from radar $i$, $\gamma_{\min}$ is the desired SINR threshold for target detection, and $P_i^{\max}$ is the maximum transmit power of radar $i$. From (5), it can be seen that the utility function of each radar is composed of a profit part and a cost part. If a radar increases its transmit power, its target detection performance improves, and thus the profit goes up. However, as the transmit power increases, both the power consumption and the received power at the intercept receiver increase, and so does the cost [21,22]. Hence, there exists a trade-off between the profit and the cost for each radar. Mathematically, for radar $i$, the optimization problem can be expressed by: subject to : Therefore, the sub-problems P3.1 and P3.2 together form a Stackelberg game-theoretic model for the considered problem scenario. The purpose of the Stackelberg game is to find the SE points, which are studied in the following sub-section.
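To make the follower-level quantities concrete, the sketch below computes the SINR of every radar from a given power vector. It assumes that the interference-plus-noise term is $I_{-i} = \sum_{j \neq i} c_{i,j}(h^{T}_{i,j} + h^{D}_{i,j}) P_j + \sigma_n^2$, in line with the channel-gain variances stated in Section 2; the function name and the two-radar numbers are hypothetical.

```python
def sinr_per_radar(powers, h_target, h_direct, corr, noise_var):
    """Return the SINR of each radar under the assumed interference model.

    powers[i]      : transmit power P_i
    h_target[i][j] : target-path gain h^T_{i,j} (h_target[i][i] is radar i's own echo path)
    h_direct[i][j] : direct-path gain h^D_{i,j}
    corr[i][j]     : cross-correlation coefficient c_{i,j}
    noise_var      : noise variance sigma_n^2
    """
    n = len(powers)
    result = []
    for i in range(n):
        # Interference from every other radar (target-scattered plus direct), plus noise.
        interference = noise_var + sum(
            corr[i][j] * (h_target[i][j] + h_direct[i][j]) * powers[j]
            for j in range(n)
            if j != i
        )
        result.append(h_target[i][i] * powers[i] / interference)
    return result


# Hypothetical two-radar example (values chosen only for readability).
powers = [2.0, 4.0]
h_target = [[0.5, 0.1], [0.1, 0.25]]
h_direct = [[0.0, 0.1], [0.1, 0.0]]
corr = [[1.0, 0.5], [0.5, 1.0]]
gammas = sinr_per_radar(powers, h_target, h_direct, corr, noise_var=1.0)
print(gammas)
```

Raising one radar's power increases its own SINR but lowers that of every other radar through the interference term, which is the coupling that makes the follower-level problem a game rather than a set of independent optimizations.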

Analysis of the Proposed Stackelberg Game
For our formulated Stackelberg game model, the SE is defined as follows:
Definition 1. The point $(\boldsymbol{\phi}^*, \mathbf{P}^*)$ is the SE for the considered Stackelberg game model if, for any $(\boldsymbol{\phi}, \mathbf{P})$, the following conditions are satisfied [21,22]: where $\boldsymbol{\phi}^*$ denotes the optimal solution for P3.1, and $\mathbf{P}^*$ denotes the optimal solution for P3.2.
Generally speaking, the SE point for a Stackelberg game model can be achieved by obtaining the NE for the follower-level sub-game [21]. It has been shown that the different radars in the multistatic radar system compete in a non-cooperative Nash game. On the other hand, the best response function of the intercept receiver can be obtained by solving the sub-problem P3.1. To achieve the SE point, the best response functions of the different radars (followers) should be obtained first, and the intercept receiver (leader) subsequently derives its best response function based on those of the radars.
Lemma 1. For a given price of power resource $\phi_i$, the optimal solution for sub-problem P3.2 can be expressed by: where $\{x\}_a^b = \max\{\min\{x, b\}, a\}$, and $k$ denotes the iteration index.
Proof. Taking the first derivative of $u_{MRS,i}(P_i, \mathbf{P}_{-i}, \phi_i)$ with respect to $P_i$, we can obtain: Setting $\partial u_{MRS,i}(P_i, \mathbf{P}_{-i}, \phi_i) / \partial P_i = 0$ and rearranging terms, we have: thus, Since , then we can obtain: Finally, the following equation can be obtained to achieve the NE through iterations: which completes the proof.

Proposition 1.
The non-cooperative Nash game model P3.2 has at least one NE.
Proof. According to [19,26], the conditions for the existence of an NE are as follows: (i) the strategy set of $P_i$ is a nonempty, convex, and compact subset of a finite-dimensional Euclidean space; (ii) $u_{MRS,i}(P_i, \mathbf{P}_{-i}, \phi_i)$ is continuous and quasi-concave in $P_i$.
Each transmit power is bounded between 0 and $P_i^{\max}$, so the strategy set of $P_i$ is a closed interval, which is nonempty, convex, and compact. Thus, condition (i) is met. For condition (ii), we take the second-order derivative of $u_{MRS,i}(P_i, \mathbf{P}_{-i}, \phi_i)$ with respect to $P_i$ and obtain Thus, $u_{MRS,i}(P_i, \mathbf{P}_{-i}, \phi_i)$ is a concave function of $P_i$, and hence also quasi-concave. Both conditions (i) and (ii) hold. Therefore, there exists at least one NE in P3.2, which completes the NE existence proof.

Proposition 2.
The NE of the non-cooperative Nash game model P3.2 is unique.
Proof. In order to show that the NE of the game model P3.2 is unique, we need to prove that the ith radar's best response strategy function y(P i ) = γ min P i should be standard, which satisfies the following conditions [26]: (i) Positivity: For ∀i, y(P i ) > 0; (ii) Monotonicity: If P m i > P n i , then y(P m i ) > y(P n i ); (iii) Scalability: For ∀ζ > 1, ζy(P i ) > y(ζP i ).
For Condition (i), since for ∀i, it is obvious to obtain thus Condition (i) is satisfied. For Condition (ii), if P m i > P n i , we can obtain: where W 1 Since we have ∑ Thus: When γ min γ i · 2 P max i − η > W, we can obtain: y(P m i ) > y(P n i ). (25) In such a case, Condition (ii) is satisfied. For Condition (iii), where (27) Owing to ζ > 1, it is apparent that: Therefore, we can obtain: and Condition (iii) is satisfied. As a result, all the above conditions are met, which completes the NE uniqueness proof.

Distributed Approach for Calculating Stackelberg Equilibrium
Based on the above theoretical derivations and analyses, we develop a distributed iterative power control method to perform LPI performance optimization for the multistatic radar system. As mentioned above, to achieve the SE point, the sub-problem P3.2 must be solved first for a given price of power resource $\phi_i$. Then, we solve the sub-problem P3.1 for the optimal price $\phi_i^*$ with the calculated transmit powers $P_i^*$ of the multiple radars. The detailed steps of the pricing-based distributed iterative power control approach are provided in Algorithm 1, where $(\cdot)^*$ denotes the SE solution and $\Delta\phi$ is the step size.

Algorithm 1: Pricing-Based Distributed Iterative Power Control Approach
Input: Set γ min , S min , P INT < ; 10 Output the final solutions;
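The flavor of the follower-level iteration can be sketched as follows. This is not the authors' Algorithm 1: it is a minimal illustration that assumes the classical target-SINR tracking update $P_i^{(k+1)} = \{(\gamma_{\min}/\gamma_i^{(k)}) P_i^{(k)}\}_0^{P_i^{\max}}$ and deliberately omits the price-dependent term of the best response in Lemma 1 as well as the leader's price update. The function names, the simultaneous update order, and all numbers are hypothetical.

```python
def clamp(x, lower, upper):
    """Projection {x}_a^b = max(min(x, b), a) with a = lower, b = upper."""
    return max(min(x, upper), lower)


def iterate_powers(p_init, own_gain, cross_gain, noise_var, gamma_min, p_max,
                   tol=1e-12, max_iter=1000):
    """Simplified follower update: P_i <- clamp(gamma_min / gamma_i * P_i, 0, P_max_i).

    Since gamma_i = own_gain[i] * P_i / I_i, the product gamma_min / gamma_i * P_i
    equals gamma_min * I_i / own_gain[i]; that form avoids dividing by P_i = 0.
    Assumption: cross_gain[i][j] lumps c_ij * (h^T_ij + h^D_ij) into one number.
    Simplification: the pricing term of the actual best response is omitted.
    """
    p = list(p_init)
    n = len(p)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_p = []
        for i in range(n):
            interference = noise_var + sum(
                cross_gain[i][j] * p[j] for j in range(n) if j != i
            )
            new_p.append(clamp(gamma_min * interference / own_gain[i], 0.0, p_max[i]))
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p  # all radars update simultaneously from the previous iterate
        if change < tol:
            break
    return p, iteration


# Hypothetical two-radar example with weak mutual interference.
final_p, n_iter = iterate_powers(
    p_init=[5.0, 0.5],
    own_gain=[1.0, 1.0],
    cross_gain=[[0.0, 0.01], [0.01, 0.0]],
    noise_var=0.1,
    gamma_min=10.0,
    p_max=[5.0, 5.0],
)
print(final_p, n_iter)
```

In this symmetric toy setting the fixed point satisfies $P = \gamma_{\min}(0.01 P + 0.1)$, so each radar settles at the smallest power that just meets the SINR threshold, regardless of the starting point.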

Numerical Examples and Performance Evaluation
In this section, some numerical examples are presented to evaluate the performance of our proposed LPI optimization strategy. For ease of exposition, we consider a target detection scenario similar to Figure 1, which consists of a multistatic radar system, a target, and an intercept receiver. It is assumed that the multistatic radar system consists of $Q_R = 6$ widely deployed radar nodes, which are located at (50, 0) km, (25,  In each time slot, each radar receives $M = 512$ pulses. The probabilities of target detection and false alarm are set as $p_{D,i} = 0.9973$ ($\forall i$) and $p_{FA,i} = 10^{-6}$ ($\forall i$), respectively. Thus, the corresponding detection threshold $\alpha_i$ ($\forall i$) and the SINR threshold $\gamma_{\min}$ can be computed as 0.0267 and 10 dB, respectively. The other system parameters are given in Table 1.
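The quoted threshold values can be cross-checked numerically. The sketch below assumes the closed-form GLRT expressions $p_{FA} = (1-\alpha)^{M-1}$ and $p_{D} = \left(1 + \frac{\alpha}{1-\alpha}\cdot\frac{1}{1+M\gamma}\right)^{1-M}$; the function names are hypothetical.

```python
def detection_threshold(p_fa, num_pulses):
    """Invert the assumed relation p_FA = (1 - alpha)^(M - 1) for alpha."""
    return 1.0 - p_fa ** (1.0 / (num_pulses - 1))


def detection_probability(alpha, gamma, num_pulses):
    """Assumed GLRT detection probability for threshold alpha and linear SINR gamma."""
    ratio = alpha / (1.0 - alpha) / (1.0 + num_pulses * gamma)
    return (1.0 + ratio) ** (1 - num_pulses)


M = 512
alpha = detection_threshold(1e-6, M)
gamma_min = 10.0 ** (10.0 / 10.0)   # 10 dB expressed as a linear ratio
p_d = detection_probability(alpha, gamma_min, M)
print(f"alpha = {alpha:.4f}, p_D at 10 dB = {p_d:.4f}")
```

Under these assumed expressions, the computed threshold and detection probability should come out close to the values 0.0267 and 0.9973 stated above, which suggests that the settings $M = 512$, $p_{FA,i} = 10^{-6}$, and $\gamma_{\min} = 10$ dB are mutually consistent.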
To evaluate the influence of target reflectivity on the power allocation results, we consider two different target RCS models. In the first case, $\sigma^{RCS,1} = [1, 1, 1, 1, 1, 1]$ m$^2$. In the second case, $\sigma^{RCS,2} = [0.5, 20, 5, 0.25, 16, 50]$ m$^2$. In addition, to reveal the effect of system configuration on the resource allocation results, we consider two different target locations, that is, [0, 0] km and [−30, −40] km. Figure 2 depicts the convergence behavior of the transmit power of each radar in the first case with different initial power allocation values $\mathbf{P}^{(1)} = [3600, 800, 300, 1500, 600, 4000]$ W and $\mathbf{P}^{(1)} = [100, 2300, 1500, 280, 4600, 900]$ W, respectively. Similarly, Figure 3 shows the convergence behavior of the transmit power of each radar in the second case with $\mathbf{P}^{(1)} = [2500, 2500, 2500, 2500, 2500, 2500]$ W and $\mathbf{P}^{(1)} = [1000, 1500, 1000, 1500, 1000, 1500]$ W, respectively. The proposed algorithm stops when u (k+1) INT < is within the desired accuracy. It can be observed that the presented Stackelberg game-based LPI performance optimization strategy converges quickly to the unique NE points for all initial values of transmit power.
In Figures 4 and 5, we illustrate the convergence process of the transmit power allocation in both cases. Here, we define the ratio of transmit power as [21,22]. One can see from Figure 3b that the proposed LPI optimization scheme assigns more transmit power to Radar 1, Radar 2, and Radar 6, which are farther from the target. In addition, from Figure 3a, it can be noticed that more power is allocated to Radar 4 and Radar 1, for which the target RCS is much smaller than for the other radars. Hence, we can conclude that the transmit power allocation results depend on both the geometry between the target and the multistatic radar system and the target reflectivity [26].
Furthermore, as shown in Figure 3b, the proposed optimization algorithm tends to distribute less transmit power to the radars with larger path propagation gains. Figures 6 and 7 show the convergence behavior of the achieved SINR of each radar in both cases. As can be seen in these figures, with the proposed strategy, the achieved SINR of each radar converges to the predefined SINR threshold after 6-8 iterations, and thus the desired target detection requirement is satisfied.
Moreover, the convergence of the normalized utility function of the intercept receiver in both scenarios is illustrated in Figures 8 and 9. The results indicate that, as the number of iterations increases, the revenue of the intercept receiver eventually converges to the SE point in both cases.
Figures 10 and 11 show the convergence behavior of the received power at the intercept receiver. It is evident that the total power received at the intercept receiver from the multistatic radar system is below the interceptor sensitivity $S_{\min}$. This is because the proposed strategy coordinates the power radiated by the radar transmissions by updating the unit power prices. In such a case, the LPI performance of the multistatic radar system can be guaranteed by minimizing the power consumption of each radar. To conclude, the proposed Stackelberg game-theoretic LPI optimization scheme is effective in enhancing the LPI performance of the multistatic radar system.
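The interception condition discussed above can be checked with a few lines of code. The sketch assumes that the total power at the interceptor is $\sum_i g^{D}_i P_i$ and that $S_{\min}$ is quoted in dBm; the function names and all numbers are hypothetical and are not the values used in the simulations.

```python
def intercepted_power(powers, g_direct):
    """Total power at the interceptor, assumed to be sum_i g^D_i * P_i (watts)."""
    return sum(g * p for g, p in zip(g_direct, powers))


def dbm_to_watt(value_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** ((value_dbm - 30.0) / 10.0)


def is_below_sensitivity(powers, g_direct, s_min_watt):
    """True if the aggregate radar emission stays under the interceptor sensitivity."""
    return intercepted_power(powers, g_direct) < s_min_watt


# Hypothetical three-radar example.
powers = [1200.0, 800.0, 1500.0]          # transmit powers in watts
g_direct = [2.0e-17, 5.0e-17, 1.0e-17]    # radar-to-interceptor path gains
s_min = dbm_to_watt(-100.0)               # assumed sensitivity of -100 dBm

total = intercepted_power(powers, g_direct)
print(f"received = {total:.3e} W, S_min = {s_min:.3e} W, "
      f"LPI condition met: {is_below_sensitivity(powers, g_direct, s_min)}")
```

Because the received power is linear in each $P_i$, scaling all transmit powers by a common factor scales the intercepted power by the same factor, so any reduction in radiated power translates directly into margin against $S_{\min}$.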

Conclusions
This paper studies the problem of Stackelberg game-theoretic LPI performance optimization in a multistatic radar system, whose purpose is to minimize the radiated power of each radar for a specified target detection performance. A single-leader multi-follower Stackelberg game is established to model the behaviors of the intercept receiver and the multiple radars, where the hostile intercept receiver acts as the leader and the multiple radars are the followers. The Stackelberg game model jointly investigates the utility maximization of the interceptor and the multiple radars. Based on our theoretical findings on the existence and uniqueness of the NE in the game model, we present a pricing-based distributed iterative power control method, and its convergence to the NE is verified by numerical simulations. Moreover, the results of this work are useful for strengthening the LPI performance of target detection in multistatic radar systems in practice. In future work, the derivations and simulation results will be extended to the cooperative game-theoretic case.
Author Contributions: C.S., W.Q. and F.W. conceived and designed the experiments; C.S., W.Q. and F.W. performed the experiments; S.S. and J.Z. analyzed the data; C.S. wrote the paper; S.S. and J.Z. contributed to data analysis revision; S.S. and J.Z. contributed to English language correction. All authors of the article provided substantive comments.
Funding: This research has received no external funding.