Protecting Physical Layer Secret Key Generation from Active Attacks

Lightweight session key agreement schemes are expected to play a central role in building Internet of things (IoT) security in sixth-generation (6G) networks. A well-established approach deriving from the physical layer is secret key generation (SKG) from shared randomness (in the form of wireless fading coefficients). However, although practical, SKG schemes have been shown to be vulnerable to active attacks during the initial “advantage distillation” phase, in which the legitimate users obtain estimates of the fading coefficients. In fact, by injecting carefully designed signals during this phase, a man-in-the-middle (MiM) attacker could manipulate and control part of the reconciled bits and thus render SKG vulnerable to brute-force attacks. Alternatively, a denial of service attack can be mounted by a reactive jammer. In this paper, we investigate the impact of injection and jamming attacks during advantage distillation in a multiple-input–multiple-output (MIMO) system. First, we show that a MiM attack can be mounted as long as the attacker has one extra antenna with respect to the legitimate users, and we propose a pilot randomization scheme that allows the legitimate users to reduce the injection attack to a less harmful jamming attack. Second, by taking a game-theoretic approach, we evaluate the optimal strategies available to the legitimate users in the presence of reactive jammers.


Introduction
The increasing interest in physical layer security (PLS) has been stimulated by many practical needs, particularly in the context of Internet of things (IoT) applications [1]. For example, in [2,3], secret key generation (SKG) from wireless fading coefficients was analyzed, showing its potential as a lightweight alternative to standard security schemes. In fact, the SKG scheme allows two legitimate parties (Alice and Bob) to extract secret keys on the fly, without the need for significant infrastructure. Furthermore, it has been information-theoretically proven that by following the SKG process, Alice and Bob can extract a shared secret over unauthenticated channels [4][5][6]. Building on that, numerous practical experiments have demonstrated the feasibility of the scheme [7,8]. Moreover, it has been shown that SKG can be combined with authenticated encryption (AE) schemes [9,10] in order to overcome trivial man-in-the-middle (MiM) attacks, similar to the known MiM attacks on unauthenticated Diffie-Hellman schemes.
The success of the SKG scheme relies on the reciprocity and variability of wireless channels. On the one hand, the reciprocity property allows both Alice and Bob to measure an identical channel impulse response during the coherence time of the channel [11][12][13], while on the other hand, the variability property of the wireless channel directly affects the key generation rates [14][15][16][17].
However, the exchange of pilots during the channel estimation phase between Alice and Bob could allow an adversary (Mallory) to estimate the channels Alice-Mallory and Bob-Mallory. Having this information, Mallory could inject suitably precoded signals during the SKG process and could potentially control a significant part of the reconciled sequence while remaining undetected. To overcome this, instead of transmitting publicly known pilot signals, we propose a two-way randomized pilot transmission between Alice and Bob. An earlier work studied this problem for an orthogonal frequency-division multiplexing (OFDM) system [18]. Here, we investigate the scenario of a multiple-input multiple-output (MIMO) system. We prove that if Mallory has one extra antenna with respect to Alice and Bob, she could always launch an injection attack. Next, through theoretical analysis, we show that the proposed pilot randomization scheme successfully reduces an injection attack to a less harmful uncorrelated jamming attack, ensuring that the extracted key bits are secret from both active and passive adversaries.
In the second part of this paper, we delve deeper into jamming attacks over MIMO systems. In particular, we focus on denial of service (DoS) in the form of reactive jamming. We derive the optimal strategies for both the attacker and the legitimate users. Through numerical evaluation, we demonstrate that, depending on their capabilities, reactive jammers could provoke legitimate users to transmit at full power in order to achieve a positive SKG rate.

System Model
In this work, we consider a time-division duplex MIMO (TDD-MIMO) system consisting of two legitimate nodes, Alice and Bob, and an active adversary, Mallory. Alice and Bob generate secret keys using the wireless SKG procedure, while Mallory performs an injection attack on the MIMO links Mallory-Alice and Mallory-Bob. The numbers of antennas at Alice, $N_A$, and at Bob, $N_B$, are assumed to be equal, i.e., $N_A = N_B = N$. To better illustrate the considered scenario, we give a brief overview of the SKG procedure and show how an injection attack could affect the process.

Secret Key Generation from Fading Coefficients
As illustrated in Figure 1, the standard SKG procedure consists of three phases [19]: (1) advantage distillation: the legitimate nodes exchange pilot signals, each using $N$ transmit and $N$ receive antenna elements, in order to estimate their reciprocal channel state information (CSI). The resulting observations at Alice and Bob are
$$z_A = H x + n_A,$$
$$z_B = H^T x + n_B,$$
where $H$ represents the channel matrix of size $N_r \times N_t = N \times N$, such that its $(i, j)$ entry represents the channel linking the $i$-th receive antenna and the $j$-th transmit antenna, $z$ represents the received vector of length $N_r$, $x$ denotes the transmitted vector consisting of $N_t = N_r = N$ elements, and $n_A$ and $n_B$ are the received noise vectors at Alice and Bob, respectively, each of length $N_r$. Note that, due to the reciprocity of the wireless channel, Alice and Bob observe $H$ and $H^T$, respectively. To conclude this step, $z_A$ and $z_B$ are passed through suitable quantizers [20], generating binary vectors $r_A$ and $r_B$, respectively; (2) information reconciliation: discrepancies in the local quantizer outputs, caused by imperfect channel estimation, are reconciled through a public exchange of helper data $s_A$ (see Figure 1), e.g., by using Slepian-Wolf reconciliation techniques [10,21]; (3) privacy amplification: the legitimate nodes apply universal hash functions to the reconciled information $r_A$ and obtain the key $k$. This step ensures that the generated key $k$ is uniformly distributed and completely unpredictable by an adversary.
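The three phases can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: both nodes are modeled as holding a noisy estimate of the full matrix $H$ (Bob transposes his estimate of $H^T$), the quantizer is a one-bit sign quantizer, reconciliation is omitted, and SHA-256 merely stands in for a universal hash function. All names (`quantize`, `cgauss`, `noise_std`) are hypothetical.

```python
import hashlib
import numpy as np

rng = np.random.default_rng(2)
N, noise_std = 4, 0.01  # antennas per node; estimation noise level (assumed)

def cgauss(shape, std=1.0):
    """Circularly symmetric complex Gaussian samples with variance std**2."""
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def quantize(H_est):
    """Assumed one-bit quantizer: sign of the real and imaginary parts."""
    flat = H_est.ravel()
    return np.concatenate([flat.real > 0, flat.imag > 0]).astype(np.uint8)

# (1) Advantage distillation: reciprocal channel, independent estimation noise.
H = cgauss((N, N))
H_alice = H + cgauss((N, N), noise_std)        # Alice observes H
H_bob = (H.T + cgauss((N, N), noise_std)).T    # Bob observes H^T and transposes

r_A, r_B = quantize(H_alice), quantize(H_bob)
mismatch_rate = float(np.mean(r_A != r_B))     # what reconciliation must correct

# (2) Information reconciliation is omitted in this sketch.
# (3) Privacy amplification: SHA-256 used as a stand-in for a universal hash.
key = hashlib.sha256(np.packbits(r_A).tobytes()).hexdigest()
print(mismatch_rate, key[:16])
```

At this noise level the two bit strings agree in almost every position; the few residual mismatches are what the reconciliation phase is meant to remove before hashing.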
During the process above, an eavesdropping adversary could obtain channel observations, given as follows:
$$z_{AM} = H_{AM} x + n_{AM},$$
$$z_{BM} = H_{BM} x + n_{BM},$$
where the channel matrices in the links Alice-Mallory and Bob-Mallory are denoted by $H_{AM}$ and $H_{BM}$, respectively, while the received noise vectors are denoted by $n_{AM}$ and $n_{BM}$. The SKG capacity between Alice and Bob is then expressed as the conditional mutual information between the observations of Alice and Bob, given those of Mallory.

Injection Attacks during SKG
One of the most critical threats to the SKG model in Figure 1 is a MiM attack in the form of signal injection [11,22,23]. The main components of the injection attack are captured in Figure 2. While the legitimate nodes Alice and Bob exchange pilot signals during the advantage distillation phase, Mallory injects a signal $p$. Based on the results in [22], we assume that Mallory has perfect knowledge of the channel matrices of the MIMO links Mallory-Alice, $H_{MA} = H_{AM}^T$, and Mallory-Bob, $H_{MB} = H_{BM}^T$. This is a reasonable assumption since Mallory can estimate these channels while Alice and Bob exchange pilot signals, as long as the channel's coherence time is respected (a plausible scenario in slow-fading, low-mobility environments). Finally, Mallory chooses the precoding vector $p$ such that the same signal is "injected" at both Alice and Bob, i.e., $H_{MA} p = H_{MB} p$.

Analysis of Injection Attacks in MIMO SKG
In this section, we first prove that if Mallory has one extra antenna with respect to Alice and Bob, she could always launch an injection attack. Next, we propose a pilot randomization scheme and show that, when it is employed, the legitimate users can reduce the attack to a jamming attack.

Lemma 1. While Alice and Bob perform advantage distillation using $N$ antennas, Mallory could always launch an injection attack, as long as she has at least $N + 1$ antennas.
Proof. The precoding vector of Mallory, $p$, of size $(N + 1) \times 1$, is represented as
$$p = [p_1, p_2, \ldots, p_{N+1}]^T.$$
The channel matrices $H_{MA}$ and $H_{MB}$ have size $N \times (N + 1)$, with entries denoted here by $h^{MA}_{ij}$ and $h^{MB}_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, N + 1$. Next, we can represent the injection condition as
$$H_{MA}\, p = H_{MB}\, p, \tag{10}$$
where the $(i, j)$ entry of $H_M = H_{MA} - H_{MB}$ is equal to
$$h_{ij} = h^{MA}_{ij} - h^{MB}_{ij}. \tag{11}$$
Given the above, Equation (10) can be rewritten as $H_M p = 0$, where $H_M$ is given in Equation (11). The equality $H_M p = 0$ is equivalent to the following linear system of equations:
$$\sum_{j=1}^{N+1} h_{ij}\, p_j = 0, \quad i = 1, \ldots, N. \tag{12}$$
Due to the fact that Mallory has an additional degree of freedom (one extra antenna) compared to Alice and Bob, she can treat one of the elements in $p$ as a constant and solve for the others in terms of it. Based on this, we let $p_{N+1}$ be a constant and rewrite the system in (12) as
$$\sum_{j=1}^{N} h_{ij}\, p_j = -h_{i,N+1}\, p_{N+1}, \quad i = 1, \ldots, N. \tag{13}$$
The system of equations in (13) can be represented in matrix form as $A \tilde{p} = b$, where $A$ is the $N \times N$ matrix formed by the first $N$ columns of $H_M$, $\tilde{p} = [p_1, \ldots, p_N]^T$, and $b = -p_{N+1} [h_{1,N+1}, \ldots, h_{N,N+1}]^T$. Finally, $\det(A) \neq 0$ almost surely (i.e., under the assumptions in Section 2, $\det(A)$ is a continuous random variable, hence $\det(A) \neq 0$ with probability 1), and therefore the system's solution is unique and given by
$$\tilde{p} = A^{-1} b. \tag{14}$$
Note that if Mallory has the same number of antennas as Alice and Bob, she does not have the extra degree of freedom, and the transition from the system in Equation (12) to the system in Equation (13) is not possible. However, as shown here, if Mallory has one extra antenna with respect to Alice and Bob, she can treat one of the elements in $p$ as a constant, which allows her to find the rest of the elements as in Equation (14). This concludes the proof of Lemma 1.
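The construction in the proof can be checked numerically. The sketch below assumes i.i.d. complex Gaussian channel entries and fixes the last precoder element to 1; the helper names (`injection_precoder`, `complex_gaussian`) are hypothetical and serve only as an illustration of the linear-algebra step, not as the authors' implementation.

```python
import numpy as np

def injection_precoder(H_MA, H_MB, p_last=1.0 + 0.0j):
    """Return p of length N+1 such that H_MA @ p equals H_MB @ p.

    H_MA and H_MB are N x (N+1) matrices; the last entry of p is fixed
    to p_last and the remaining N entries solve an N x N linear system.
    """
    H_M = H_MA - H_MB               # difference matrix, N x (N+1)
    A = H_M[:, :-1]                 # first N columns, N x N
    b = -p_last * H_M[:, -1]        # right-hand side from the fixed entry
    p_head = np.linalg.solve(A, b)  # unique when det(A) != 0
    return np.append(p_head, p_last)

def complex_gaussian(rng, shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
N = 4                                   # antennas at Alice and Bob (assumed value)
H_MA = complex_gaussian(rng, (N, N + 1))
H_MB = complex_gaussian(rng, (N, N + 1))

p = injection_precoder(H_MA, H_MB)
w_A = H_MA @ p                          # injected signal seen by Alice
w_B = H_MB @ p                          # injected signal seen by Bob
print(np.allclose(w_A, w_B))
```

With only $N$ antennas at Mallory, the difference matrix would be square and almost surely invertible, so the only solution of the homogeneous system would be the all-zero vector.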
Based on Lemma 1, the observations of Alice and Bob are now given by
$$z_A = H x + w + n_A,$$
$$z_B = H^T x + w + n_B,$$
where $w = H_{MA} p = H_{MB} p$ denotes the injected signals observed at Alice and Bob, which are identical due to the precoding vector $p$. By injecting $w$, Mallory controls the secret key rate, which is now upper bounded by [18,24] $L \leq I(z_A, z_B; w)$.

Pilot Randomization as a Countermeasure to Injection Attacks
It has been shown that a countermeasure to injection attacks can be built by randomizing the pilot sequence exchanged between Alice and Bob [18,23,24]. In this work, we propose a MIMO pilot randomization scheme in which pilots are drawn from a (scaled) QPSK modulation. Specifically, Alice and Bob do not transmit the same pilot signal x; instead, they transmit independent, random pilot signals x and y drawn from i.
Finally, to generate shared randomness, Alice and Bob multiply $z_A$ and $z_B$ by their own randomized pilot signals, i.e., $\tilde{z}_A = x^T z_A$ and $\tilde{z}_B = y^T z_B$ (unobservable by Mallory). Given this, the modified observations are expressed as
$$\tilde{z}_A = x^T H y + x^T w + x^T n_A,$$
$$\tilde{z}_B = y^T H^T x + y^T w + y^T n_B,$$
where the shared randomness between Alice and Bob is now represented by $x^T H y = y^T H^T x$. Furthermore, the independence of $x$ and $y$ ensures the following:
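A short numerical sketch can illustrate the two properties used here: the scalar $x^T H y$ is the same at both ends, while the injected terms $x^T w$ and $y^T w$ decorrelate when the pilots are independent. The unit-modulus QPSK scaling, the i.i.d. pilot entries, the Gaussian model for $w$, and the helper names are assumptions made only for this illustration.

```python
import numpy as np

def qpsk(rng, shape):
    """i.i.d. QPSK symbols (+-1 +-1j)/sqrt(2); the scaling is an assumption."""
    re = rng.choice([-1.0, 1.0], size=shape)
    im = rng.choice([-1.0, 1.0], size=shape)
    return (re + 1j * im) / np.sqrt(2)

def normalized_corr(a, b):
    """Magnitude of the normalized sample correlation between a and b."""
    return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(1)
N, trials = 4, 20000

# Shared randomness: x^T H y at Alice equals y^T H^T x at Bob.
H = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
x, y = qpsk(rng, N), qpsk(rng, N)
shared_A = x @ H @ y
shared_B = y @ H.T @ x

# Injected signal w (fixed here) seen through independent random pilots.
w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
X = qpsk(rng, (trials, N))   # Alice's pilot in each trial
Y = qpsk(rng, (trials, N))   # Bob's independent pilot in each trial
corr_random = normalized_corr(X @ w, Y @ w)  # independent pilots
corr_same = normalized_corr(X @ w, X @ w)    # identical pilots at both ends
print(np.isclose(shared_A, shared_B), corr_random, corr_same)
```

In this toy setting the correlation between the two injected terms is close to zero for independent pilots, whereas identical pilots leave the injected contribution fully correlated, which is the sense in which the injection attack degrades to uncorrelated jamming.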

Jamming Attacks on SKG
In this section, we focus on reactive jamming attacks in SKG systems and examine the scenario in which Mallory reactively jams Alice (note that the scenario in which Mallory jams Bob is identical). A reactive jamming attack is an intelligent approach in which the jammer initially senses the spectrum and jams only if a transmission is detected. Because they are difficult to detect, reactive jamming attacks are considered a great threat to legitimate transmissions [25,26]. Next, we assume that Alice and Bob perform SKG in a TDD-MIMO system with a spatially uncorrelated channel. It has been proven that the optimal power strategy for Alice and Bob in this scenario is to employ equal power distribution [27], which is also assumed for this study, i.e., the same power $p$ is allocated to every transmit antenna. In the following, we assume that Mallory has $N$ antennas and that, as a reactive jammer, she senses the spectrum and jams the link Mallory-Alice only if she detects a power greater than a certain threshold $p_{th}$. Thus, instead of considering Mallory's power allocation matrix, we work with the sum jamming power for all antennas, which can be represented as a power allocation vector $\gamma = (\gamma_1, \ldots, \gamma_N)$. By denoting the available jamming power by $N\Gamma$, the following short-term power constraint is considered: $\sum_{i=1}^{N} \gamma_i \leq N\Gamma$, with $\gamma_i \geq 0$. Assuming that $H$ is uncorrelated with $H_{AM}$ and $H_{BM}$, and that all channel matrices have independent and identically distributed elements that are drawn from circularly symmetric zero-mean Gaussian distributions of variances $\sigma^2$ and $\sigma_J^2$, respectively, then the SKG capacity can be expressed as [27]

Optimal Power Allocation Strategies
In the following, we take a game-theoretic approach in order to evaluate the optimal strategies of Alice, Bob, and Mallory. Throughout, Alice and Bob's common objective is to maximize $C_K(p, \gamma)$ with respect to (w.r.t.) $p$, while Mallory wants to minimize $C_K(p, \gamma)$ w.r.t. $\gamma$. Due to the opposing objectives, we formulate a noncooperative zero-sum game, which models the strategic interaction between the legitimate users and the jammer: $G = (\{L, J\}, \{A_L, A_J\}, C_K(p, \gamma))$. The game $G$ has three components: (i) there are two players, namely, L, denoting the legitimate users (Alice and Bob act as a single player), and J, being the jammer (Mallory); (ii) player L has a set of possible actions $A_L = [0, P]$, while player J's set of actions, $A_J$, consists of the jamming power vectors $\gamma$ that satisfy the short-term power constraint above; (iii) $C_K(p, \gamma)$ denotes the payoff function of player L. Given the fact that player J is a reactive jammer, i.e., she first observes the transmit power of player L and subsequently chooses a strategy, we study a hierarchical game in which player L is the leader and player J is the follower. In this game, the solution is the Stackelberg equilibrium (SE), rather than the Nash equilibrium, and it is defined as a strategy profile $(p^{SE}, \gamma^{SE})$ where player L chooses their optimal strategy first, by anticipating the strategic reaction of player J (i.e., its best response). This is expressed as
$$p^{SE} \triangleq \arg\max_{p \in A_L} C_K(p, \gamma^*(p)), \quad \text{and} \quad \gamma^{SE} \triangleq \gamma^*(p^{SE}), \tag{27}$$
where $\gamma^*(p)$ denotes the best response (BR) of player J to any strategy $p \in A_L$ chosen by player L, and it is defined as follows:
$$\gamma^*(p) \triangleq \arg\min_{\gamma \in A_J} C_K(p, \gamma). \tag{28}$$
Finally, based on the detection capabilities of player J, two scenarios are considered: (i) when the detection threshold $p_{th}$ is fixed (defined by the sensing capability of Mallory's receiver); (ii) when $p_{th}$ is part of player J's strategy and could vary.

Stackelberg Equilibrium with Fixed Detection Threshold
In this section, we evaluate the SE when player J's detection threshold $p_{th}$ is predefined and constant. Note that the case $P \leq p_{th}$ is trivial, as $\gamma^{SE} = (0, \ldots, 0)$ and the legitimate users will optimally use their maximum available power, i.e., $p^{SE} = P$. Indeed, due to the poorly chosen threshold $p_{th}$ or the low sensing capabilities of Mallory, the legitimate transmission will not be detected and therefore will not be jammed. In the following, we assume that $P > p_{th}$.

Lemma 2. The BR of player J for any $p \in A_L$ chosen by player L, defined in (28), is the uniform power allocation, given as
$$\gamma^*(p) = \begin{cases} (\Gamma, \ldots, \Gamma), & \text{if } p > p_{th}, \\ (0, \ldots, 0), & \text{if } p \leq p_{th}. \end{cases} \tag{29}$$
Proof. Note that $C_K(p, \gamma)$ is a monotonically decreasing convex function w.r.t. $\gamma_i$, $i = 1, \ldots, N$, for any $p > 0$. Based on the principles of convexity, in order to minimize $C_K$, Mallory has to transmit with full power from all antennas whenever she jams. The detailed proof can be found in [18].
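Based on the result from Lemma 2, the SKG rate can take one of the following two forms: $C_K(p, \Gamma)$ if $p > p_{th}$ (the transmission is detected and jammed with full power), or $C_K(p, 0)$ if $p \leq p_{th}$ (the jammer remains silent), which simplifies the players' options.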
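Proof. Given the BR of player J defined in (29), the legitimate users want to identify the optimal $p \in A_L$ that maximizes $C_K(p, \gamma^*(p))$. Given the fact that $C_K(p, \gamma)$ is monotonically increasing in $p$ for fixed $\gamma$, two cases are distinguished: (a) $p \in [0, p_{th}]$, (b) $p \in (p_{th}, P]$. The optimal $p$ in each case is given by (a) $p = p_{th}$, with payoff $C_K(p_{th}, 0)$, and (b) $p = P$, with payoff $C_K(P, \Gamma)$.

From (a) and (b), it can be concluded that the overall solution is
$$p^{SE} = \begin{cases} P, & \text{if } C_K(P, \Gamma) \geq C_K(p_{th}, 0), \\ p_{th}, & \text{otherwise}, \end{cases}$$
where, in the case of equality, both choices yield the same payoff.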
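The comparison between cases (a) and (b) can be sketched in a few lines of Python. Since the closed-form SKG capacity is not reproduced here, the payoff below is a hypothetical stand-in that only shares its monotonicity (increasing in $p$, decreasing in the jamming power); it is not the expression from [27], and the function names are illustrative.

```python
import math

def skg_rate_placeholder(p, gamma):
    """Hypothetical payoff, NOT the capacity from [27].

    It only mimics the assumed monotonicity: increasing in the transmit
    power p and decreasing in the per-antenna jamming power gamma.
    """
    return math.log2(1.0 + p / (1.0 + gamma))

def jammer_best_response(p, p_th, Gamma):
    """Reactive jammer: full uniform power if p exceeds p_th, else silent."""
    return Gamma if p > p_th else 0.0

def stackelberg_power(P, p_th, Gamma, rate=skg_rate_placeholder):
    """Leader's equilibrium power for a fixed detection threshold p_th."""
    if P <= p_th:
        return P                    # never detected, so full power is optimal
    jammed = rate(P, Gamma)         # case (b): transmit at P and get jammed
    silent = rate(p_th, 0.0)        # case (a): stay at the threshold, no jamming
    return P if jammed >= silent else p_th

print(stackelberg_power(10.0, 2.0, 3.0))  # low threshold
print(stackelberg_power(10.0, 5.0, 3.0))  # higher threshold
```

Under this stand-in payoff, a low threshold makes full-power transmission preferable despite the jamming, while a higher threshold makes it better to stay just undetected; the actual switching point depends on the true capacity expression.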
To simplify the above possibilities, we focus on the case in which the utility $C_K(P, \Gamma)$, i.e., being detected and jammed, equals the utility obtained when player L transmits at the threshold $p_{th}$ (player J is silent), i.e., $C_K(P, \Gamma) = C_K(p_{th}, 0)$. Using this equality and substituting appropriately into (25), we obtain a quadratic equation in $P$.

Stackelberg Equilibrium with Strategic p th
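Finally, we investigate the case in which Mallory can optimally adjust $p_{th}$ and show how her choice impacts Alice's and Bob's strategies. Allowing $p_{th}$ to vary modifies the game under study as follows: $\hat{G} = (\{L, J\}, \{A_L, \hat{A}_J(p)\}, C_K(p, \gamma, p_{th}))$, where $\hat{A}_J(p)$ is the jammer's action set, which now contains pairs $(\gamma, p_{th})$ of a jamming power vector and a detection threshold. The BR of the jammer can then be defined as
$$(\gamma^*(p), p_{th}^*(p)) \triangleq \arg\min_{(\gamma, p_{th}) \in \hat{A}_J(p)} C_K(p, \gamma, p_{th}).$$

Lemma 3. Mallory's BR in this scenario is a set of strategies as follows:
$$\gamma^*(p) = (\Gamma, \ldots, \Gamma), \quad p_{th}^*(p) = \epsilon, \ \forall \epsilon < p. \tag{35}$$
Proof. The problem that the jammer wants to solve is $\min_{(\gamma, p_{th}) \in \hat{A}_J(p)} C_K(p, \gamma, p_{th})$, which can be split as follows:
$$\min_{p_{th} \geq 0} \ \min_{\gamma} \ C_K(p, \gamma, p_{th}).$$
The solution of the inner minimization is known from (29). For the outer problem, we have to find the optimal $p_{th} \geq 0$ that minimizes $C_K(p, \gamma^*(p), p_{th})$. Given that
$$C_K(p, \gamma^*(p), p_{th}) = \begin{cases} C_K(p, \Gamma, p_{th}), & \text{if } p > p_{th}, \\ C_K(p, 0, p_{th}), & \text{if } p \leq p_{th}, \end{cases}$$
and that $C_K(p, \Gamma, p_{th}) < C_K(p, 0, p_{th})$, player J can optimally choose any $p_{th}$ such that $p_{th} = \epsilon$, $\forall \epsilon < p$. This allows the jammer to detect any ongoing transmission and to perform a jamming attack.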
Proof. Given Mallory's BR, we evaluate the SE of the game $\hat{G}$. The definition of $p^{SE}$ is given as follows:
$$p^{SE} \triangleq \arg\max_{p \in A_L} C_K(p, \gamma^*(p), p_{th}^*(p)).$$
Since Mallory will act as in (35), we have
$$p^{SE} = \arg\max_{p \in A_L} C_K(p, \Gamma, \epsilon),$$
and the fact that $C_K(p, \Gamma, \epsilon)$ is monotonically increasing with $p$ results in $p^{SE} = P$.

Figure 4 illustrates the achievable SKG rate when $p_{th}$ is part of player J's strategy. As in Figure 3, the parameters are chosen as $\Gamma = 3$, $N = 10$, and $\sigma_J^2 = 1$. It can be seen that, due to a strategically chosen threshold from player J, the legitimate users have no other choice but to transmit at full power, $p = P = p^{SE}$. In fact, if the legitimate users deviate from the SE strategy and transmit with low power $p = p_{th}$, player J could successfully disrupt their SKG process and decrease their achievable SKG rate by up to 97%.

Conclusions
In this study, injection and reactive jamming attacks were analyzed in MIMO SKG systems. With respect to injection attacks, the study demonstrated that a trivial advantage in the form of one extra antenna allows a MiM to mount such an attack. As a countermeasure, we showed that a pilot randomization scheme can successfully reduce injection attacks to jamming attacks. With respect to jamming attacks, using a game-theoretic approach, we showed that an intelligent reactive jammer should optimally jam with full power when a transmission is sensed. Finally, by strategically choosing her jamming threshold, i.e., just below the power level used by the legitimate users, Mallory could perform a much more effective attack. In fact, our theoretical analysis suggests that in this case, Alice and Bob have no choice but to use their full power available for SKG. An important topic for further research in this area is an examination of these initial findings in practical scenarios.