Detecting a Photon-Number Splitting Attack in Decoy-State Measurement-Device-Independent Quantum Key Distribution via Statistical Hypothesis Testing

Measurement-device-independent quantum key distribution (MDI-QKD) is innately immune to all detection-side attacks. Because of technological limitations, most MDI-QKD protocols use weak coherent photon sources (WCPs), which are vulnerable to a photon-number splitting (PNS) attack by an eavesdropper. Existing MDI-QKD protocols therefore also rely on the decoy-state method, which resists PNS attacks very well. However, existing decoy-state methods do not test whether a PNS attack is actually present, and the secure keys are generated only from single-photon components. In fact, multiphoton pulses can also form secure keys if we can confirm that there is no PNS attack. For simplicity, we analyze only the weaker version of a PNS attack, in which a legitimate user's pulse count rate changes significantly after the attack. In this paper, under the null hypothesis of no PNS attack, we first decide whether there is an attack by statistical hypothesis testing: we retrieve information that existing decoy-state MDI-QKD protocols leave unused, extract a normally distributed statistic, and provide a detection method together with its Type I error probability. If the test indicates an attack, we use the existing decoy-state method to estimate the secure key rate. Otherwise, all pulses with the same basis leading to successful Bell state measurement (BSM) events, including both single-photon and multiphoton pulses, can be used to generate secure keys, and we give the formula of the secure key rate in this case. Finally, applying the method to experimental data from the literature at a significance level of 5% yields a result consistent with the experiment, which supports the correctness of our method.


Introduction
Quantum key distribution (QKD) [1][2][3][4][5][6] is a technique that allows two remote parties, Alice and Bob, to share unconditionally secure keys. The unconditional security of the keys is guaranteed by the laws of quantum mechanics [7][8][9][10]. The first QKD protocol, BB84, was proposed by Bennett and Brassard [1] and assumes a perfect single-photon source and perfect detectors. In practice, however, there is always a large gap between the ideal and reality. Because of device imperfections, implementations of QKD are exposed to attacks on both the source side and the detection side. On the one hand, perfect single-photon sources are not currently available, and phase-randomized weak coherent photon sources (WCPs) are often used instead. Because a pulse emitted by a WCP may contain more than one photon, an eavesdropper, Eve, can launch a photon-number splitting (PNS) attack [11][12][13][14][15]. Specifically, the weaker version of a PNS attack is one in which Alice's or Bob's pulse count rate changes significantly after the attack [11][12][13][14], whereas in the stronger version both Alice's and Bob's pulse count rates remain unchanged after the attack [15]. Fortunately, the decoy-state method [16][17][18] proposed later can resist PNS attacks very well.
On the other hand, because of the low detection efficiency of the detectors, Eve can launch attacks against the detectors. Compared with source attacks, there are more attacks on the detection side, such as the detector blinding attack [19,20], dead time attack [21], faked state attack [22,23], and time shift attack [24]. Device-independent quantum key distribution (DI-QKD) [25,26] has been proposed to resist all attacks on devices. However, this protocol is highly impractical because it requires close to unity detection efficiency. In 2012, Lo et al. [27] proposed measurement-device-independent quantum key distribution (MDI-QKD), which is also known as the time-inversion version of the EPR protocol [28]. In MDI-QKD, Alice and Bob do not need to perform measurement operations, so it is innately immune to all detection attacks. MDI-QKD combined with the decoy-state method can resist both source attacks and detection attacks; thus, decoy-state MDI-QKD [29][30][31] is one of the most promising QKD protocols for providing unconditionally secure keys in practical applications.
However, the secure key rate of the existing decoy-state MDI-QKD is not high [32,33]. The decoy-state method defeats the PNS attack by providing a more accurate way to determine the secure key rate. More specifically, the existing decoy-state method tightly estimates the lower bound of the gain and the upper bound of the quantum bit error rate (QBER) of single-photon signals, and the secure key rate is then calculated by the GLLP formula [34]. In essence, the existing decoy-state method does not test for the existence of a PNS attack, and the secure keys are generated only from single-photon components [35]. However, if we can determine that there is no PNS attack on the channel, multiphoton pulses can also generate secure keys. In that case, using the existing methods to estimate the secure key rate wastes the keys that could be generated from multiphoton pulses and reduces the efficiency. For simplicity, we analyze only the weaker version of a PNS attack, in which the legitimate user's pulse count rate changes significantly after the attack.
In this work, under the null hypothesis $H_0$ of no PNS attack, we first retrieve the information left unused by the existing decoy-state MDI-QKD, extract a normally distributed statistic, and provide a new method to determine whether there is a PNS attack through statistical hypothesis testing. If the test indicates an attack, the keys can only be generated from single-photon pulses, and the secure key rate is estimated by the existing decoy-state method. Otherwise, all pulses with the same basis leading to a successful Bell state measurement (BSM) event, including both single-photon and multiphoton pulses, can be used to generate keys, and we give the formula of the secure key rate in this case. Furthermore, we apply our method to the real experimental data in [36], and the result at a significance level of 5% is consistent with the experiment.
The structure of this paper is organized as follows. In Section 2, we briefly review the typical decoy-state MDI-QKD and related notations. In Section 3, we describe our method for detecting the PNS attack in the decoy-state MDI-QKD via statistical hypothesis testing in detail. In Section 4, the correctness of our method is verified with the real experimental data from the existing literature. Finally, we discuss and draw conclusions in Section 5.

Three-Intensity Decoy-State MDI-QKD
In this paper, we adopt a typical decoy-state MDI-QKD with polarization encoding [36], which mainly consists of three steps.
(i) Alice generates phase-randomized pulses from WCPs and randomly selects the basis $W \in \{Z, X\}$ with $P_Z = P_X = 1/2$, where $P_Z$ and $P_X$ are the probabilities of choosing the Z basis and the X basis, respectively. Alice then uses an intensity modulator to set each pulse to one of three intensities and sends it to Charlie, who is located in the middle. These three intensities are the signal-state intensity $\mu_2$, the decoy-state intensity $\mu_1$, and the vacuum-state intensity $\mu_0$. The corresponding emission probabilities are $P_{\mu_2}$, $P_{\mu_1}$, and $P_{\mu_0}$, with $P_{\mu_2} + P_{\mu_1} + P_{\mu_0} = 1$. At the same time, Bob performs the same procedure as Alice; the intensities of Bob's pulses are denoted $\nu_2$, $\nu_1$, and $\nu_0$ for the signal state, decoy state, and vacuum state, respectively. The corresponding emission probabilities are $P_{\nu_2}$, $P_{\nu_1}$, and $P_{\nu_0}$, with $P_{\nu_2} + P_{\nu_1} + P_{\nu_0} = 1$.
(ii) The pulses from Alice and Bob interfere when they reach Charlie. Then Charlie performs a Bell state measurement (BSM) on the interference outcomes and announces the measurement results to Alice and Bob.
(iii) Alice and Bob compare their bases, and determine the secure keys through Charlie's measurement results. Specifically, if Alice and Bob choose the same basis and Charlie has a successful BSM event at the same time, then this part of the pulses can generate keys. It is important to emphasize that the secure keys are only generated from the signal state with Z basis, and the others are used for parameter estimation.
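The sifting rule in step (iii) can be sketched in a few lines of Python. This is only an illustration: the `PulsePair` record, the `classify` function, and the label strings are hypothetical names, and treating basis-mismatched or failed-BSM events as discarded is an assumption that the text above implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulsePair:
    """One pair of pulses sent by Alice and Bob (hypothetical record layout)."""
    alice_basis: str      # "Z" or "X"
    bob_basis: str        # "Z" or "X"
    alice_intensity: int  # index k: 2 = signal, 1 = decoy, 0 = vacuum
    bob_intensity: int    # index l: 2 = signal, 1 = decoy, 0 = vacuum
    bsm_success: bool     # Charlie announced a successful BSM event


def classify(pair: PulsePair) -> str:
    """Label a pulse pair according to step (iii)."""
    # Assumption: events with a failed BSM or mismatched bases are discarded.
    if not pair.bsm_success or pair.alice_basis != pair.bob_basis:
        return "discard"
    # Keys come only from signal states in the Z basis.
    if (pair.alice_basis == "Z"
            and pair.alice_intensity == 2
            and pair.bob_intensity == 2):
        return "key"
    # Every other same-basis success is used for parameter estimation.
    return "parameter_estimation"


if __name__ == "__main__":
    print(classify(PulsePair("Z", "Z", 2, 2, True)))
    print(classify(PulsePair("X", "X", 2, 1, True)))
    print(classify(PulsePair("Z", "X", 2, 2, True)))
```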
The secure key rate of the decoy-state MDI-QKD [27,36] is given by

$$R = q\left\{ P_{11}^{\mu_2\nu_2}\, Y_{11}^{Z}\left[1 - H\!\left(e_{11}^{X}\right)\right] - Q_{\mu_2\nu_2}^{Z}\, f_e\, H\!\left(E_{\mu_2\nu_2}^{Z}\right) \right\}. \qquad (1)$$

In the above equation, $q = P_Z^2 P_{\mu_2} P_{\nu_2}$ is the probability that Alice and Bob both select the Z basis and both modulate the pulse as a signal state. $P_{11}^{\mu_2\nu_2} = \mu_2 \nu_2 e^{-\mu_2 - \nu_2}$ is the probability that the pulses from Alice's signal state and Bob's signal state are both single-photon pulses. $Y_{11}^{Z}$ and $e_{11}^{X}$ are the yield of the single-photon state in the Z basis and the quantum bit error rate (QBER) of the single-photon state in the X basis, respectively. $H(x) = -x\log_2 x - (1-x)\log_2(1-x)$ is the binary Shannon entropy function. $Q_{\mu_2\nu_2}^{Z}$ and $E_{\mu_2\nu_2}^{Z}$ are the overall gain and overall QBER of the signal state in the Z basis, respectively. $f_e > 1$ is the error correction efficiency.
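To see how the quantities defined above combine, the following Python sketch evaluates the key rate. It assumes that Equation (1) takes the standard decoy-state MDI-QKD form $R = q\{P_{11}^{\mu_2\nu_2} Y_{11}^{Z}[1 - H(e_{11}^{X})] - Q_{\mu_2\nu_2}^{Z} f_e H(E_{\mu_2\nu_2}^{Z})\}$. All function names are hypothetical, and the numbers in the usage example are placeholders chosen for illustration, not values from [36].

```python
import math


def binary_entropy(x: float) -> float:
    """Binary Shannon entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def single_photon_probability(mu: float, nu: float) -> float:
    """P_11 = mu * nu * exp(-mu - nu): both pulses carry exactly one photon."""
    return mu * nu * math.exp(-mu - nu)


def key_rate(q, p11, y11_z, e11_x, gain_z, qber_z, f_e):
    """Key rate in the assumed standard form of Equation (1)."""
    single_photon_term = p11 * y11_z * (1.0 - binary_entropy(e11_x))
    error_correction_cost = gain_z * f_e * binary_entropy(qber_z)
    return q * (single_photon_term - error_correction_cost)


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not taken from [36]).
    mu2 = nu2 = 0.4
    q = 0.5 ** 2 * (1 / 3) * (1 / 3)  # P_Z^2 * P_mu2 * P_nu2
    rate = key_rate(
        q=q,
        p11=single_photon_probability(mu2, nu2),
        y11_z=1e-4,
        e11_x=0.02,
        gain_z=2e-5,
        qber_z=0.01,
        f_e=1.16,
    )
    print(f"illustrative key rate per pulse pair: {rate:.3e}")
```

A negative return value means that, under the chosen inputs, the error-correction cost exceeds the single-photon contribution and no key can be extracted.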
According to [37,38], the overall gain $Q^{W}_{\mu_k\nu_l}$ ($W \in \{X, Z\}$) and the overall QBER $E^{W}_{\mu_k\nu_l}$ can be obtained from Equations (2)-(4). In these equations, $\mu_k$ and $\nu_l$, with $k, l \in \{0, 1, 2\}$, are the intensities of the pulses emitted by Alice and Bob, respectively. $I_0(x)$ is the modified Bessel function of the first kind. $e_0$ is the error rate of the background, $e_d$ is the misalignment-error probability, and $p_d$ is the dark count rate. $\eta_a$ and $\eta_b$ are the transmission efficiencies of Alice and Bob, respectively; they are determined by $\eta_d$, the quantum efficiency of the detectors; $\delta$, the loss coefficient in dB/km; $L_{ac}$ ($L_{bc}$), the distance in km from Alice (Bob) to Charlie; and $\theta$, the insertion loss in dB of Charlie's measurement setup. Without Eve's intervention, the yield and the QBER of single-photon pulses when Alice and Bob select the same basis, X or Z, follow from Equations (2)-(4) and are given by Equation (5).

Statistical Hypothesis Testing
In this section, we introduce a new method to detect the PNS attack in the decoy-state MDI-QKD via statistical hypothesis testing. We emphasize that the PNS attacks mentioned here and below refer to the weaker version of a PNS attack. We also analyze the Type I error of the test, that is, concluding that there is a PNS attack when there is none. In outline, our method first states a null hypothesis and an alternative hypothesis. The test statistic is then constructed from the null hypothesis and the other conditions, and its value is computed from the parameters and the experimental data. Once the significance level is given, we can infer, with a controlled error probability, whether there is a PNS attack on the channel. The details are as follows.
(i) Identify the null and alternative hypotheses. We consider the hypothesis testing problem with the null hypothesis $H_0$: there is no PNS attack on the channel, and the alternative hypothesis $H_1$: there is a PNS attack on the channel.
(ii) Construct the test statistic. We need a test statistic to conduct the hypothesis test, and we derive its distribution under the null hypothesis $H_0$. Consider Alice's and Bob's pulse emission and Charlie's BSM. When Alice and Bob send pulses with the same basis, the BSM outcome at Charlie is either successful or failed, so each pulse pair can be regarded as a Bernoulli trial. Recall that $Q^{W}_{\mu_k\nu_l}$ is the probability that Charlie obtains a successful BSM event given that Alice and Bob emit pulses with intensities $\mu_k$ and $\nu_l$ and both select the basis $W$. Suppose the total number of pulses emitted by Alice (Bob) is $N_{\mathrm{data}}$; then the number of pulse pairs with basis $W$ and intensities $\mu_k$ and $\nu_l$ is $P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}}$, where $P_W$ is the probability that Alice (Bob) chooses the basis $W \in \{X, Z\}$ and $P_{\mu_k\nu_l} = P_{\mu_k} P_{\nu_l}$ is the probability that Alice and Bob choose the intensities $\mu_k$ and $\nu_l$, respectively. Let $n^{W}_{\mu_k\nu_l}$ denote the number of successful BSM events that Charlie obtains among these pulse pairs. Then $n^{W}_{\mu_k\nu_l}$ has the binomial distribution with parameters $(P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}},\, Q^{W}_{\mu_k\nu_l})$; for short,

$$n^{W}_{\mu_k\nu_l} \sim B\!\left(P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}},\; Q^{W}_{\mu_k\nu_l}\right).$$

According to [36], $N_{\mathrm{data}}$ is very large (typically $10^{10}$ to $10^{11}$), whereas $Q^{W}_{\mu_k\nu_l}$ is small, about $10^{-8}$ to $10^{-5}$. The basis and the intensity are selected at random; in other words, $P_Z = P_X = 1/2$ and $P_{\mu_k} = P_{\nu_l} = 1/3$ for $k, l \in \{0, 1, 2\}$. Thus, we have $P_W^2 P_{\mu_k\nu_l} = 1/36$.

By the central limit theorem, when $P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l} \geq 5$ and $P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} (1 - Q^{W}_{\mu_k\nu_l}) \geq 5$, the binomial distribution above can be approximated by the normal distribution with mean $P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l}$ and variance $P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l} (1 - Q^{W}_{\mu_k\nu_l})$:

$$n^{W}_{\mu_k\nu_l} \sim N\!\left(P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l},\; P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l}\left(1 - Q^{W}_{\mu_k\nu_l}\right)\right).$$

After standardization, we obtain a random variable $U^{W}_{\mu_k\nu_l}$ that obeys the standard normal distribution:

$$U^{W}_{\mu_k\nu_l} = \frac{n^{W}_{\mu_k\nu_l} - P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l}}{\sqrt{P_W^2 P_{\mu_k\nu_l} N_{\mathrm{data}} Q^{W}_{\mu_k\nu_l}\left(1 - Q^{W}_{\mu_k\nu_l}\right)}} \sim N(0, 1). \qquad (8)$$

There are eighteen variables $U^{W}_{\mu_k\nu_l}$, because there are nine intensity pairs and two bases. Note that we consider only the cases in which Alice and Bob use the same basis, that is, both the Z basis or both the X basis. Treating these variables as independent, the additivity of the normal distribution implies that their sum is normally distributed with mean 0 and variance 18. After standardization, we obtain a new random variable $V$ that obeys the standard normal distribution:

$$V = \frac{1}{\sqrt{18}} \sum_{W \in \{X, Z\}} \sum_{k=0}^{2} \sum_{l=0}^{2} U^{W}_{\mu_k\nu_l} \sim N(0, 1). \qquad (9)$$

Furthermore, the distribution function of $V$ is

$$\Phi(v) = P(V \leq v) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{v} e^{-t^2/2}\, dt, \qquad (10)$$

where $v$, the observed value of $V$, is the test statistic that we use.
(iii) Find the value of the test statistic. We set the parameters $N_{\mathrm{data}}$, $e_d$, $e_0$, $p_d$, $L_{ac}$, $L_{bc}$, $\delta$, $\theta$, $P_Z$, $P_X$, $\mu_k$, $\nu_l$, $P_{\mu_k}$, and $P_{\nu_l}$, where $k, l \in \{0, 1, 2\}$, and we calculate the theoretical value of $Q^{W}_{\mu_k\nu_l}$ according to Equations (2)-(4). We record $n^{W}_{\mu_k\nu_l}$ for $k, l \in \{0, 1, 2\}$ and $W \in \{X, Z\}$. Substituting these data into Equation (9) gives the value of the test statistic $v$.
(iv) Choose a significance level. We need to fix a significance level $\alpha$ (typically 0.05) for the test. Because deviations of $V$ in either direction are evidence against the null hypothesis $H_0$, the test is two-tailed. Given $\alpha$, the rejection region is $|v| > v_{[1-\alpha/2]}$, where $v_{[1-\alpha/2]}$ can be obtained from Equation (10). More precisely, $-v_{[1-\alpha/2]}$ and $v_{[1-\alpha/2]}$ are the boundary values between the rejection region and the acceptance region of the test. Setting the left side of Equation (10) equal to $\alpha/2$ makes the upper limit of the integral equal to $-v_{[1-\alpha/2]}$; by the symmetry of the probability density function of the normal distribution, $v_{[1-\alpha/2]}$ then follows.
(v) Make a decision. Compare the test statistic $v$ with the critical values $v_{[1-\alpha/2]}$ and $-v_{[1-\alpha/2]}$. If $v > v_{[1-\alpha/2]}$ or $v < -v_{[1-\alpha/2]}$, we reject $H_0$ and accept $H_1$; that is, we conclude that there is a PNS attack on the channel. Otherwise, we fail to reject $H_0$; that is, we consider that there is no PNS attack on the channel. Note that the significance level $\alpha$ is the Type I error probability of the test, namely, the probability of concluding that there is a PNS attack when there is none. Let $\beta$ denote the Type II error probability of the test, namely, the probability of concluding that there is no PNS attack when there is one. In most situations $\beta$ is difficult to compute, because determining its value requires more information about Eve's attack strategy.
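Steps (iv) and (v) amount to comparing $|v|$ with the $(1-\alpha/2)$ quantile of the standard normal distribution. A minimal Python sketch, using `statistics.NormalDist` from the standard library (Python 3.8 or later), is given below; the function names `critical_value` and `reject_null` are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha: float) -> float:
    """Two-tailed critical value: the (1 - alpha/2) standard normal quantile."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def reject_null(v: float, alpha: float = 0.05) -> bool:
    """Return True if H0 (no PNS attack) is rejected at level alpha."""
    return abs(v) > critical_value(alpha)


if __name__ == "__main__":
    alpha = 0.05
    print(f"critical value at alpha = {alpha}: {critical_value(alpha):.3f}")
    for v in (0.5, -2.4):  # placeholder test statistics
        verdict = "reject H0" if reject_null(v, alpha) else "fail to reject H0"
        print(f"v = {v:+.2f}: {verdict}")
```

Lowering $\alpha$ widens the acceptance region, which reduces false alarms at the cost of a larger Type II error probability for any fixed attack.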
If the test indicates a PNS attack, the secure key rate can be estimated by Equation (1). Otherwise, all Z-basis pulses leading to a successful BSM event, including both single-photon and multiphoton pulses, can be used to generate keys, and the secure key rate formula of Equation (1) is replaced by Equation (11). Comparing Equation (11) with Equation (1) shows that the secure key rate is substantially higher when the test indicates no PNS attack, mainly because of the contribution of multiphoton components.

Results and Analysis
In the preceding section, we described our detection method in detail. We now apply it to real experimental data and analyze the results: the data were substituted into the formulas of Section 3 to check the correctness of the method. The experimental parameters were taken from the experiment reported in [36]. Specifically, the experimenters in [36] adopted a symmetric scheme; that is, all parameters of Alice and Bob were identical and optimized. The relevant experimental parameters used in [36] and in this paper are shown in Table 1. Based on these parameters, we obtain the values of $Q^{W}_{\mu_k\nu_l}$ shown in Table 2, which is identical to Table I in the Supplementary Materials of [36]. The recorded values of $n^{W}_{\mu_k\nu_l}$ are shown in Table 3; these data can be deduced from Table I in the Main Text of [36]. From these data and Equation (8), all values of $U^{W}_{\mu_k\nu_l}$ are obtained, as shown in Table 4.

Table 3. The values of $n^{W}_{\mu_k\nu_l}$ ($\times 10^4$) with intensities $\mu_k \in \{\mu_2, \mu_1, \mu_0\}$ and $\nu_l \in \{\nu_2, \nu_1, \nu_0\}$ based on $W \in \{X, Z\}$.

The schematic diagram of the statistical hypothesis test is illustrated in Figure 1. The calculation gives the test statistic $v = 0.236$. At the significance level $\alpha = 0.05$, the critical values are $v_{[1-\alpha/2]} = 1.96$ and $-v_{[1-\alpha/2]} = -1.96$. Since $-1.96 < 0.236 < 1.96$, the test statistic does not fall inside the rejection region, and we fail to reject $H_0$. In other words, we infer that there was no PNS attack on the channel, using a test whose Type I error probability is 5%. According to [36], there was indeed no PNS attack in the experiment, which is consistent with the outcome of our test. Thus, both single-photon and multiphoton components can be used to generate keys in this case, and the secure key rate can be estimated through Equation (11).
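As a supplementary check that is not part of the original analysis, the reported statistic $v = 0.236$ can be converted into a two-sided p-value, $2[1 - \Phi(|v|)]$, using the standard normal distribution function. The Python sketch below does this with the standard library; the function name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(v: float) -> float:
    """Two-sided p-value 2 * (1 - Phi(|v|)) for a standard normal statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(v)))


if __name__ == "__main__":
    v_reported = 0.236  # test statistic reported in the text
    alpha = 0.05
    p = two_sided_p_value(v_reported)
    print(f"p-value for v = {v_reported}: {p:.3f}")
    print("reject H0" if p < alpha else "fail to reject H0")
```

The p-value is about 0.81, far above $\alpha = 0.05$, which agrees with the comparison against the critical values $\pm 1.96$.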

Conclusions and Discussion
In summary, we recovered information that the existing decoy-state method leaves unused and, via statistical hypothesis testing, extracted a normally distributed statistic from it. Based on this statistic, we proposed a new method to detect the weaker version of a PNS attack in the decoy-state MDI-QKD. Most importantly, the Type I error probability of the detection is given explicitly by the significance level of the test. According to the outcome of the test, the corresponding secure key rate was provided. In particular, compared with the existing decoy-state MDI-QKD protocols, the secure key rate with our method is substantially higher when the test indicates no weak PNS attack. The result obtained from published experimental data is also consistent with our method.
Nevertheless, all results in this paper were obtained under the null hypothesis that there is no weaker version of a PNS attack. In other words, we assume that the gain of the signal or decoy state changes significantly after a PNS attack. Our method cannot detect the stronger PNS attack, which preserves the gain of the signal and decoy states, such as a partial PNS attack [15]: the premise of the derivation no longer holds, and the Type II error probability of our method in this case will be poor, possibly close to unity. For this reason, compared with the existing decoy-state method [29][30][31], which directly estimates the secure key rate, our method is not yet ready for practical application; however, it provides a new direction for improving the secure key rate and efficiency.