Secure Continuous-Variable Quantum Key Distribution with Machine Learning

Abstract: Quantum key distribution (QKD) offers information-theoretic security in principle, but real systems do not automatically deliver practical security. In practical continuous-variable (CV) QKD systems, deviations between realistic devices and idealized models can open vulnerabilities for eavesdroppers and place a burden on the two legitimate parties. Moreover, the common countermeasures against quantum hacking strategies inevitably increase the complexity of practical CV systems. Machine-learning techniques have therefore been used to explore how to perceive practical imperfections. Here, we review recent works on secure CVQKD systems with machine learning, covering methods for both attack detection and attacks.


Introduction
Quantum key distribution (QKD) is a quantum communication technology that allows the authorized sender (Alice) and receiver (Bob) to share secret keys with unconditional security. Its information-theoretic security is guaranteed by fundamental laws of quantum mechanics such as the Heisenberg uncertainty principle, the quantum no-cloning theorem, and the correlation and nonlocality of entangled particles [1]. QKD is divided into two classes, namely discrete-variable (DV) QKD [2,3] and continuous-variable (CV) QKD [4,5]. CVQKD has accumulated a series of theoretical, experimental, and implementation achievements, promising a higher key rate and simpler and safer detection techniques [6]. The Gaussian-modulated coherent state (GMCS) protocol is one of the most favorable CVQKD schemes to date; in theory, it has been proven secure against arbitrary collective attacks and coherent attacks under some basic assumptions [7].
However, a new challenge is to prove the practical security of real CV systems in a more rigorous manner. In practice, realistic devices cannot be modeled as absolutely secure and perfect. Imperfect devices expose security loopholes that Eve can exploit to steal secret key information, which is the basis of quantum hacking strategies. These attacks target various components such as sources, detectors, and channels. For example, in practical CV systems, Eve may exploit imperfections related to the local oscillator (LO) signal to launch LO fluctuation attacks [8], wavelength attacks [9,10], and calibration attacks [11], or exploit imperfections of homodyne detectors to launch saturation attacks [12] and homodyne-detector-blinding attacks [13]. In the last two decades, the rise of quantum hacking strategies has gone hand in hand with miscellaneous landmarks such as the maximum possible distance [14,15] and the standardization of QKD [16]; a review on the subject is given in [17].
To prevent or catch such attacks, many approaches have been proposed [17]. The first is to patch existing protocols, for example by adding or modifying hardware [18,19]. The second is to devise novel QKD schemes, such as device-independent QKD [20] and measurement-device-independent QKD [21]. A more universal countermeasure consists of placing additional optical devices in the system: a wavelength filter is effective against the wavelength attack, and a proper monitor at the detection stage may counter the homodyne-detector-blinding attack. However, this approach is limited by our understanding of the devices and requires complete knowledge of the attacks. Although a security patch can defeat a certain type of attack, patched countermeasures themselves might open other loopholes and add burden to the system. Besides, the performance of the traditional approaches is difficult to quantify.
Machine learning, one of the most swiftly developing interdisciplinary fields, has become a powerful tool for face recognition, autonomous driving, medical imaging, and so on. In recent years, cross-cutting studies of machine learning with quantum communication [22], quantum computation [23], and quantum optics [24] have emerged as a new research paradigm alongside theory, experiment, and computational simulation. In 2018, Reference [25] used support vector regression to predict parameters in practical CV systems for the first time, which effectively replaces extra monitoring devices. Several teams proposed using different neural networks and random forest models for real-time parameter optimization [26][27][28]. Besides, in [29], a distance-weighted k-nearest-neighbors-based machine-learning detector was proposed that operates directly on the raw secret key rather than on the system parameters, which is a new idea for improving the performance of CV systems. Reference [30] employed a backpropagation neural network to adjust the modulation variance to an optimal value, furnishing a higher achievable key rate and a more efficient parameter optimization than the local search algorithm in the practical four-state CVQKD system.
These schemes have confirmed the effectiveness of machine learning for secure communication in CVQKD systems. Many researchers are aware that machine-learning techniques could offer large advantages for both attack and defense in practical CV systems. In this paper, we review recent works on secure CVQKD with machine learning, focusing mainly on the practical imperfections. We briefly introduce the most commonly used distribution protocol flow of CVQKD and its security analysis in Section 2. In Section 3.1, we review countermeasures that utilize machine-learning techniques to perceive a single typical attack and multiple typical attacks, respectively. Then, in Section 3.2, we turn to how machine-learning advances can improve the success rate of quantum hacking attacks. Finally, we give a brief conclusion in Section 4.

Background
This section briefly introduces the protocol and security in continuous-variable quantum information theory.

Protocol
CVQKD protocols can be divided into several types depending on the prepared states, the modulation schemes, the detection strategies, etc. More protocols can be found in an earlier review [7]. Here, we describe the distribution protocol flow of CVQKD based on the most commonly used GG02 protocol [4]. The GG02 protocol is a Gaussian modulation protocol based on coherent states rather than squeezed states. It proceeds as follows.
Alice prepares 2N random variables {x_i} and {y_i}, 1 ≤ i ≤ N, which follow a Gaussian distribution with variance V_A.
Alice sends the N coherent states with phase-space coordinates {(x_i, y_i)}, 1 ≤ i ≤ N, to Bob through the untrusted quantum channel.
Bob first prepares N random binary variables {b_i}, 1 ≤ i ≤ N. Using {b_i} and the homodyne detector, he randomly measures either the X or the P canonical quadrature of each quantum state sent by Alice, and obtains the N corresponding measurement results {S_i}, 1 ≤ i ≤ N.
Bob sends his N random binary variables {b_i}, 1 ≤ i ≤ N, to Alice. After Alice obtains this measurement-basis information, she matches it against her modulation data and keeps only the quadrature that Bob measured. Alice and Bob have then completed the quantum state preparation, transmission, and measurement process, and the legitimate parties share N pairs of raw keys {(x_i·, y_i·)}, 1 ≤ i ≤ N.
Alice and Bob perform data postprocessing on the shared random variables, such as parameter estimation, data reconciliation, error correction, privacy amplification, and so on.
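The preparation, transmission, measurement, and sifting steps above can be sketched as a small simulation. This is only a toy model under stated assumptions: all quantities are in shot-noise units, the detector is ideal, and the channel is described by a transmittance `t` and an excess noise `xi` referred to the channel input. The function names `simulate_gg02` and `estimate_transmittance` are illustrative and do not come from [4].

```python
import math
import random


def simulate_gg02(n, v_a, t, xi, seed=0):
    """Simulate n GG02 rounds in shot-noise units (vacuum variance = 1).

    Toy model (assumption): ideal homodyne detector, channel transmittance t,
    excess noise xi referred to the channel input.
    """
    rng = random.Random(seed)
    sigma_a = math.sqrt(v_a)
    noise_sd = math.sqrt(1.0 + t * xi)  # shot noise plus excess noise at Bob
    alice_kept, bob_results = [], []
    for _ in range(n):
        # Alice draws (x_i, y_i) and sends the corresponding coherent state.
        x = rng.gauss(0.0, sigma_a)
        y = rng.gauss(0.0, sigma_a)
        # Bob's random bit b_i selects which quadrature the detector measures.
        b = rng.getrandbits(1)
        sent = x if b == 0 else y
        s = math.sqrt(t) * sent + rng.gauss(0.0, noise_sd)
        # Sifting: Bob discloses b_i and Alice keeps the matching quadrature.
        alice_kept.append(sent)
        bob_results.append(s)
    return alice_kept, bob_results


def estimate_transmittance(alice, bob):
    """Estimate t from the regression slope cov(a, b) / var(a) = sqrt(t)."""
    n = len(alice)
    mean_a = sum(alice) / n
    mean_b = sum(bob) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(alice, bob)) / n
    var_a = sum((a - mean_a) ** 2 for a in alice) / n
    return (cov / var_a) ** 2


alice, bob = simulate_gg02(n=20000, v_a=4.0, t=0.5, xi=0.01, seed=1)
t_hat = estimate_transmittance(alice, bob)
print(f"estimated transmittance: {t_hat:.3f}")
```

The transmittance estimate stands in for the parameter-estimation part of postprocessing; reconciliation, error correction, and privacy amplification are not modeled here.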

Security Analysis
This review mainly considers the security of the CVQKD protocol under the collective attack. Eve prepares N auxiliary states, lets each of them interact individually with a quantum state sent by Alice through the channel, and stores the output states in a quantum memory. Eve monitors the channel, and after the legitimate parties complete the classical reconciliation and privacy amplification, Eve performs a joint measurement on the stored states.
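We evaluate the secret key rate under the collective attack in the asymptotic case with reverse reconciliation, neglecting finite-size effects [31]. The formula for the secret key rate is:

$$K = \beta I_{AB} - \chi_{BE}, \tag{1}$$

where β is the reconciliation efficiency, I_AB is the Shannon mutual information between Alice and Bob, and χ_BE is the Holevo bound [32] on Eve's accessible information. With V = V_A + 1 denoting the variance of Alice's mode in shot-noise units, I_AB is expressed as:

$$I_{AB} = \frac{1}{2}\log_2\frac{V + \chi_{\mathrm{tot}}}{1 + \chi_{\mathrm{tot}}}.$$

χ_BE is expressed as:

$$\chi_{BE} = \sum_{i=1}^{2} G\!\left(\frac{\lambda_i - 1}{2}\right) - \sum_{i=3}^{5} G\!\left(\frac{\lambda_i - 1}{2}\right),$$

where G(x) = (x + 1) log_2 (x + 1) − x log_2 x. λ_{1,2} are the symplectic eigenvalues given by:

$$\lambda_{1,2}^{2} = \frac{1}{2}\left[A \pm \sqrt{A^{2} - 4B}\right],$$

with:

$$A = V^{2}(1 - 2T) + 2T + T^{2}\left(V + \chi_{\mathrm{line}}\right)^{2}, \qquad B = T^{2}\left(V\chi_{\mathrm{line}} + 1\right)^{2}.$$

λ_{3,4} are the symplectic eigenvalues given by:

$$\lambda_{3,4}^{2} = \frac{1}{2}\left[C \pm \sqrt{C^{2} - 4D}\right],$$

with:

$$C = \frac{V\sqrt{B} + T\left(V + \chi_{\mathrm{line}}\right) + A\chi_{h}}{T\left(V + \chi_{\mathrm{tot}}\right)}, \qquad D = \sqrt{B}\,\frac{V + \sqrt{B}\,\chi_{h}}{T\left(V + \chi_{\mathrm{tot}}\right)}.$$

The last symplectic eigenvalue is λ_5 = 1. The experimental parameters of the system involved in the above formulas include:
χ_tot = χ_line + (χ_h / T) is the total noise referred to the channel input, where T is the quantum channel transmittance.
χ_h = [(1 − η) + v_el] / η is the detection-added noise referred to Bob's input, where η is the efficiency of Bob's homodyne detector and v_el is the detector's electronic noise.
χ_line = (1/T) − 1 + ξ is the channel-added noise referred to the channel input, where ξ is the excess noise of the system, and its value does not include Bob's internal noise.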
Based on the above expressions, the secret key rate in Equation (1) can be evaluated for the asymptotic case under the collective attack. Besides, security in the finite-key case based on the uncertainty principle, as well as composable security, has been proven against collective attacks [33,34].
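As a numerical illustration, the sketch below evaluates the asymptotic key rate from the noise terms χ_line, χ_h, and χ_tot defined above. It assumes the expressions commonly used in the CVQKD literature for Gaussian modulation with homodyne detection and reverse reconciliation; the reconciliation efficiency `beta`, the relation V = V_A + 1, and the intermediate quantities `a`, `b`, `c`, `d` are part of that assumed standard form. The parameter values in the example call are illustrative only.

```python
import math


def g(x):
    """G(x) = (x + 1) log2(x + 1) - x log2(x), with G(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (x + 1.0) * math.log2(x + 1.0) - x * math.log2(x)


def key_rate(v_a, t, xi, eta, v_el, beta=1.0):
    """Asymptotic secret key rate (bits per pulse), all noises in shot-noise units."""
    v = v_a + 1.0                       # assumed: V = V_A + 1
    chi_line = 1.0 / t - 1.0 + xi       # channel-added noise
    chi_h = ((1.0 - eta) + v_el) / eta  # detection-added noise
    chi_tot = chi_line + chi_h / t      # total noise

    i_ab = 0.5 * math.log2((v + chi_tot) / (1.0 + chi_tot))

    # Symplectic eigenvalues lambda_1,2 (assumed standard expressions).
    a = v ** 2 * (1.0 - 2.0 * t) + 2.0 * t + t ** 2 * (v + chi_line) ** 2
    b = t ** 2 * (v * chi_line + 1.0) ** 2
    root_ab = math.sqrt(a ** 2 - 4.0 * b)
    lam1 = math.sqrt(0.5 * (a + root_ab))
    lam2 = math.sqrt(0.5 * (a - root_ab))

    # Symplectic eigenvalues lambda_3,4 (assumed standard expressions).
    sqrt_b = math.sqrt(b)
    denom = t * (v + chi_tot)
    c = (v * sqrt_b + t * (v + chi_line) + a * chi_h) / denom
    d = sqrt_b * (v + sqrt_b * chi_h) / denom
    root_cd = math.sqrt(c ** 2 - 4.0 * d)
    lam3 = math.sqrt(0.5 * (c + root_cd))
    lam4 = math.sqrt(0.5 * (c - root_cd))

    # lambda_5 = 1 contributes G(0) = 0 to the Holevo bound.
    chi_be = (g((lam1 - 1.0) / 2.0) + g((lam2 - 1.0) / 2.0)
              - g((lam3 - 1.0) / 2.0) - g((lam4 - 1.0) / 2.0))
    return beta * i_ab - chi_be


k_low_loss = key_rate(v_a=4.0, t=0.5, xi=0.01, eta=0.6, v_el=0.05, beta=0.95)
k_high_loss = key_rate(v_a=4.0, t=0.1, xi=0.01, eta=0.6, v_el=0.05, beta=0.95)
print(f"K(T=0.5) = {k_low_loss:.4f}, K(T=0.1) = {k_high_loss:.4f}")
```

Sweeping `t` (or, equivalently, the fibre length) with such a function is how key-rate-versus-distance curves are typically produced for comparison between a defended and an undefended system.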

Quantum Hacking Attacks and Countermeasures with Machine Learning
Studies of the practical vulnerabilities in CVQKD systems can start from two perspectives. On the one hand, the legitimate distant parties (Alice and Bob) want to communicate securely in the presence of Eve. Once they have perceived the type of attack, they must interrupt the communication or defend accordingly. On the other hand, the eavesdropper (Eve) wants to steal part or all of the key information without being detected by Alice and Bob. The application of machine learning is both a severe challenge and a golden opportunity from the two perspectives. Recent works are reviewed below.

Countermeasures on a Targeted Attack
For example, consider the wavelength attack. The linear discriminant analysis support vector machine (LDA-SVM) algorithm was applied to successfully detect the wavelength attack via analyzing optical spectrum signals in practical CVQKD systems by He et al. [35].
Targeting the wavelength-dependent properties of the beam splitter (BS), Eve launches the wavelength attack by manipulating the wavelengths of the signal and LO pulses sent from Alice, which goes unnoticed by Alice and Bob [9]. To prevent wavelength attacks, one of the known countermeasures is to randomly add a wavelength filter and monitor the LO intensity in the practical CV system. A more general solution is to perform real-time shot noise measurement [36]. As mentioned above, this kind of patched countermeasure increases the complexity of the system.
The traditional machine learning model is divided into two parts, namely feature extraction and classification, which correspond here to LDA and the SVM [37]. SVMs have gained prominence in the field of data classification and are constantly evolving [38][39][40]. They seek an optimal hyperplane that maximizes the margin, i.e., the Euclidean distance from the hyperplane to the support vectors. When the classes cannot be separated by a hyperplane, SVMs are combined with different kernel functions such as the linear, polynomial, sigmoid, and radial basis functions [41]. LDA maps labeled digital spectrum data from a high-dimensional space onto a low-dimensional space in a way that maximizes the between-class scatter and minimizes the within-class scatter. The preprocessed features are used as the input to train and test the SVM classifier. On this basis, He et al. [35] proposed an intelligent monitoring model built around an optical spectrum analyzer (OSA) with the LDA-SVM algorithm embedded in it. They collected normal signal spectra and the forged signal spectra produced by Eve's wavelength attack as the dataset. Optical spectrum signals were transformed into digital optical spectrum data by the OSA. Figure 1 shows the procedure of the algorithm's processing module. The dataset was divided into a training set and a testing set and input into the LDA algorithm module to extract the features. The SVM detector was trained and evaluated on these extracted features. Given its performance, the optical spectrum intelligent monitoring model can be deployed in a common communication environment. The model can identify abnormal optical spectrum signals at signal center wavelengths of 1528.0 nm, 1540.5 nm, and 1548.5 nm, and even 1549.1 nm and 1550.1 nm, for safe communication in CVQKD systems (the reported recognition accuracy is 100%).
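The two-stage idea (LDA for feature extraction, then an SVM for classification) can be illustrated with a minimal, dependency-free sketch. It is not the implementation of [35]: the two-feature synthetic "spectra", the class means, and all function names are assumptions made for illustration, the LDA is the two-class Fisher discriminant, and the SVM is a linear soft-margin classifier trained by subgradient descent on the hinge loss rather than a kernel SVM.

```python
import math
import random


def make_samples(n, mean, rng):
    """Draw n synthetic two-feature samples around `mean` (unit Gaussian noise)."""
    return [[rng.gauss(mean[0], 1.0), rng.gauss(mean[1], 1.0)] for _ in range(n)]


def fisher_direction(class0, class1):
    """Two-class Fisher LDA: unit vector proportional to S_w^{-1} (m1 - m0)."""
    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(2)]
    m0, m1 = mean(class0), mean(class1)
    s = [[0.0, 0.0], [0.0, 0.0]]  # within-class scatter matrix S_w
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]


def project(rows, w):
    """Map each sample onto the one-dimensional LDA feature."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]


def train_linear_svm(z, y, lam=0.01, lr=0.1, epochs=500):
    """Soft-margin linear SVM on a 1-D feature via hinge-loss subgradient descent."""
    a, c = 0.0, 0.0
    n = len(z)
    for _ in range(epochs):
        grad_a, grad_c = lam * a, 0.0
        for zi, yi in zip(z, y):
            if yi * (a * zi + c) < 1.0:  # sample violates the margin
                grad_a -= yi * zi / n
                grad_c -= yi / n
        a -= lr * grad_a
        c -= lr * grad_c
    return a, c


def predict(z, a, c):
    return [1 if a * zi + c >= 0.0 else -1 for zi in z]


rng = random.Random(7)
normal_train = make_samples(100, (0.0, 0.0), rng)  # label -1: normal spectrum
attack_train = make_samples(100, (4.0, 3.0), rng)  # label +1: forged spectrum
w = fisher_direction(normal_train, attack_train)
a, c = train_linear_svm(project(normal_train + attack_train, w),
                        [-1] * 100 + [1] * 100)

test_rows = make_samples(100, (0.0, 0.0), rng) + make_samples(100, (4.0, 3.0), rng)
test_labels = [-1] * 100 + [1] * 100
predictions = predict(project(test_rows, w), a, c)
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(f"test accuracy: {accuracy:.3f}")
```

With real OSA data, the inputs would be high-dimensional digitized spectra, which is where the dimensionality reduction performed by LDA becomes important.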
In theory, besides the center wavelength of the signals, the model based on the LDA-SVM can also deal with signal intensities, peak values, and other important indicators. Another typical attack is the calibration attack. A hidden Markov model (HMM) was utilized by Mao et al. [42] to detect the calibration attack in real CV systems.
The calibration attack is a powerful attack that arises from loopholes in the LO intensity calibration and the clock generation process in the practical CVQKD setup [11]. In a calibration attack, Eve intercepts a fraction of the signal pulses during quantum transmission by launching a partial intercept-resend attack [43]. Eve then prepares quantum states with the expected shape and resends them to Bob, while Bob's shot noise estimate remains unchanged. This allows Eve to control the shot noise estimated by the two parties. One of the proposed countermeasures is to add a second homodyne detector on a split part of Bob's LO to monitor the real-time shot noise [36].
As a typical machine-learning model for classification, the HMM is well suited to describing transient processes and dynamic properties, which are the building blocks of anomaly intrusion detection in time series [44][45][46]. The problems it targets have two characteristics: they are based on the analysis of sequences, and they involve two kinds of data, namely observed sequences (here, the quadrature values measured by Bob) and an underlying state path (here, the interference factors).
After analysis, Mao et al. [42] found that the variation of the measured quadrature values reflects whether the calibration attack is being performed. However, Bob's measurement values are affected by environmental disturbances and device imperfections as well as by the attacks of eavesdroppers. Based on this, Mao et al. established an HMM-based calibration attack recognition scheme, which is trained on data influenced only by common interferences and not by eavesdroppers. If Eve launches the calibration attack, the trained model detects it from the changed quadrature values of Bob's measurement. The whole procedure has two parts, as illustrated in Figure 2. The offline training process is the top half above the dotted line, while the recognition process is the bottom half below the dotted line. During training, Mao et al. collected a normal dataset from previous communication sessions with the peak-valley-seeking method and then trained the parameters of the HMM with the Baum-Welch algorithm. Normally, a well-trained HMM outputs a high probability value for normal data and a low probability value for attacked data. When the predicted probability value is smaller than a certain threshold, the system is judged to be under attack and the received attacked sequences are discarded. The HMM-based calibration attack recognition can detect almost all of the attacked data at distances under 30 km, with a recognition precision of 98.735%. Following this idea, if suitable unattacked training data can be collected for a given attack, the proposed model also applies to the analysis of other attacks.
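The recognition step (score a sequence with the HMM and compare the result with a threshold) can be sketched as follows. This is a minimal illustration, not the model of [42]: the two hidden states, the three quantized observation symbols, all probabilities, and the threshold are hypothetical values chosen by hand, whereas in [42] the parameters are learned with the Baum-Welch algorithm. The score is the per-symbol log-likelihood computed with the scaled forward algorithm.

```python
import math

# Hypothetical discrete HMM (assumed values, not trained parameters).
# Hidden states: 0 = calm interference, 1 = disturbed interference.
# Symbols: 0 = small, 1 = medium, 2 = large deviation of Bob's quadratures.
START = [0.8, 0.2]
TRANS = [[0.9, 0.1],
         [0.3, 0.7]]
EMIT = [[0.70, 0.25, 0.05],
        [0.30, 0.50, 0.20]]


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    scale = sum(alpha)
    log_p = math.log(scale)
    alpha = [v / scale for v in alpha]
    for o in obs[1:]:
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n_states))
                 for j in range(n_states)]
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [v / scale for v in alpha]
    return log_p


def is_attacked(obs, threshold=-1.2):
    """Flag a sequence whose per-symbol log-likelihood falls below the threshold."""
    return log_likelihood(obs, START, TRANS, EMIT) / len(obs) < threshold


normal_seq = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
attacked_seq = [2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1]
print("normal flagged:", is_attacked(normal_seq))
print("attacked flagged:", is_attacked(attacked_seq))
```

Normalizing by the sequence length keeps one threshold usable for sequences of different lengths; in practice the threshold would be calibrated on held-out normal data.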

Countermeasures on Multiple Attacks
The above-mentioned recognition models each detect only one given attack, whereas the kind of attack that Eve will launch cannot be predicted in advance. A universal defense scheme that detects as many attacks as possible is therefore needed. For this purpose, an artificial-neural-network (ANN)-based universal defense scheme for CVQKD systems was proposed by Mao et al. [47].
In [47], multiple attacks were considered: three typical attack strategies that exploit imperfections of CV systems (the calibration attack, the LO intensity attack, and the saturation attack), as well as two hybrid attacks [13,48]. Mao et al. further investigated some classical features of the pulses and the deviations of these features between normal unattacked pulses and abnormal attacked pulses. The results indicated that four features are influenced by the different attack strategies, namely the LO intensity I_LO, the shot noise variance N_0, and the mean ȳ and variance V_y of Bob's measurement. Table 1 shows the impacts of multiple attacks on the four features. The top four attack strategies affect different features, and the hybrid attack and the saturation attack affect the same features to different degrees. The ANN is an information-processing model that imitates the function of the biological nervous system of the human brain [49]. The ANN architecture is built from several connected layers, each of which contains a certain number of neurons. The three-layer nonlinear ANN multiclassifier designed in [47] consists of input, hidden, and output layers and uses a soft-max function to distinguish multiple attacks.
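The structure of such a three-layer soft-max classifier can be sketched as a single forward pass. The sketch only shows the data flow and is not the trained model of [47]: the weights are random and untrained, the tanh hidden activation, the six class labels, and the example feature values are assumptions, and the hidden width of 15 is the value reported in the evaluation of [47].

```python
import math
import random

FEATURES = ("I_LO", "N_0", "mean_y", "V_y")  # the four monitored features
N_HIDDEN = 15                                 # hidden-layer width reported for [47]
# Assumed class labels: no attack, three typical attacks, two hybrid attacks.
CLASSES = ("normal", "calibration", "LO intensity", "saturation",
           "hybrid 1", "hybrid 2")


def init_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(x, weights, biases):
    """Compute W x + b for one dense layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable soft-max: subtract the maximum before exponentiating."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def forward(x, hidden, output):
    """Input layer -> tanh hidden layer (assumed activation) -> soft-max output."""
    h = [math.tanh(v) for v in affine(x, *hidden)]
    return softmax(affine(h, *output))


rng = random.Random(0)
hidden = init_layer(len(FEATURES), N_HIDDEN, rng)
output = init_layer(N_HIDDEN, len(CLASSES), rng)

# Illustrative, normalized feature vector (I_LO, N_0, mean_y, V_y).
probs = forward([1.0, 1.0, 0.0, 1.2], hidden, output)
predicted = CLASSES[probs.index(max(probs))]
print(dict(zip(CLASSES, (round(p, 3) for p in probs))), "->", predicted)
```

Because the weights are untrained, the predicted class here is meaningless; training would fit the weights to labeled feature vectors from attacked and unattacked pulses.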
The ANN model for attack detection was trained and tested as depicted in Figure 3 from [47]. Bob's received keys were used as the input and fed into the model in order. Once abnormal data are perceived, the transmission is terminated immediately, which saves both time and resources. The precision and recall for these multiple attacks reached the maximum value of one when the number of neurons in the hidden layer was fifteen. Besides, the security analysis of a CVQKD system that employed the ANN-based attack detection model was performed and compared with a system without any countermeasures against attacks. In both the asymptotic and finite-size cases, the secret key rate and transmission distance of the proposed model decreased, but the overall defense capability of the system was enhanced, as shown in Figure 4a. The composable secret key rates of this ANN model were lower than those in the asymptotic and finite-size limits, but gradually increased as the number of exchanged signals increased, as shown in Figure 4b. All in all, compared with CV systems without detection strategies, the ANN defense scheme constitutes an integral defense model against most known attacks and clearly improves the system's security, at a small expense in key rate and transmission distance.

Quantum Hacking with Machine Learning
In Section 3.1, we introduced several cases of how machine-learning techniques can be exploited to perceive a single attack or several typical attacks, thereby restoring accurate and robust defense performance in CVQKD systems. We now discuss the application of machine learning to enhance the success rate of a quantum hacking attempt. Huang et al. [50] demonstrated a convolutional neural network (CNN)-based entanglement distillation attack in the horizontal link GMCS-CVQKD system, in which the CNN helps Eve choose the best opportunity to launch the entanglement distillation attack.
In the theoretical security analysis, the lossy quantum channel is assumed to have a constant transmission efficiency and excess noise. However, in practical free-space CVQKD systems, a security problem is the instability of the signals' physical parameters, in both weak and strong atmospheric turbulence [51]. The transmission efficiency fluctuates over time, so that the transmitted states degrade with a certain probability to non-Gaussian mixed states [52]. Eve can perform the entanglement distillation attack on the transmitted non-Gaussian mixed states.
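Why a fluctuating transmission efficiency produces non-Gaussian statistics can be seen in a small numerical experiment. The sketch below is a toy model under stated assumptions: quadratures are in shot-noise units, fading is modeled as a random choice between a few transmittance values (real turbulence models are continuous), and the excess kurtosis is used as a simple non-Gaussianity indicator. It illustrates the statistical effect only and is not the CNN classifier of [50].

```python
import math
import random


def received_quadratures(n, v_a, transmittances, rng):
    """Bob-side quadratures when each pulse sees a randomly chosen transmittance."""
    samples = []
    for _ in range(n):
        t = rng.choice(transmittances)         # assumed discrete fading model
        x = rng.gauss(0.0, math.sqrt(v_a))     # Alice's Gaussian modulation
        samples.append(math.sqrt(t) * x + rng.gauss(0.0, 1.0))  # plus shot noise
    return samples


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3; close to zero for Gaussian data."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(3)
fixed = received_quadratures(20000, v_a=20.0, transmittances=[0.5], rng=rng)
fading = received_quadratures(20000, v_a=20.0, transmittances=[0.1, 0.9], rng=rng)

kurt_fixed = excess_kurtosis(fixed)
kurt_fading = excess_kurtosis(fading)
print(f"fixed channel:  excess kurtosis = {kurt_fixed:.3f}")
print(f"fading channel: excess kurtosis = {kurt_fading:.3f}")
```

With a constant transmittance the received quadratures stay Gaussian, whereas the fading channel yields a mixture of Gaussians with different variances, whose heavier tails show up as a clearly positive excess kurtosis.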
The CNN, a rapidly developing deep learning algorithm, has performed various complex tasks, especially in image recognition [53,54]. The designed CNN architecture combines five kinds of components: an input layer, two convolutional layers, two max-pooling layers, a fully connected layer, and a soft-max output layer. It extracts features hierarchically and indicates the class of the input variables. Figure 5 displays the CNN-based entanglement distillation attack model from [50]. Eve extracts part of the light beam B1 and measures one of the quadrature values. The well-trained CNN model takes the measurements as the input and outputs whether the transmitted states are a non-Gaussian mixture. If they are, Eve launches the attack; otherwise, she terminates the attack (the classification accuracy was 97.8%). Compared to traditional methods based on statistical analysis, the time complexity and time cost of this model are less influenced by the data size, and it is simpler to implement with existing technologies. To verify the practicability of the system, Huang et al. performed a security analysis showing that the bounds of the secure region were significantly impacted by the tap beam splitter transmissivity T_e and the threshold value x_th for discarding the remaining state, as shown in Figure 6. The shorter the transmission distance, the more reliable the system is under the entanglement distillation attack. If some parameters, such as T_e and x_th, are adjusted to suitable values, Eve can gain a non-negligible amount of information about the final secret key without being noticed by Alice and Bob.

Conclusions
In this review, we discussed how to perceive the imperfections of devices with machine-learning techniques in practical CVQKD systems, covering both countermeasures and quantum hacking attacks. Firstly, we briefly described a classical distribution protocol flow based on the GG02 protocol and an example of the security proof: the asymptotic case against collective attacks. Secondly, after analyzing the abnormal behaviors of Eve, we reviewed the countermeasures against a single attack and against multiple attacks with several classical machine learning models. These recognition models can effectively identify and classify attacks with high precision and recall values. The application of machine learning is at the software level and does not require any additional equipment, which addresses the vulnerabilities and system burden associated with traditional patched countermeasures. Later, we introduced how machine learning can improve the success rate of quantum hacking attacks by helping to meet the particular conditions they require. We emphasize that the purpose of analyzing quantum hacking is not to disrupt the communication process, but to prevent bugs and loopholes in future implementations of QKD systems. Our aim is to use machine learning to reduce the complexity of CV systems to a certain degree and to improve their performance, so as to ensure the security of quantum secure communication.
As mentioned above in Section 3, the proposed machine-learning models can be used to perceive other vulnerabilities under certain conditions. How to achieve these conditions, e.g., how to collect training data under a certain attack and how to find the necessary loopholes to launch an attack, is crucial. Besides, the deployment of CVQKD systems with machine-learning models in the real world also deserves thoughtful consideration. Are these methods effective, and how should they be strengthened if faced with stronger attacks from Eve? All of the above are issues that need to be considered in future research, and many unexpected prospects may emerge. It is also quite certain that machine learning will have a huge advantage in relevant data processing and data analysis in the attack and defense of practical CV systems.
Similar practical security issues exist in DVQKD, along with several corresponding classic countermeasures [17]. Machine-learning approaches have also been shown to help tackle difficulties such as parameter estimation in DV systems, for example through a new operating mode called "predicting-and-updating", which uses a long short-term memory network to handle the phase drift problem [55]. Given this, why not apply machine-learning techniques to real-world security issues in practical DVQKD? We are excited about what the future holds.
All in all, machine learning has gradually come to play an important role in quantum-safe encrypted transmission for real CV systems, paving the way for secure QKD with realistic devices. Meanwhile, we expect that in the future, the study of machine learning will continue to lead to many new unexpected insights in the subfields of QKD.