A Phase Fluctuation Based Practical Quantum Random Number Generator Scheme with Delay-Free Structure

Abstract: Quantum random number generators are widely used in applications ranging from sampling and simulation to fundamental science and cryptography, such as quantum key distribution systems. Among previous works, quantum noise from the phase fluctuation of laser diodes is one of the most commonly used random sources in quantum random number generation, and many practical, compact schemes based on phase noise have been proposed. Here, we propose a new phase noise scheme that utilizes the phase fluctuation of two laser diodes with a slight difference in center wavelength. By analyzing the frequency components and adopting an appropriate band-pass filter, we prove that our scheme extracts quantum noise and substantially filters out other classical noise. Results of randomness tests show that the extracted random sequences perform well. Owing to the absence of a delay-line and the low requirements on the other devices in this system, our scheme is promising for future miniaturized quantum random number generation systems.


Introduction
Random numbers are of great significance in many fields, including statistical analysis, numerical simulation, fundamental science [1] and cryptography [2]. The quality of a random number generator (RNG) can seriously affect applications with high security demands; one outstanding example is the quantum key distribution (QKD) system in quantum cryptography [3][4][5][6]. A classical random number generator, also called a pseudo-RNG, is based on a deterministic algorithm: it efficiently expands a short random seed into a long pseudo-random sequence at extremely high speed and is compatible with portable devices. However, because computational algorithms are intrinsically deterministic and predictable, pseudo-random number generators face severe security issues in applications such as secure communication systems.
With an appropriate post-processing method, a practical quantum random number generator (QRNG) scheme can be proved information-theoretically secure under the trusted-device scenario, which is adequate for the security requirements of most applications.
Traditional phase noise schemes utilize a self-delayed interference structure, which inevitably introduces a delay-line in an unbalanced Mach-Zehnder interferometer (MZI) in the experimental setup [31,32,38,39]. However, due to the bandwidth of the laser phase noise, the delay-line usually needs to be several meters long to reduce the auto-correlation coefficient of the raw data effectively. In practical devices, the space occupied by the delay-line is one of the most difficult parts to compress, unless it can be removed entirely by adopting a novel scheme.
In this paper, we demonstrate a QRNG scheme based on phase noise with a delay-free structure. We theoretically analyze our scheme and identify three main frequency components in the electric signal to be measured. With appropriate implementation settings, namely the center wavelengths of the laser diodes and the passband of a band-pass filter, we distill the phase noise from the original signal for further randomness extraction. Two extraction methods, the m-least significant bit (m-LSB) method [51] and the universal (Toeplitz) hashing method [34,36,44,52], are used in the post-processing phase. Our scheme achieves a generation rate of 600 Mbps, i.e., six bits per sample at the 100 MHz sampling rate, and passes widely used randomness test batteries. Compared with previous schemes that use phase noise as the random source, our scheme is delay-line free, while the devices are conventional and the QRNG performance remains similar. Thus, it has great potential for compact and portable QRNG devices in the future.
The structure of this article is as follows. In Section 2, the schematic setup of our scheme is presented, followed by an analysis of the different noise sources in the system and the principle used to distill the quantum noise of phase fluctuation. An experimental implementation is built according to the scheme, and two post-processing methods are realized, namely the m-LSB method and the universal (Toeplitz) hashing method. Section 3 presents the results of the randomness test batteries.

Principle of Scheme
The schematic setup of our scheme is shown in Figure 1. Two laser diodes with very close center wavelengths are used as the random source. The intensities of the lasers are carefully tuned by variable optical attenuators (VOA) to exactly the same level, and the optical signals of the two lasers are then injected into a 50:50 beam splitter (BS). The electric fields at the input ports (P1, P2) of the BS can be written as:
Since the amplitude of each laser is tunable and can be kept very stable during the experiment, the amplitude of the field can be regarded as a constant in our scheme, which means $E_i(t) = E_i$, $i = 1, 2$. The two signals interfere at the beam splitter, so the fields at the output ports (P3, P4) of the BS can be written as:
where $T$ is the transmittance and $R = 1 - T$ is the reflectivity of the beam splitter. The signals from ports P3 and P4 take times $t_1$ and $t_2$, respectively, to reach the detectors (D1, D2). Hence, the intensity of the signal at detector D1 (which comes from port P3) is:
while the signal at detector D2 (which comes from port P4) has a similar expression:
Evidently, $I_{D1}(t)$ includes both a DC term and an AC term:
According to (5), the DC term is $T E_1^2 + (1 - T) E_2^2$, and the AC term is the rest. Therefore, if we use AC-coupled photo-detectors in our scheme, only the AC terms need to be kept in the following analysis and processing. After detection by the photo-detectors, the optical signal is converted into an electric signal, and the intensity $I_{D1}(t)$ is converted proportionally into the photocurrent $i_{D1}(t)$:
Finally, the voltage signals $V_{D1}(t)$ and $V_{D2}(t)$ are combined in a mixer by frequency mixing (only the real part is considered here), and the new signal is denoted $V_{\mathrm{mix}}$:
It is clear from (10) that the final signal includes several frequency components, three of which dominate.
Two of them come from the phase term $\Delta\phi(t)$: $\Delta\phi_{\mathrm{fiber}}$ is due to fiber jitter, and $\Delta\phi_{\mathrm{phase}}$ comes from the phase noise of the laser diodes, from which we expect to extract randomness. The third frequency term, $2\Delta\omega t$, is due to the difference in center wavelength between the two laser diodes. In fact, there exists a crucial relationship between these three frequency components in our scheme:
and all the implementation settings in our experimental setup are based on (11).
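The chain of expressions described in this section can be sketched in a standard textbook form. This is only a hedged reconstruction, not necessarily the notation or numbering of the original equations. It assumes a lossless beam splitter with a factor $i$ on reflection, the definitions $\Delta\omega = \omega_1 - \omega_2$ and $\Delta\phi(t) = \phi_1(t) - \phi_2(t) = \Delta\phi_{\mathrm{fiber}}(t) + \Delta\phi_{\mathrm{phase}}(t)$, the shorthand $\theta(t)$, and an ideal multiplying mixer. The port index is written $k$ instead of $i$ to avoid a clash with the imaginary unit.

```latex
\begin{align*}
% Input fields with constant amplitudes (assumed phasor form)
E_k(t) &= E_k \, e^{i[\omega_k t + \phi_k(t)]}, \qquad k = 1, 2 \\
% Output fields of a lossless beam splitter (assumed phase convention)
E_{P3}(t) &= \sqrt{T}\, E_1(t) + i\sqrt{1-T}\, E_2(t), \\
E_{P4}(t) &= i\sqrt{1-T}\, E_1(t) + \sqrt{T}\, E_2(t) \\
% Shorthand for the relative phase of the two lasers
\theta(t) &= \Delta\omega\, t + \Delta\phi(t) \\
% Detected intensities after propagation delays t_1 and t_2
I_{D1}(t) &= T E_1^2 + (1-T) E_2^2 + 2\sqrt{T(1-T)}\, E_1 E_2 \sin\theta(t - t_1) \\
I_{D2}(t) &= (1-T) E_1^2 + T E_2^2 - 2\sqrt{T(1-T)}\, E_1 E_2 \sin\theta(t - t_2) \\
% Product of the two AC terms in an ideal mixer
V_{\mathrm{mix}}(t) &\propto -\sin\theta(t - t_1)\, \sin\theta(t - t_2) \\
 &= \tfrac{1}{2}\Big[\cos\big(\theta(t - t_1) + \theta(t - t_2)\big)
    - \cos\big(\theta(t - t_1) - \theta(t - t_2)\big)\Big]
\end{align*}
```

Under these assumptions the DC part of $I_{D1}(t)$ is $T E_1^2 + (1 - T) E_2^2$, as stated in the text, and the sum-phase cosine carries the $2\Delta\omega t$ term, since $\theta(t - t_1) + \theta(t - t_2) = 2\Delta\omega t - \Delta\omega (t_1 + t_2) + \Delta\phi(t - t_1) + \Delta\phi(t - t_2)$.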

Experimental Setup
We experimentally realized our scheme as shown in Figure 1; the setup is as follows. Two distributed feedback (DFB) laser diodes emit continuous wave (CW) light with center wavelengths around 1550 nm. Specifically, the center wavelengths of the two DFB laser diodes are initially set at 1549.865 nm and 1549.858 nm, respectively, which corresponds to a frequency separation of about 880 MHz. The laser output power is set slightly above the threshold to obtain the highest proportion of quantum noise [31,34,36]; it is 1.3 mW in our setup. The intensities of the signals from the two laser diodes are carefully tuned by variable optical attenuators (VOA) to keep them equal. After the signals interfere at a 50:50 beam splitter (BS), the optical signals are detected by two homemade AC-coupled photo-detectors (measurement bandwidth 100 MHz). The electric signals are mixed by a frequency mixer, and the mixed signal is then filtered by a band-pass filter with a 10-1000 MHz passband to select the frequency component of the phase noise; this is the appropriate frequency range for our implementation. The signal after the filter is sampled by an analog-to-digital converter (ADC, ADS5400, sampling frequency 100 MHz, sampling precision 12 bits, input voltage range 1.5 V peak-to-peak). Finally, a field programmable gate array (FPGA, KC705 evaluation board) performs randomness extraction and data precision adjustment. The sampling range is 1.5 V, whereas the peak-to-peak value of the noise signal is only 190 mV; hence there are approximately three unoccupied bits in the raw data, which should be eliminated at the beginning of post-processing.
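The estimate of roughly three unoccupied bits can be checked with a short calculation. The sketch below assumes an ideal 12-bit ADC with a signal centred in its input range; the function names are illustrative only and do not come from the original implementation.

```python
import math

ADC_BITS = 12          # sampling precision quoted in the text
ADC_RANGE_V = 1.5      # ADC input range, peak-to-peak
SIGNAL_PP_V = 0.190    # peak-to-peak value of the noise signal


def headroom_bits(adc_range_v, signal_pp_v):
    """Bits of ADC range left unused by the signal, as log2 of the range ratio."""
    return math.log2(adc_range_v / signal_pp_v)


def occupied_codes(adc_bits, adc_range_v, signal_pp_v):
    """Approximate number of ADC codes spanned by the signal (ideal ADC assumed)."""
    return (2 ** adc_bits) * signal_pp_v / adc_range_v


unused = headroom_bits(ADC_RANGE_V, SIGNAL_PP_V)
codes = occupied_codes(ADC_BITS, ADC_RANGE_V, SIGNAL_PP_V)
print(f"unused headroom: {unused:.2f} bits")
print(f"codes spanned by the signal: {codes:.0f} of {2 ** ADC_BITS}")
```

The ratio comes out just under three bits, so "approximately three" is a fair description, with the caveat that the signal spans slightly more than 512 codes.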
As mentioned above, we set the difference in center wavelength to about 880 MHz (0.007 nm at 1550 nm wavelength), which is higher than the bandwidth of the noise from the phase fluctuation of the laser diodes in our implementation. Note that the difference in center wavelength should not be too large, since the waveform after interference at the beam splitter may become unstable for heavily mismatched laser diode wavelengths, as shown in Figure 2. On the other hand, the frequency of the fiber jitter noise is usually no greater than 10 MHz. By placing a band-pass filter with a 10-1000 MHz passband at the output port of the mixer, one can substantially eliminate the influence of the fiber jitter term $\Delta\phi_{\mathrm{fiber}}$, as well as the center wavelength difference term $2\Delta\omega t$. Hence the phase noise term is distilled and used for the subsequent randomness extraction.
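The relation between the quoted wavelengths and the frequency separation follows from $f = c/\lambda$. The sketch below assumes the quoted values are vacuum wavelengths; the helper names are illustrative. It also checks that twice the separation, the frequency of the $2\Delta\omega t$ term, lies above the upper edge of the 10-1000 MHz passband.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def optical_frequency_hz(wavelength_nm):
    """Optical frequency for a given vacuum wavelength in nanometres."""
    return C_M_PER_S / (wavelength_nm * 1e-9)


def frequency_separation_hz(lambda_a_nm, lambda_b_nm):
    """Absolute frequency difference between two optical wavelengths."""
    return abs(optical_frequency_hz(lambda_a_nm) - optical_frequency_hz(lambda_b_nm))


PASSBAND_HZ = (10e6, 1000e6)  # band-pass filter edges quoted in the text

delta_f = frequency_separation_hz(1549.865, 1549.858)
print(f"frequency separation: {delta_f / 1e6:.0f} MHz")
print("2*delta_f above passband:", 2 * delta_f > PASSBAND_HZ[1])
```

The separation evaluates to roughly 0.87 GHz, consistent with the "around 880 MHz" figure given the rounding of the wavelengths to three decimal places.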

Post-Processing Method
The raw data, measured from the phase noise of the laser diodes, approximately follow a Gaussian probability distribution function (PDF) [53,54]. However, the random sequences required by general QRNG schemes and applications should be uniformly distributed, hence post-processing is essential. Another function of post-processing is to eliminate unwanted randomness from the environment, specifically classical noise, which may be exploited by an adversary (Eve). Generally speaking, a traditional post-processing method therefore includes two phases: entropy estimation and randomness extraction. The entropy estimation phase calculates, from the observable parameters, an upper bound on the randomness that can be extracted from the raw data, treated as a discretized random variable $X_{\mathrm{dis}}$. In the randomness extraction phase, one can then adopt various extractors to distill this amount of randomness. Depending on the devices used in the implementation and the application requirements, two efficient methods exist for QRNG post-processing, namely the m-least significant bit (m-LSB) method and the universal (Toeplitz) hashing method.
Figure 2. The waveform at the photo-detectors in the time domain, recorded and displayed by an oscilloscope (DSAX-91604A, Agilent). The two lower traces are the input signals of the frequency mixer, and the upper trace, shown at a smaller scale, is the output signal of the mixer. (a) The stable waveform obtained with an appropriate choice of center wavelengths. The waveform is very stable during the test, and a sine-shaped phase noise signal is distilled by the frequency mixer and band-pass filter. (b) The irregular waveform obtained when the difference in center wavelength is too large to maintain a regular shape; the signal after the mixer is consequently also very unstable.
The former, the m-LSB method, is a deterministic extractor. It is extremely simple to implement in both hardware and software, and can run at very high sampling and generation rates. One simply truncates the raw data, keeps the last m bits of each sample, and outputs the resulting sequence (a logical exclusive OR (XOR) operation can optionally be added if necessary). The least significant bits (LSB) are taken instead of the most significant bits (MSB) because the LSB have a more uniform distribution and a lower auto-correlation coefficient after post-processing, and are thus more difficult for the adversary to predict. This method is quite effective if the implementation is trusted and has a relatively high quantum-to-classical noise ratio (QCNR). Even in untrusted-device scenarios, one can still extract several bits given a high sampling resolution.
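The truncation step of the m-LSB method can be sketched in a few lines. This is a minimal software illustration, not the FPGA implementation; the function name and the MSB-first output order are assumptions, and the optional XOR step is omitted.

```python
def m_lsb_extract(samples, m):
    """Keep the m least significant bits of each integer sample.

    Returns a flat list of bits, most significant of the kept bits first.
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    mask = (1 << m) - 1
    bits = []
    for sample in samples:
        low = sample & mask  # discard everything above bit m-1
        bits.extend((low >> k) & 1 for k in reversed(range(m)))
    return bits


# Example: three 12-bit ADC codes, keeping m = 6 bits per sample.
raw = [0b101101110011, 0b000000111111, 0b111111000000]
extracted = m_lsb_extract(raw, 6)
print(extracted)

# Output rate for a given sampling rate: bits kept per sample times samples per second.
SAMPLING_RATE_HZ = 100e6
print(f"{SAMPLING_RATE_HZ * 6 / 1e6:.0f} Mb/s")
```

Because the operation is a fixed bit mask per sample, it maps directly onto wiring in hardware, which is why it can keep pace with the sampling rate.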
We adopt the m-LSB method by treating our implementation as a trusted-device scenario, following the analysis in [51]. It is secure for m-LSB to truncate four bits out of a 16-bit discretized signal in noise-free cases, and secure to truncate five (seven) bits out of 16 bits when the deviation of the classical noise $\sigma_E$ is three (four) times larger than that of the quantum noise $\sigma_Q$. In fact, quantum noise is dominant in our implementation, so truncating a moderate six bits from the raw data (including the three unoccupied most significant bits) to form the extracted sequence is adequately conservative for our scheme. Therefore, the generation rate with the m-LSB method is 100 × 6 = 600 Mb/s.
The universal hashing method is another post-processing method often chosen in QRNG schemes. It is a seeded extractor, which means that it consumes a short random seed to generate the universal hashing function. Among these functions, the Toeplitz matrix is an outstanding solution because of its low complexity in computation and implementation. For a binary Toeplitz matrix used for QRNG post-processing, the size of the original matrix is $M \times N$, where $N$ is the size of the raw data and $M$ is the size of the extracted sequence. The ratio $M/N$ is a crucial parameter that is closely related to the min-entropy $H_\infty(X)$ calculated in the entropy estimation phase:
The output sequence of Toeplitz hashing is almost unique for a given input, since the probability that a different input shares the same output is only $2^{-M+1} N$. According to information theory, the security parameter $\varepsilon$ should satisfy the following relation (13):
where $\varepsilon$ also indicates the distance between the output sequence and an ideal uniform sequence. Therefore, by designing a Toeplitz matrix with a ratio $M/N$ slightly smaller than the min-entropy and adjusting the matrix size, the security parameter $\varepsilon$ can be made arbitrarily close to zero. Note that generating the Toeplitz matrix consumes $N + M - 1$ seed bits.
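Toeplitz hashing as described above can be sketched as a matrix-vector product over GF(2). The code below is an unoptimized illustration under assumptions: the function name, the seed indexing convention (entry at row i, column j taken from `seed_bits[i - j + n - 1]`) and the use of Python's `secrets` module for the seed are choices made here, not details of the original FPGA design.

```python
import secrets


def toeplitz_hash(raw_bits, seed_bits, m):
    """Multiply an m x n binary Toeplitz matrix by raw_bits over GF(2).

    The matrix is defined by n + m - 1 seed bits; entry (i, j) depends only
    on i - j, so every diagonal is constant.
    """
    n = len(raw_bits)
    if len(seed_bits) != n + m - 1:
        raise ValueError("seed must contain exactly n + m - 1 bits")
    out = []
    for i in range(m):
        acc = 0
        for j in range(n):
            acc ^= seed_bits[i - j + n - 1] & raw_bits[j]
        out.append(acc)
    return out


# Small demonstration with a 4 x 8 matrix (the text uses 1536 x 3072).
n, m = 8, 4
seed = [secrets.randbits(1) for _ in range(n + m - 1)]
raw = [secrets.randbits(1) for _ in range(n)]
print(toeplitz_hash(raw, seed, m))
```

The seed length of n + m - 1 bits matches the count given in the text, because the first row and first column together determine every diagonal of the matrix.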
Since the data size in QRNG systems is huge, post-processing should be run block by block: $N = Bn$, $M = Bm$, where $m \times n$ is the size of the practical Toeplitz matrix and $B$ is the number of blocks. After discarding the unoccupied bits in the raw data, the min-entropy in our system is 6.60 bits/sample. We therefore set our Toeplitz matrix to a moderate size of 1536 × 3072 and run the post-processing block by block, which means that 3072 ÷ 12 = 256 consecutive samples are collected and processed in one Toeplitz hashing operation. The security parameter is $\varepsilon = 7.6 \times 10^{-24}$ in our implementation. Since the extraction ratio set by the Toeplitz matrix is also 50%, the generation rate with the Toeplitz hashing method is also 600 Mb/s.
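The quoted numbers can be cross-checked with a short calculation. The sketch assumes the leftover hash lemma in its common form, output length = total min-entropy − 2 log2(1/ε); this form is an assumption here, since the text does not spell out the relation it uses. The variable and function names are illustrative.

```python
SAMPLE_BITS = 12                 # ADC resolution
INPUT_BITS = 3072                # Toeplitz matrix columns (n)
OUTPUT_BITS = 1536               # Toeplitz matrix rows (m)
MIN_ENTROPY_PER_SAMPLE = 6.60    # bits/sample, from the text
SAMPLING_RATE_HZ = 100e6


def lhl_epsilon(total_min_entropy_bits, output_bits):
    """Security parameter from the leftover hash lemma (assumed form)."""
    return 2.0 ** (-(total_min_entropy_bits - output_bits) / 2.0)


samples_per_block = INPUT_BITS // SAMPLE_BITS
block_min_entropy = samples_per_block * MIN_ENTROPY_PER_SAMPLE
epsilon = lhl_epsilon(block_min_entropy, OUTPUT_BITS)
seed_bits = INPUT_BITS + OUTPUT_BITS - 1
rate_bps = SAMPLING_RATE_HZ * SAMPLE_BITS * OUTPUT_BITS / INPUT_BITS

print(f"samples per block: {samples_per_block}")
print(f"epsilon: {epsilon:.2e}")
print(f"seed length: {seed_bits} bits")
print(f"generation rate: {rate_bps / 1e6:.0f} Mb/s")
```

Under this assumption the calculation reproduces a security parameter of about 7.6 × 10^-24 and a rate of 600 Mb/s, which is consistent with, though not proof of, this being the relation used.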

Test Results
The test is divided into two parts: the 3σ criterion and widely used test batteries such as DIEHARD and NIST-STS. Uniformity tests are also included in the batteries. First, data are randomly chosen for the 3σ test in order to compare the raw data with the extracted numbers. For the raw data, the sample size equals the sampling depth of 50 M samples with 12-bit resolution in our implementation. For the extracted sequence, the sample size is 420 Mbits, which corresponds to 0.7 s of consecutive data after post-processing. The results are shown in Figures 3 and 4. The post-processing methods markedly reduce the low-order auto-correlation coefficients. Such correlation arises because the limited bandwidth of the devices can correlate adjacent samples of the raw data, particularly in oversampled scenarios, where the trade-off between sampling rate and auto-correlation must be handled carefully in practical devices.
Figure 4. Auto-correlation coefficient $a_k$ of the extracted sequences. (a) Auto-correlation for different self-delay values k, ranging from 1 to 300 bits. (b) First-order auto-correlation coefficient $a_1$ for different sequence lengths, ranging from 1 Mbit to 420 Mbits. The dashed line indicates the reference calculated by the 3σ criterion. In contrast to the test of the raw data, the parameter shows no bias and does not significantly exceed the reference threshold.
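The auto-correlation check against a 3σ reference can be sketched as follows. The estimator below and the reference $3/\sqrt{n}$ (the approximate three-standard-deviation bound for an uncorrelated sequence of length n) are common choices assumed here; the text does not state the exact formulas it uses. The demonstration data come from Python's pseudo-random generator purely as a stand-in for an extracted sequence.

```python
import math
import random


def autocorrelation(bits, k):
    """Sample auto-correlation coefficient a_k of a sequence at lag k."""
    total = len(bits)
    if not 0 < k < total:
        raise ValueError("lag k must satisfy 0 < k < len(bits)")
    mean = sum(bits) / total
    var = sum((b - mean) ** 2 for b in bits) / total
    if var == 0:
        raise ValueError("a constant sequence has undefined auto-correlation")
    pairs = total - k
    cov = sum((bits[i] - mean) * (bits[i + k] - mean) for i in range(pairs)) / pairs
    return cov / var


def three_sigma_bound(n):
    """Approximate 3-sigma reference for |a_k| of an uncorrelated sequence."""
    return 3.0 / math.sqrt(n)


# Stand-in data: pseudo-random bits, not QRNG output.
rng = random.Random(2024)
bits = [rng.getrandbits(1) for _ in range(100_000)]
a1 = autocorrelation(bits, 1)
print(f"a_1 = {a1:+.5f}, 3-sigma reference = {three_sigma_bound(len(bits)):.5f}")
```

The reference shrinks as the sequence grows, which is why the threshold line in a plot of $a_1$ against sequence length narrows with increasing length.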
To evaluate randomness, we also adopt widely accepted randomness test batteries, namely the DIEHARD and NIST-STS tests. Both are statistical hypothesis tests of randomness, in which each sub-test yields one or more p-values indicating whether to accept or reject the hypothesis. Generally speaking, if all the final p-values lie within [0.01, 0.99] (with the default significance level α = 0.01), the whole test is considered successful. Random sequences extracted by either the m-LSB or the Toeplitz hashing method should pass both test batteries, and we choose one typical sequence from the Toeplitz hashing method as an example: the result of the DIEHARD test is shown in Table 1, and the result of the NIST-STS test is shown in Table 2.
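As an illustration of how a single sub-test produces a p-value, the frequency (monobit) test from the NIST statistical test suite can be sketched as below. This is only one of the many sub-tests in the batteries, written here from its published definition; the acceptance-band helper mirrors the [0.01, 0.99] criterion in the text, and both function names are illustrative.

```python
import math


def monobit_p_value(bits):
    """P-value of the frequency (monobit) test for a sequence of 0/1 values."""
    n = len(bits)
    if n == 0:
        raise ValueError("sequence must not be empty")
    s_n = sum(1 if b else -1 for b in bits)   # map 0 -> -1, 1 -> +1 and sum
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2.0))


def in_acceptance_band(p_value, low=0.01, high=0.99):
    """True if the p-value lies inside the acceptance interval."""
    return low <= p_value <= high


example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
p = monobit_p_value(example)
print(f"p-value = {p:.6f}, accepted = {in_acceptance_band(p)}")
```

A real evaluation runs the full batteries on sequences many orders of magnitude longer than this ten-bit example, which is used here only to show the arithmetic.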

Conclusions
We proposed and experimentally realized a QRNG scheme with a novel structure that utilizes quantum noise from the phase fluctuation of laser diodes. Optical signals from two laser diodes with very close center wavelengths interfere at a beam splitter before being detected by AC-coupled photo-detectors. The electric signals from the two detectors are then combined in a frequency mixer. After analyzing the frequency components, we pointed out that there are three dominant frequency terms in the noise: the difference in center wavelength of the lasers, the phase fluctuation of the lasers, and the fiber jitter of the system. Because these components occupy different frequency ranges, we found it possible to substantially eliminate the unwanted terms with a well-chosen band-pass filter before extracting randomness from the phase noise term. We use two conventional post-processing methods, the m-least significant bit (m-LSB) method and the universal (Toeplitz) hashing method, to distill randomness from the electric signal, and realize a generation rate of 600 Mb/s on hardware, which is six times the sampling rate.
Our scheme has three major merits. First, the structure is delay-line free, so the space for a delay-line in a practical system can be removed. Second, the requirements on the laser diodes are not strict: one only needs to ensure that the center wavelengths of the lasers are close enough and to operate them with a power stabilization module, rather than a frequency stabilization module that might otherwise be needed. Third, the post-processing methods, either m-LSB or universal hashing, can be realized on hardware in real time. These merits give our scheme strong potential as a compact QRNG system.
We admit that the generation rate of our scheme is relatively low compared with current schemes. However, this is because we built our scheme in a very conservative way, and this work is only a demonstration. The bandwidth of the photo-detectors and the sampling rate are both set at 100 MHz, whereas the devices used in major phase noise schemes are one or two orders of magnitude faster. Since the generation rate is mainly limited by the detectors, there is still considerable room for improvement. In fact, our colleagues have realized balanced detectors with bandwidth over 1 GHz [55]. With this technique, the generation rate of our scheme could potentially be increased to 6 Gbps, which is the level of a previously proposed QRNG scheme based on vacuum fluctuation [44]. Furthermore, the corresponding post-processing can also be made more efficient, for example by carefully setting the driving current of the laser diode to achieve a higher quantum-to-classical noise ratio, which could lead to an even tighter bound on the min-entropy in the Toeplitz hashing method.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

QRNG      Quantum random number generator
QKD       Quantum key distribution
ADC       Analog-to-digital converter
MSB/LSB   Most/least significant bit