Provably Secure Symmetric Private Information Retrieval with Quantum Cryptography

Private information retrieval (PIR) is a database query protocol that provides user privacy in that the user can learn a particular entry of the database of his interest but his query would be hidden from the data centre. Symmetric private information retrieval (SPIR) takes PIR further by additionally offering database privacy, where the user cannot learn any additional entries of the database. Unconditionally secure SPIR solutions with multiple databases are known classically, but are unrealistic because they require long shared secret keys between the parties for secure communication and shared randomness in the protocol. Here, we propose using quantum key distribution (QKD) instead for a practical implementation, which can realise both the secure communication and shared randomness requirements. We prove that QKD maintains the security of the SPIR protocol and that it is also secure against any external eavesdropper. We also show how such a classical-quantum system could be implemented practically, using the example of a two-database SPIR protocol with keys generated by measurement device-independent QKD. Through key rate calculations, we show that such an implementation is feasible at the metropolitan level with current QKD technology.


Introduction
With rising concern over personal data privacy, users of digital services may not want their preferences or selections to be revealed to service providers. This can be achieved with private information retrieval (PIR), where a user can access specific entries of a database held by the service provider at a data centre without revealing which entries were selected [1]. This cryptographic technique has found application in areas such as anonymous communication [2] and protecting user location privacy in location-based services [3].
However, on some occasions, the service provider or data centre may not want to reveal more information about the database than is necessary, i.e., more than what should have been given to the user. Such a setting is common in pay-per-access platforms such as iTunes and Google Play, or in more sensitive environments where the service provider has to secure the information of other database entries, as in medical records retrieval and biometric authentication [4]. To provide for this additional security requirement (i.e., database privacy), one may employ symmetric private information retrieval (SPIR), a two-way secure retrieval scheme first introduced by Gertner et al. [5].
In the literature, both PIR and SPIR have been extensively studied in the case where the user communicates with only one data centre. For PIR in this setting, unconditional security (or information-theoretic security) can only be achieved by communicating the entire database from the data centre to the user. This implies that information-theoretic single-database SPIR is not achievable [1]. To overcome this impasse, researchers have looked to weaker security frameworks, for instance, those based on computational security [6][7][8][9].
On the quantum front, there is a similar conclusion for single-database SPIR [10], i.e., it is not possible to achieve information-theoretic security even in the quantum setting. In light of these negative results, protocols for SPIR have largely evolved into cheat-sensitive protocols, also known as quantum private query [11]. Examples of these protocols include those based on quantum oblivious key distribution [12][13][14][15][16], those based on sending states to a database oracle [17,18], and those based on the round-robin QKD protocol [19]. In these protocols, the parties are averse to being caught cheating, so cheat-detection strategies allow one to construct protocols with more relaxed conditions than those of SPIR [20]. However, parties can still gain information by cheating in these protocols, and thus the protocols do not satisfy the original security requirements of SPIR proposed in Ref. [5]. Other attempts at avoiding the no-go results include using special relativity [21,22].
One way to achieve information-theoretic security for SPIR is to communicate with multiple data centres, each of which holds a copy of the database. In fact, in their seminal work, Gertner et al. introduced a $k$-database classical SPIR protocol that is information-theoretically secure, under the assumption that the data centres cannot communicate (during and after the protocol), and showed how such protocols can be built from $k$-database PIR protocols [5]. Since then, researchers have studied multi-database SPIR in the context of compromised and byzantine data centres [23]. With multiple databases, the communication complexity of PIR and SPIR protocols can also be reduced to $O(n^{1/(2k-1)})$ based on Gertner's original protocol [5], and even further to $O(n^{10^{-7}})$ by Yekhanin [24], where $n$ is the number of entries in the database. There have also been several studies on the quantum version of multi-database SPIR. Kerenidis et al. focus on how SPIR can be performed without shared randomness if the user is honest [25]. Song et al. proposed a quantum multi-database SPIR protocol, but it requires shared entanglement between the data centres and assumes secure classical and quantum channels [26].
The classical multi-database SPIR protocols proposed require secure channels, which are not achievable without some pre-shared secret keys between the parties in the protocol. In principle, the secret keys should be as long as the messages to be exchanged, but this would be costly and impractical for applications that work with large databases or require multiple uses of the SPIR protocol, e.g., medical records queries where each doctor has to query for the files of multiple patients. In practice, the standard approach is to use public-key cryptography (e.g., the Diffie-Hellman key distribution protocol [27]) to expand the initial pre-shared secret key to a longer key. However, this approach could be risky, for it has been demonstrated that most known key distribution schemes based on public-key cryptography are insecure against attacks based on quantum computing (an emerging technology). This can be a serious problem for applications that require long-term security, such as electronic health records, which typically require decades of information confidentiality.
Quantum key distribution (QKD), a relatively mature technology with multiple companies already selling commercial QKD devices, offers a solid and promising solution to the above as it provides an information-theoretic method to expand pre-shared secret keys [28,29]. As such, the expanded keys can withstand the threats of quantum computing based attacks, and any other yet-to-be-discovered algorithmic advancements. Moreover, the expansion of keys need not be performed in real time, i.e., expanded keys can be used for future SPIR runs. It is important to emphasise here that QKD cannot lead to a perfectly secure channel in practice, for it involves some statistical and entropy estimation procedures which carry overhead penalties in the security. Fortunately, these penalties can be made arbitrarily small with a proper security analysis, and subsequently the resulting secure channel can be made arbitrarily close to a perfect one. It is the goal of our work to incorporate these technical subtleties into the original security definition of SPIR so that we can add QKD as a supporting base layer. Here, we see the QKD layer as one that provides the necessary secret keys and secure channels (using one-time pad encryption) for SPIR. We note that Quantum Secure Direct Communication, which transmits messages directly using quantum states, could also serve as a secure communication channel [30][31][32].

Figure caption: Schematic of a quantum key distribution (QKD) network with star topology, which can supply QKD keys for the symmetric private information retrieval (SPIR) protocol. The central node (hub) connects to the user and two data centres with optical fibre (solid lines). Using the physical connection, any two parties in the protocol can establish a secure QKD link (dotted lines) via the central node.
In this work, we describe how QKD can be used to relax the requirement of perfectly secure channels in classical multi-database SPIR, and examine the resources required for such a protocol. In Section 2, we introduce the mathematical notation required to describe the protocol and security analysis. In Section 3, we introduce the basic elements of a generic SPIR protocol and the original SPIR security definition. In Section 4, we introduce QKD channels and their security definitions, generalise the SPIR definition to a quantum one, and show how QKD can be incorporated into SPIR as the communication channel. In Section 5, we prove the security of a multi-database SPIR protocol with QKD channels based on the revised SPIR definitions. In Section 6, we introduce MDI-QKD and perform a numerical analysis to determine the resources required for MDI-QKD to obtain the desired SPIR protocol.

Quantum and Classical Systems
The state of a generic quantum system living in Hilbert space $A$ is represented by a density operator $\rho_A$, a positive semi-definite matrix with trace one. Classical systems are modelled by quantum systems whose state is diagonal in a given orthonormal basis. For a random variable $Y$ that takes on values $y \in \mathcal{Y}$ with probability $P_Y(y) = \Pr[Y = y]$, the corresponding state of the classical random variable is

$$\rho_Y = \sum_{y \in \mathcal{Y}} P_Y(y)\, |y\rangle\langle y|,$$

where $\{|y\rangle\}_{y \in \mathcal{Y}}$ forms an orthonormal basis. To keep the above notation compact for multiple variables, we will sometimes use $\Pi_{XYZ}(xyz)$ to represent the tensor product of classical states, i.e., $|x\rangle\langle x| \otimes |y\rangle\langle y| \otimes |z\rangle\langle z|$.
A bipartite system on $YA$ is called classical-quantum if its state admits the form

$$\rho_{YA} = \sum_{y \in \mathcal{Y}} P_Y(y)\, |y\rangle\langle y| \otimes \rho_A^{y},$$

where $\rho_A^{y}$ is the state of $A$ conditioned on the event $Y = y$.

Trace Distance and Distinguishability
To measure the distinguishability of two quantum systems, we use the trace distance measure, which for any two states $\rho$ and $\sigma$ is defined as

$$\Delta(\rho, \sigma) = \frac{1}{2}\,\|\rho - \sigma\|_1,$$

where $\|\rho - \sigma\|_1$ is the trace norm of $\rho - \sigma$. Notice that the trace distance is bounded between 0 and 1, with identical states giving 0 and completely orthogonal states giving 1. With this, two systems are said to be $\varepsilon$-close if their states, $\rho$ and $\sigma$, satisfy $\Delta(\rho, \sigma) \leq \varepsilon$. The trace distance measure admits a few properties: (1) it satisfies the triangle inequality, i.e., for any $\rho$, $\sigma$, and $\tau$, it satisfies $\Delta(\rho, \sigma) \leq \Delta(\rho, \tau) + \Delta(\tau, \sigma)$; (2) it is jointly convex in its inputs, i.e., $\Delta\big(\sum_i p_i \rho_i, \sum_i p_i \sigma_i\big) \leq \sum_i p_i\, \Delta(\rho_i, \sigma_i)$; and (3) it is non-increasing under completely positive and trace-preserving (CPTP) maps $\mathcal{E}$, i.e., $\Delta(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \leq \Delta(\rho, \sigma)$. For classical random variables $Y_1$ and $Y_2$ that take on values $y \in \mathcal{Y}$ with probability distributions $P_{Y_1}$ and $P_{Y_2}$, the trace distance of their probability distributions reduces to the classical definition,

$$\Delta(P_{Y_1}, P_{Y_2}) = \frac{1}{2} \sum_{y \in \mathcal{Y}} \big| P_{Y_1}(y) - P_{Y_2}(y) \big|.$$

If the random variables $Y_1$ and $Y_2$ correspond to the measurement outcomes when performing a POVM measurement $\{\Gamma_y\}_{y \in \mathcal{Y}}$ on states $\rho$ and $\sigma$ respectively, the trace distance of the probability distributions of $Y_1$ and $Y_2$ is upper bounded by the trace distance of the original quantum states [38], i.e.,

$$\Delta(P_{Y_1}, P_{Y_2}) \leq \Delta(\rho, \sigma).$$
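As a small numerical illustration of the classical case, the following Python sketch computes the trace distance between two probability distributions and checks the triangle inequality and joint convexity on toy inputs. The function names and the example distributions are illustrative assumptions, not part of the protocol.

```python
from fractions import Fraction

def trace_distance(p, q):
    """Classical trace distance between two distributions given as
    dicts mapping outcomes to probabilities (hypothetical helper)."""
    outcomes = set(p) | set(q)
    return sum(abs(p.get(y, 0) - q.get(y, 0)) for y in outcomes) / 2

def mix(weights, dists):
    """Convex mixture sum_i weights[i] * dists[i]."""
    out = {}
    for weight, dist in zip(weights, dists):
        for y, pr in dist.items():
            out[y] = out.get(y, 0) + weight * pr
    return out

# Toy distributions (assumed values, exact arithmetic via Fraction).
half = Fraction(1, 2)
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
r = {"c": Fraction(1)}

d_pq = trace_distance(p, q)  # partially overlapping distributions
d_pr = trace_distance(p, r)  # disjoint supports give the maximum value

# Property (1): triangle inequality.
triangle_ok = d_pr <= d_pq + trace_distance(q, r)

# Property (2): joint convexity for an equal-weight mixture.
lhs = trace_distance(mix([half, half], [p, r]), mix([half, half], [q, r]))
rhs = half * trace_distance(p, q) + half * trace_distance(r, r)
convex_ok = lhs <= rhs
```

Exact rational arithmetic is used here only so that the inequalities can be checked without floating-point tolerance.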

Generic One-Round SPIR Protocol
In this section, we introduce some additional notation and the essential elements of a generic SPIR protocol. A multi-database SPIR protocol has a user $U$, who interacts with $k \geq 2$ data centres $D_j$, $j \in \{1, \ldots, k\}$, each having a copy of the database, represented by $W$ with $n$ entries. For simplicity, we focus on databases with single-bit entries, i.e., $W = (W_1, W_2, \ldots, W_n) \in \{0, 1\}^n$; our analysis can be easily extended to multi-bit entries.
We also assume that all parties are equipped with a secure random number generator, which they may use for cryptography purposes. For our analysis, we denote the user's local randomness by R.
Here, we focus on one-round SPIR protocols, where there is only one round of queries from the user to the data centres, and a single round of replies from the data centres to the user. In the case of multi-round SPIR protocols, there can be multiple successive rounds of queries and answers. A one-round SPIR protocol for two data centres can thus be defined by a pair of query functions, $f_{\mathrm{query},1}$ and $f_{\mathrm{query},2}$, to generate the user queries for data centre 1 and data centre 2, respectively, answer functions $f_{\mathrm{ans},1}$ and $f_{\mathrm{ans},2}$ for the data centres to generate their responses to the queries received, and the decoding function $f_{\mathrm{dec}}$ for the user to retrieve the desired database entry, $W_X$. These are functions of random variables and hence their outputs are random variables as well.
A generic one-round two-database SPIR protocol typically performs the following steps (summarised in Table 1) for a given input $X = x$ and database $W = w$:

1. Establishing secure channels: Using pre-established secret keys, perfectly secure channels are established between the user and data centres using one-time pad (OTP) encryption. We use $(K_1, K_2)$, $(K_3, K_4)$, $(K_5, K_6)$ to represent the secret key pair between data centre 1 and the user, between data centre 2 and the user, and between the data centres, respectively. For example, with this arrangement, the user holds $K_2$ and $K_4$ and data centre 1 holds $K_1$ and $K_5$. Secure channels connecting the user and data centres are denoted by $C_{U1}$ and $C_{U2}$, respectively. Note that the data centres are not allowed to communicate and hence we do not need to define any channel for them. To allow for two-way secure communication with a single secret key, we split $K = (K^{\mathrm{enc}}, K^{\mathrm{dec}})$ into two halves, namely $K^{\mathrm{enc}}$ (for encryption) and $K^{\mathrm{dec}}$ (for decryption).
2. Query: The user generates queries for data centres 1 and 2, with $Q_1 = f_{\mathrm{query},1}(x, R)$ and $Q_2 = f_{\mathrm{query},2}(x, R)$, respectively, and sends them to the data centres using the secure channels $C_{U1}$ and $C_{U2}$.
3. Answer: Upon receiving the query $\tilde{Q}_1$ (which could be different from $Q_1$), $D_1$ (resp. $D_2$) determines a reply $A_1 = f_{\mathrm{ans},1}(\tilde{Q}_1, w, K_5)$ (resp. $A_2 = f_{\mathrm{ans},2}(\tilde{Q}_2, w, K_6)$) and sends it to the user via the secure channels.
4. Retrieval: The user retrieves the desired database entry value using $\hat{w}_x = f_{\mathrm{dec}}(\tilde{A}_1, \tilde{A}_2, \ldots)$.

SPIR is designed to resolve situations where the user or data centres deviate from their expected (honest) behaviour. For instance, a dishonest user could communicate bad queries in an attempt to learn additional entries in $w$, and dishonest data centres could provide replies other than the expected answer $A_j$ to learn about $x$. That is, a dishonest user can replace $Q_j$ in step 2 of the protocol by an adversarial query $\bar{Q}_j$, and dishonest data centres can provide adversarial answers $\bar{A}_j$ in step 3 of the protocol. Therefore, a secure SPIR protocol has to address both forms of attack.

At the heart of multi-database SPIR is the availability of pre-shared secret keys, which are pre-distributed between the user and the data centres. With these pairwise secret keys, the user can securely send the queries $Q_1$ and $Q_2$ to the respective data centres, such that neither of the data centres can get both queries at the same time. Then, by also not allowing the data centres to communicate, one can enforce that neither of them can guess $x$ correctly. Crucially, the use of secure channels also guarantees that no eavesdropper can get both $Q_1$ and $Q_2$ and hence $x$. These arguments collectively imply user privacy.
In the answer phase, it is important that the data centres do not reveal more than what is supposed to be given to the user. To achieve this, Gertner et al. [5] introduced the task of conditional disclosure of secrets (CDS). This is broadly described by a three-party task, where Alice and Bob, with inputs $y$ and $z$ respectively, are supposed to reveal a common secret $s$ to Charlie if and only if $y$ and $z$ satisfy a certain public predicate $f(y, z)$. Using this task, one can draw an immediate connection: $Q_1$ and $Q_2$ correspond to $y$ and $z$, respectively, and the common secret is the desired database entry $w_x$. For CDS to work, some private shared randomness between the data centres is necessary, and this is exactly given by the secret key pair $(K_5, K_6)$. These arguments thus imply that the user cannot get the correct secret if the queries are not the expected ones, which in turn provides the required database privacy.

Table 1. Generic one-round two-database SPIR protocol.
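To make the roles of the query, answer, and decoding functions concrete, here is a minimal Python sketch of one protocol run over OTP-encrypted channels. It is an illustration only: it uses the textbook XOR-based two-server retrieval idea as a stand-in, not the CDS-based construction of Ref. [5], and the shared mask bit merely shows where the data centres' shared key enters. In particular, this toy version does not by itself give database privacy against a dishonest user. All function names and the key layout are assumptions made for the example.

```python
import secrets

def rand_bits(m):
    """m uniformly random bits from the local random number generator."""
    return [secrets.randbelow(2) for _ in range(m)]

def otp(bits, key):
    """One-time pad on bit lists: XOR each message bit with a key bit."""
    assert len(bits) == len(key)
    return [b ^ k for b, k in zip(bits, key)]

def f_query(x, r):
    """Toy query functions: Q1 is a random indicator vector r,
    Q2 is the same vector with position x flipped."""
    q2 = list(r)
    q2[x] ^= 1
    return list(r), q2

def f_ans(q, w, k):
    """Toy answer function: XOR of the selected database bits,
    masked with the bit k shared between the data centres."""
    a = k
    for qi, wi in zip(q, w):
        a ^= qi & wi
    return a

def f_dec(a1, a2):
    """Toy decoding: the shared mask and common selections cancel."""
    return a1 ^ a2

def run(x, w):
    n = len(w)
    r = rand_bits(n)                                  # user's randomness R
    k_u1, k_u2 = rand_bits(n + 1), rand_bits(n + 1)   # user/data-centre keys
    k_12 = secrets.randbelow(2)                       # data centres' shared key
    q1, q2 = f_query(x, r)
    # Step 2: queries are encrypted with the first n key bits.
    c1, c2 = otp(q1, k_u1[:n]), otp(q2, k_u2[:n])
    # Step 3: data centres decrypt, answer, and encrypt with the last key bit.
    a1 = f_ans(otp(c1, k_u1[:n]), w, k_12) ^ k_u1[n]
    a2 = f_ans(otp(c2, k_u2[:n]), w, k_12) ^ k_u2[n]
    # Step 4: the user decrypts both answers and decodes.
    return f_dec(a1 ^ k_u1[n], a2 ^ k_u2[n])
```

Each data centre sees a uniformly random indicator vector on its own, which is the intuition behind user privacy when the data centres cannot communicate.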

Original SPIR Security Definition
At this point, it is useful to recap the original security definitions introduced by Gertner et al. [5]. A SPIR protocol is said to be secure if it satisfies the correctness, user privacy, and database privacy conditions. Since the setting here is purely classical, we assume that the output views are simply represented by random variables. More concretely, the view of the user is modelled by the random variable $V_U^{w}$, and the view of data centre $j$ is modelled by $V_{D_j}^{x}$, for $j = 1, 2$, where the dependence of $V_U$ (resp. $V_{D_j}$) on $w$ (resp. $x$) is explicitly labelled. Evidently, $V_U$ also contains the query information, $Q_1$ and $Q_2$, and the communicated answers $\tilde{A}_1$ and $\tilde{A}_2$, while $V_{D_j}$ contains $\tilde{Q}_j$ and $A_j$, for example.

Definition 1 (Correctness). When all the parties in the protocol are honest, then for any database query $x$ and database $w$, the protocol outputs $\hat{w}_x = w_x$.

Definition 2 (User Privacy). When the user is honest, then for any $w$ and $k_5$ (or $k_6$), and for all $x$ and $x'$, each data centre's view satisfies

$$\Delta\big(V_{D_j}^{x}, V_{D_j}^{x'}\big) = 0.$$

Definition 3 (Database Privacy). When the data centres are honest, then for any $x$ and $r$, there exists an $x'$ such that for all $w$ and $w'$ with $w_{x'} = w'_{x'}$, the view of the user satisfies

$$\Delta\big(V_U^{w}, V_U^{w'}\big) = 0.$$

The definition of correctness ensures that the protocol yields the desired result $w_x$ for the user. For user privacy, the trace distance is used as a metric for the distinguishability of the views. To see this, consider a hypothetical experiment where the data centre is given one of two views, $V_{D_j}^{x}$ or $V_{D_j}^{x'}$, chosen at random, and has to determine which view it was given. Its maximum probability of guessing the identity correctly is directly linked to the trace distance, i.e.,

$$\frac{1}{2}\Big(1 + \Delta\big(V_{D_j}^{x}, V_{D_j}^{x'}\big)\Big).$$

From this expression, it is clear that the trace distance quantifies the advantage the data centre has in distinguishing between $V_{D_j}^{x}$ and $V_{D_j}^{x'}$. Hence, having zero advantage in distinguishing between a system with $x$ and one with $x'$ indicates that the data centre can gain no information about $X$.

For database privacy, a dishonest user can input any $x$, since the adversarial queries $\bar{Q}_1$ and $\bar{Q}_2$ need not depend on this particular choice of $x$. For instance, a dishonest user can use the local randomness $R$ to choose queries $\bar{Q}_1$ and $\bar{Q}_2$ that correspond to queries for different $x$. For each $r$ (i.e., each possible choice of queries), the information that the user truly intends to learn is implicitly carried by $\bar{Q}_1$ and $\bar{Q}_2$. Therefore, the existence of an $x'$ such that the user cannot distinguish between $w$ and $w'$ satisfying $w_{x'} = w'_{x'}$ for each $r$ means that the user is unable to obtain any information beyond a single entry of the database, $w_{x'}$, for whichever queries are randomly selected for that run.
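The link between trace distance and guessing probability can be checked on toy views. The Python sketch below assumes classical views given as probability distributions over observed transcripts and evaluates the optimal guessing probability $\frac{1}{2}(1 + \Delta)$; the helper names and the example distributions are hypothetical.

```python
from fractions import Fraction

def trace_distance(p, q):
    """Classical trace distance between two distributions (dicts)."""
    outcomes = set(p) | set(q)
    return sum(abs(p.get(y, 0) - q.get(y, 0)) for y in outcomes) / 2

def optimal_guess_probability(view_a, view_b):
    """Best probability of identifying which of two equally likely
    views was handed over, using the relation (1 + Delta) / 2."""
    return (1 + trace_distance(view_a, view_b)) / 2

# A view that does not depend on x: guessing is no better than a coin flip.
private_x = {"q": Fraction(1, 2), "q'": Fraction(1, 2)}
private_x_prime = {"q": Fraction(1, 2), "q'": Fraction(1, 2)}
guess_private = optimal_guess_probability(private_x, private_x_prime)

# A leaky view (assumed values): the transcript is correlated with x.
leaky_x = {"q": Fraction(3, 4), "q'": Fraction(1, 4)}
leaky_x_prime = {"q": Fraction(1, 4), "q'": Fraction(3, 4)}
guess_leaky = optimal_guess_probability(leaky_x, leaky_x_prime)
```

In the first case the advantage over random guessing is zero, matching the requirement in the user privacy definition; in the second, the advantage equals half the trace distance.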

QKD Channel
As mentioned above, our goal is to replace the perfectly secure communication channels assumed in multi-database SPIR with QKD channels. Before going into more details, it is useful to first recap the essential features of QKD and its security definitions.
The goal of QKD is to generate a pair of secure keys which are identical, uniform and secret from any eavesdropper. In this setting, we assume that the underlying QKD devices are honest and they each have a trusted local source of randomness. Below, we use random variable S instead of K to represent QKD keys.
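A generic QKD protocol between party $A$ and party $B$ can either succeed in producing a pair of keys, $S_A, S_B \in \mathcal{S}$ (with probability $1 - p_\perp$), or abort and output an abort flag, $S_A = S_B = \perp$ (with probability $p_\perp$). The average output state of a QKD protocol is hence given by

$$\rho^{\mathrm{real}}_{S_A S_B E} = p_\perp\, \Pi_{S_A S_B}(\perp\perp) \otimes \sigma_E^{\perp,\perp} + \sum_{s, s' \in \mathcal{S}} P_{S_A S_B}(s, s')\, \Pi_{S_A S_B}(s s') \otimes \sigma_E^{s,s'},$$

where $p_\perp = P_{S_A S_B}(\perp, \perp)$ is the abort probability and $\sigma_E^{s,s'}$ is the quantum state, conditioned on the outcome $(s, s')$, held by an eavesdropper at the end of the protocol. For brevity, we shall use $\perp$ to label a normalised state that is conditioned on the protocol aborting, and $\top$ to label a normalised state that is conditioned on the protocol not aborting. For instance, in the above equation, the first term corresponds to $p_\perp\, \rho^{\mathrm{real},\perp}_{S_A S_B E}$, and the second term corresponds to $(1 - p_\perp)\, \rho^{\mathrm{real},\top}_{S_A S_B E}$.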

QKD Security Definition
Keys generated from QKD may not be perfectly uniform and secret from the eavesdropper, but one can ensure that the keys are arbitrarily close (in trace distance) to an ideal key by choosing the right security parameter. This security parameter is defined by the distinguishability of QKD keys from an ideal key. The ideal key described here is related to, but slightly different from, the secret key utilised for a secure classical channel. Since QKD channels can abort, the ideal key used for comparison has probability $p_\perp$ of returning an abort flag, whereas the process of sharing secret keys for secure channels is typically assumed not to fail. This introduces a loss in the robustness of the channel (i.e., it can sometimes fail), but does not compromise channel security since the protocol aborting does not provide Eve with any information on the message. The ideal output state of a QKD protocol is given as

$$\rho^{\mathrm{ideal}}_{S_A S_B E} = p_\perp\, \Pi_{S_A S_B}(\perp\perp) \otimes \sigma_E^{\perp,\perp} + (1 - p_\perp)\, \frac{1}{|\mathcal{S}|} \sum_{s \in \mathcal{S}} \Pi_{S_A S_B}(s s) \otimes \sigma_E^{\top},$$

where

$$\sigma_E^{\top} = \frac{1}{1 - p_\perp} \sum_{s, s' \in \mathcal{S}} P_{S_A S_B}(s, s')\, \sigma_E^{s,s'}$$

is the marginal state of Eve conditioned on the protocol not aborting. Following Ref. [39], a QKD protocol is said to be $\varepsilon$-secure if the actual and ideal QKD output states satisfy

$$\Delta\big(\rho^{\mathrm{real}}_{S_A S_B E}, \rho^{\mathrm{ideal}}_{S_A S_B E}\big) \leq \varepsilon.$$

The security of QKD can, in fact, be seen as the sum of two security criteria, namely correctness and secrecy. More specifically, it can be shown that

$$\Delta\big(\rho^{\mathrm{real}}_{S_A S_B E}, \rho^{\mathrm{ideal}}_{S_A S_B E}\big) \leq \Pr[S_A \neq S_B] + (1 - p_\perp)\, \Delta\big(\rho^{\mathrm{real},\top}_{S_A E}, U_{\mathcal{S}} \otimes \sigma_E^{\top}\big),$$

where $U_{\mathcal{S}}$ denotes the uniform state on the key space, the terms on the R.H.S. are the correctness and secrecy conditions, respectively, and they satisfy

$$\Pr[S_A \neq S_B] \leq \varepsilon_{\mathrm{cor}}, \qquad (1 - p_\perp)\, \Delta\big(\rho^{\mathrm{real},\top}_{S_A E}, U_{\mathcal{S}} \otimes \sigma_E^{\top}\big) \leq \varepsilon_{\mathrm{sec}}.$$

These criteria imply that $\varepsilon = \varepsilon_{\mathrm{cor}} + \varepsilon_{\mathrm{sec}}$. The correctness criterion, in practice, is typically enforced by using hashing, which guarantees that the two keys are identical except with some small error probability, $\varepsilon_{\mathrm{cor}}/(1 - p_\perp)$. That is, given that the protocol does not abort, the probability that the generated keys are different is at most $\varepsilon_{\mathrm{cor}}/(1 - p_\perp)$. The secrecy criterion looks at how distinguishable the output state of either $S_A$ or $S_B$ is from the ideal output, after passing through the privacy amplification step using a quantum-proof randomness extractor. For more details of these criteria, we refer the interested reader to Ref. [39].
In the following, for simplicity, we assume that all QKD channels use the same security parameters, i.e., $\varepsilon_{\mathrm{cor}}$ and $\varepsilon_{\mathrm{sec}}$, for these can be enforced in practice with the right error verification and privacy amplification schemes. The robustness probability is, however, harder to enforce as it depends on the quantum channel behaviour, which can differ between channels. To that end, we will write $p_{\perp,U1}$, $p_{\perp,U2}$, and $p_{\perp,12}$ to represent the abort probabilities for the QKD pairings $(U, D_1)$, $(U, D_2)$, and $(D_1, D_2)$, respectively.
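The bookkeeping of the QKD parameters can be summarised in a few lines of Python. This is only a sketch of the arithmetic stated above (the total security parameter as the sum of the correctness and secrecy parameters, and the key mismatch bound conditioned on not aborting); the function names and numerical values are placeholders chosen for illustration, not values from any experiment.

```python
from fractions import Fraction

def total_qkd_epsilon(eps_cor, eps_sec):
    """Overall security parameter: epsilon = eps_cor + eps_sec."""
    return eps_cor + eps_sec

def mismatch_bound_given_pass(eps_cor, p_abort):
    """Upper bound eps_cor / (1 - p_abort) on the probability that the
    two keys differ, given that the protocol did not abort."""
    if not 0 <= p_abort < 1:
        raise ValueError("abort probability must lie in [0, 1)")
    return eps_cor / (1 - p_abort)

# Placeholder parameters (assumptions for the example only).
eps_cor = Fraction(1, 10**10)
eps_sec = Fraction(1, 10**10)
p_abort = Fraction(1, 2)

epsilon = total_qkd_epsilon(eps_cor, eps_sec)
conditional_mismatch = mismatch_bound_given_pass(eps_cor, p_abort)
```

The conditional bound grows as the abort probability approaches one, which is why the robustness of each channel matters even though it does not affect secrecy.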

SPIR with QKD Security Definition
In order to analyse SPIR protocols that utilise QKD keys, it is necessary to generalise the original SPIR security definition. These changes will have to accommodate aspects of a QKD channel that are not normally present in a perfectly secure channel. More specifically, we need to consider the possibility that the QKD protocol can abort, and that it has a non-zero probability of outputting an imperfect secret key pair.
In the original SPIR setting, a two-party protocol between the data centres and user is considered. Here, no external eavesdropper is included, for secure channels are used and hence no external party can obtain any information from the communication. However, in the case of practical QKD systems, there is a small possibility that the eavesdropper could learn something about the secret keys. To allow for such bad events, we look at SPIR as a three-party protocol with an eavesdropper called Eve, and introduce a fourth condition which we term as protocol secrecy. Similar to the other security conditions, the protocol secrecy condition requires that the view of any eavesdropper E be independent of both X and W, assuming both the user and data centres are honest. In the following, we first highlight four considerations when extending the original SPIR security definition to one that appropriately captures all possible bad events that may be caused by imperfect QKD keys.
Firstly, in analysing user privacy (resp. database privacy), the possibility of getting imperfect secret keys provides a new avenue for data centres (resp. the user) to gain more information on $X$ (resp. $W$). For instance, when the key pair $(S_3, S_4)$ is insecure, data centre 1 can gain information on $Q_2$ and $A_2$, which can be utilised to determine $x$. To suitably address these threats, we treat such situations as a collusion between the data centre and Eve (whose view contains the ciphertext). In other words, in analysing user privacy (resp. database privacy), we always assume that the dishonest party is colluding with the external eavesdropper, Eve.
Secondly, a feature of the current security definition of QKD is that the security error (the probability that the generated secret keys are imperfect/insecure) can be made arbitrarily small in the limit of infinitely long keys. To allow for this feature in the extended setting as well, we introduce compatible definitions by adding a security parameter to each of the conditions, all of which should be possible to make asymptotically small. For instance, the security parameter for correctness, $\eta_{\mathrm{cor}}$, bounds the probability of error in recovering $w_x$, while the security parameters for user privacy, database privacy and protocol secrecy, $\eta_{\mathrm{UP}}$, $\eta_{\mathrm{DP}}$ and $\eta_{\mathrm{PS}}$, bound the difference between the two views given in the respective condition.
Thirdly, the possibility of a mismatch between QKD keys on the various communication channels would lead to inaccuracies if the classical SPIR definition were used. For user privacy, the classical definition requires the data centre's view to be independent of $X$ for any $k_5$, the shared random string between the databases. The definition also requires the same to be true for any $k_6$, but this need not be stated since $K_5 = K_6$ is assumed. Since QKD keys could be mismatched, i.e., $S_5 \neq S_6$, $S_6$ has to be explicitly included in the adjusted definition. A similar problem is present for database privacy. The classical definition fixes $x$ and $r$, thereby fixing the adversarial queries $\bar{q}_1$ and $\bar{q}_2$ while analysing the user's view. This allows one to address any probabilistic strategy a dishonest user can perform by analysing each possible pair of queries $\bar{q}_1$ and $\bar{q}_2$ that the user includes in the probabilistic strategy. If the user is unable to obtain more than $w_{x'}$ for some $x'$ for each pair of queries, the probabilistic strategy would not yield more than a single entry of the database. Using QKD keys $(S_1^{\mathrm{dec}}, S_2^{\mathrm{enc}}, S_3^{\mathrm{dec}}, S_4^{\mathrm{enc}})$ can result in the queries $\tilde{Q}_1$ and $\tilde{Q}_2$ arriving at the databases being probabilistic, since there is a small probability that the keys do not match. For instance, $\bar{Q}_1$ and $\bar{Q}_2$ can be queries for $w_1$, but there is a small probability that the QKD keys are mismatched such that $\tilde{Q}_1$ and $\tilde{Q}_2$ are queries for $w_2$, which means that there would not be an $x'$ for which the user's view is identical for any $w$ and $w'$ with $w_{x'} = w'_{x'}$. However, for each fixed set of QKD keys $(s_1^{\mathrm{dec}}, s_2^{\mathrm{enc}}, s_3^{\mathrm{dec}}, s_4^{\mathrm{enc}})$, the queries do indeed reveal at most a single $w_{x'}$ to the user. Therefore, the definition has to be adjusted to analyse the user's view with fixed keys $(s_1^{\mathrm{dec}}, s_2^{\mathrm{enc}}, s_3^{\mathrm{dec}}, s_4^{\mathrm{enc}})$.
Lastly, unlike secure communication channels, QKD protocols can fail for reasons such as high channel noise or a failure to obtain matching hash values in the error verification step. In fact, even in the classical case, it is not inconceivable that an external party could perform a denial-of-service attack on the channel, e.g., by physically cutting the optical channel. In such a situation, $w_x$ cannot be recovered and the correctness condition will not be met. To accommodate such bad events, we modify the definition to condition out failure events (i.e., only consider 'pass' cases), which occur with probability

$$p_{\mathrm{fail}} = 1 - (1 - p_{\perp,U1})(1 - p_{\perp,U2})(1 - p_{\perp,12}).$$

This conditioning can be performed in practice since an abort flag, $\perp$, is sent in the case of protocol failure. This is different from having an error in the decoded bit $\hat{w}_x$, which would be undetectable. Typically, once a QKD protocol aborts, the users will run the protocol again. However, for simplicity, we do not include this consideration in our analysis. Nevertheless, we remark that one should make $p_{\mathrm{fail}}$ as small as possible in practice.
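As a worked example of how the individual abort probabilities combine, the Python sketch below computes the probability that a SPIR run fails because at least one of the three QKD sessions aborts. It assumes, for illustration, that the three sessions abort independently; the function name and the numerical values are hypothetical.

```python
from fractions import Fraction

def spir_fail_probability(p_abort_u1, p_abort_u2, p_abort_12):
    """Probability that at least one of the three QKD sessions aborts,
    assuming independent abort events (an assumption of this sketch)."""
    p_pass = (1 - p_abort_u1) * (1 - p_abort_u2) * (1 - p_abort_12)
    return 1 - p_pass

# Placeholder abort probabilities for the (U, D1), (U, D2), (D1, D2) links.
p_fail = spir_fail_probability(Fraction(1, 10), Fraction(1, 10), Fraction(1, 10))
```

With each link aborting one time in ten, the SPIR run fails roughly 27% of the time, which illustrates why the failure probability of each channel should be kept small.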
The extended security definitions are as follows:

Definition 4 ($\eta_{\mathrm{cor}}$-correctness). Assuming the user and the data centres are honest, then for any $x$ and $w$, the protocol must fulfil

$$\Pr[\hat{w}_x \neq w_x \mid \text{pass}] \leq \eta_{\mathrm{cor}}.$$

Definition 5 ($\eta_{\mathrm{UP}}$-user privacy). Assuming the user is honest, then for any $w$ and shared keys between the databases $(s_5, s_6)$, and for all $x$ and $x'$, the total view of each data centre and the eavesdropper (Eve) has to fulfil

$$\Delta\big(\rho^{x}_{D_j E}, \rho^{x'}_{D_j E}\big) \leq \eta_{\mathrm{UP}}.$$

Definition 6 ($\eta_{\mathrm{DP}}$-database privacy). Assuming the data centres are honest, then for any $x$, $r$ and keys $(s_1^{\mathrm{dec}}, s_2^{\mathrm{enc}}, s_3^{\mathrm{dec}}, s_4^{\mathrm{enc}})$, there exists an $x'$ such that for all $w$ and $w'$ with $w_{x'} = w'_{x'}$, the total view of the user and eavesdropper (Eve) has to fulfil

$$\Delta\big(\rho^{w}_{UE}, \rho^{w'}_{UE}\big) \leq \eta_{\mathrm{DP}}.$$

Definition 7 ($\eta_{\mathrm{PS}}$-protocol secrecy). Assuming the user and the data centres are honest, then for all $(x, w)$ and $(x', w')$, the view of the eavesdropper (Eve) has to fulfil

$$\Delta\big(\rho^{x,w}_{E}, \rho^{x',w'}_{E}\big) \leq \eta_{\mathrm{PS}}.$$

We call any SPIR protocol that satisfies the above four conditions $(\eta_{\mathrm{cor}}, \eta_{\mathrm{UP}}, \eta_{\mathrm{DP}}, \eta_{\mathrm{PS}})$-secure. Note that the original SPIR definition can be recovered by taking $(0,0,0,0)$-security and assuming that there is no protocol failure ($p_{\mathrm{fail}} = 0$), that the shared random keys between the databases are correct ($S_5 = S_6$), and that the user queries are communicated without errors ($S_1^{\mathrm{dec}} = S_2^{\mathrm{enc}}$ and $S_3^{\mathrm{dec}} = S_4^{\mathrm{enc}}$). More concretely, Definition 1 is obtained since $\eta_{\mathrm{cor}} = 0$ and $p_{\mathrm{fail}} = 0$ imply $\Pr[\hat{w}_x \neq w_x] = 0$, and Definitions 2 and 3 are obtained by noting that the trace distance measure is contractive under partial trace operations.

Quantum View Modelling
In Ref. [5], the authors proved that there exists a family of (0,0,0,0)-secure SPIR protocols assuming secure classical channels. However, establishing these secure channels requires that the user and data centres have pre-shared keys that are at least as long as the messages to be sent. Pre-shared keys between the data centres are also required to perform CDS. This would be impractical for large databases or situations that require multiple uses of the SPIR protocol. We therefore capitalise on QKD, which is a key expansion protocol: starting with a small shared key between two parties, QKD can generate a much longer secret key for use. Hence, we establish QKD links between the parties to generate keys both for communication (between the user and data centres) and for shared randomness (between the data centres).
To analyse the security of the SPIR protocol with QKD, we need to first examine the view of various parties in the quantum setting. The protocol follows the generic one-round SPIR protocol described in Section 3.1, except that the keys used in key pairing steps are given by QKD keys instead. More specifically, we replace (K 1 , K 2 ), (K 3 , K 4 ), and (K 5 , K 6 ) by QKD generated keys (S 1 , S 2 ), (S 3 , S 4 ), and (S 5 , S 6 ), respectively. We also take that each set of QKD keys shared between two parties is generated by a single round of QKD. If any of the three QKD protocols aborts, i.e., if any of (S 1 , S 2 ), (S 3 , S 4 ) or (S 5 , S 6 ) returns ⊥ after the first step of establishing secure channels, then the SPIR protocol will abort. For simplicity, we take that all random variables that are generated in the latter steps, including queries, answers and ciphertext, are set to ⊥. The overall protocol is summarised in Table 2.
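The abort rule described above can be expressed as a thin wrapper around a protocol run. The Python sketch below is a hypothetical illustration: `ABORT` stands for the flag ⊥, `key_pairs` holds the outputs of the three QKD sessions, and `spir_round` is a placeholder for whatever one-round SPIR protocol is run once all keys are available.

```python
ABORT = None  # stands for the abort flag ⊥ in this sketch

def run_spir_with_qkd(key_pairs, spir_round):
    """Run one SPIR round using QKD keys.

    key_pairs: dict mapping a link label to a pair of keys, where an
    aborted QKD session is represented as (ABORT, ABORT).
    spir_round: callable taking key_pairs and returning a dict of
    protocol variables (a placeholder, not a specific protocol).
    """
    if any(key is ABORT for pair in key_pairs.values() for key in pair):
        # If any QKD session aborted, every later variable is set to ⊥.
        return {"queries": ABORT, "answers": ABORT,
                "ciphertexts": ABORT, "output": ABORT}
    return spir_round(key_pairs)

def dummy_round(key_pairs):
    """Placeholder protocol run used only to exercise the wrapper."""
    return {"queries": "sent", "answers": "received",
            "ciphertexts": "observed", "output": 1}

all_pass = {"U1": ([0, 1], [0, 1]), "U2": ([1, 1], [1, 1]), "12": ([0], [0])}
one_abort = {"U1": ([0, 1], [0, 1]), "U2": (ABORT, ABORT), "12": ([0], [0])}

result_pass = run_spir_with_qkd(all_pass, dummy_round)
result_abort = run_spir_with_qkd(one_abort, dummy_round)
```

Setting every downstream variable to the abort flag mirrors the simplification made in the analysis, where aborted runs carry no information about the query or the database.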
By expressing the inputs as quantum states and the steps in the protocol as maps, we can obtain the final state for all four parties and determine each of their views by performing a partial trace. Here, we introduce the four views that are used in the SPIR security definition: the total view of the user and Eve, $\rho_{UE}$ (Equation (12), used in database privacy); the total views of Eve and data centre 1, $\rho_{D_1E}$, and of Eve and data centre 2, $\rho_{D_2E}$ (Equations (13) and (14), respectively, used in user privacy); and the view of Eve, $\rho_E$ (Equation (15), used in protocol secrecy). Here, we note that E is the side-information of Eve gathered up to the OTP steps. As such, E contains all of the quantum information exchanged over the QKD channels and all of the classical information exchanged due to error correction, verification, and privacy amplification.

Table 2. Generic one-round two-database SPIR protocol with QKD.

Security Analysis
Here, we show that the security parameters of the associated QKD protocols can be used to bound the generalised SPIR security parameters defined above.
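For orientation, the bounds established in Appendix A can be collected in one place, writing $\varepsilon = \varepsilon_{cor} + \varepsilon_{sec}$ for each QKD link. The correctness, database privacy, and protocol secrecy values are those stated in Theorems A1, A3, and A4; the user privacy value is inferred from the proof sketch and from the numerical parameters quoted in the simulation section, so this should be read as a summary sketch rather than a formal theorem statement.

```latex
\begin{align}
  \eta_{cor} &= 3\,\varepsilon_{cor}, &
  \eta_{UP}  &= 2\,(\varepsilon_{cor} + \varepsilon_{sec}), \\
  \eta_{DP}  &= 2\,(\varepsilon_{cor} + \varepsilon_{sec}), &
  \eta_{PS}  &= 4\,(\varepsilon_{cor} + \varepsilon_{sec}).
\end{align}
```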
Proof sketch. For the correctness condition, if all of the QKD key pairs are correct, then, conditioned on not aborting, the 0-correctness of the SPIR protocol guarantees that the decoding will be correct. Moreover, since there may be key pair events other than the correct ones that can yield $\hat{w}_x = w_x$, we have that $\Pr[\hat{w}_x = w_x \mid \top] \geq \Pr[S_1 = S_2, S_3 = S_4, S_5 = S_6 \mid \top]$, where the conditioning $\top$ is that all of the QKD protocols do not abort. Then, by using the union bound, it is straightforward to show that the probability of error is upper bounded by the sum of the probabilities of each QKD key pair being wrong, and thus the protocol is $3\varepsilon_{cor}$-correct.

For user privacy, we look at the total view of one data centre (say $D_1$) together with the eavesdropper, E. However, it is not straightforward to compare the views for different x. Hence, as an intermediate step, we introduce a hypothetical scenario which uses an ideal QKD protocol instead of a real QKD protocol to generate keys for OTP encryption through $C_{U2}$. This state, $\xi^x_{D_1E}$, has the same set of variables as $\rho^x_{D_1E}$ in Equation (13), with the only difference being that the QKD keys $S_3S_4$ are ideal. With this intermediate state, we can split the trace distance into three parts by using the triangle inequality: $\Delta(\rho^x_{D_1E}, \xi^x_{D_1E})$, $\Delta(\xi^x_{D_1E}, \xi^{x'}_{D_1E})$, and $\Delta(\xi^{x'}_{D_1E}, \rho^{x'}_{D_1E})$. We first examine the second part. When the protocol aborts, the two views are clearly identical (i.e., zero trace distance) since all variables have value ⊥, except the keys $S_1S_5E$, which are common to both states. In fact, for all trace distances we examine in this proof sketch, the two states in the trace distance are identical when the protocol aborts, and thus we ignore the protocol abort situation. When the protocol does not abort, we can simplify by using the fact that a trace-preserving map cannot increase the trace distance, and noting that there are trace-preserving maps from $Q_1S_1S_2S_5W$ to $\tilde{Q}_1\bar{A}_1C_{Q_1}C_{\bar{A}_1}$. Moreover, since the ciphertexts $C_{Q_2}C_{\bar{A}_2}$ are obtained from encryption using the ideal QKD keys $S_3S_4$, they are uniformly distributed over $\mathcal{C}_{Q_2}\mathcal{C}_{\bar{A}_2}$, and thus are independent of x and common to both states.
After simplification, the only remaining variable in the trace distance possibly dependent on x is $Q_1$ (the other remaining variables are $S_1S_2S_5WE$). However, by 0-user privacy of the SPIR protocol, $Q_1$ is independent of x and thus $\Delta(\xi^x_{D_1E}, \xi^{x'}_{D_1E}) = 0$. We next examine the first part, $\Delta(\rho^x_{D_1E}, \xi^x_{D_1E})$. Conditioned on the protocol not aborting, we can simplify by noting that there are trace-preserving maps that can map $Q_2S_3S_4S_6W$ to $C_{Q_2}C_{\bar{A}_2}$. Since $Q_1Q_2$ are independent of the QKD keys, and $S_1S_2S_5S_6$ are generated by the same QKD protocols in both states, we are left with the trace distance between the two joint states of $S_3S_4E'$, conditioned on the protocol not aborting (labelled $\top$). Here, the first state (resp. second state) corresponds to real QKD keys (resp. ideal QKD keys) $S_3S_4$ with side information $E' = S_1S_2S_5S_6E$. Therefore, from the security definition, the trace distance is bounded by $\varepsilon_{cor} + \varepsilon_{sec}$, and the same bound holds for the third part. Combining the above results, one can show that $\Delta(\rho^x_{D_1E}, \rho^{x'}_{D_1E}) \leq 2(\varepsilon_{cor} + \varepsilon_{sec})$. This also holds for the total view of $D_2E$.

For database privacy, we examine the total view of the user, U, together with the eavesdropper Eve, E. We then introduce a hypothetical scenario where ideal QKD keys are used instead of real QKD keys as the shared random string between the data centres. The corresponding state, $\xi^w_{UE} = \xi^w_{XR\bar{Q}_1\bar{Q}_2\tilde{A}_1\tilde{A}_2S_2S_4C_{\bar{Q}_1}C_{\bar{Q}_2}C_{A_1}C_{A_2}E}$, contains the same variables as $\rho^w_{UE}$ in Equation (12), except that $S_5S_6$ are ideal QKD keys. Therefore, we can use the triangle inequality to split the trace distance into three parts, $\Delta(\rho^w_{UE}, \xi^w_{UE})$, $\Delta(\xi^w_{UE}, \xi^{w'}_{UE})$, and $\Delta(\xi^{w'}_{UE}, \rho^{w'}_{UE})$. We first examine the second part, $\Delta(\xi^w_{UE}, \xi^{w'}_{UE})$, for an arbitrary x, r and $(s^{dec}_1, s^{enc}_2, s^{dec}_3, s^{enc}_4)$. This can be simplified by noting that the remaining variables of the view can be obtained by trace-preserving maps from the queries, the answers, and the keys. Since a fixed r and x fixes $\bar{q}_1$ and $\bar{q}_2$, and having fixed keys $(s^{dec}_1, s^{enc}_2, s^{dec}_3, s^{enc}_4)$ further fixes the queries received by the database, $\tilde{q}_1$ and $\tilde{q}_2$, we can express the state as two subsystems, $XR\bar{Q}_1\bar{Q}_2S_1S_2S_3S_4E$ and $\tilde{Q}_1\tilde{Q}_2A_1A_2$.
The former subsystem is independent of W, and thus we can remove it using the fact that $\Delta(A \otimes B, A \otimes C) \leq \Delta(B, C)$. The probability distribution of $\tilde{Q}_1\tilde{Q}_2A_1A_2$ here is the same as in a hypothetical scenario where all QKD keys are ideal and the user sends the queries $\tilde{Q}_1$ and $\tilde{Q}_2$ instead. For this scenario, we can invoke 0-database privacy, which states that there exists an $x'$ such that, for all $w$ and $w'$ with $w_{x'} = w'_{x'}$, $A_1$ and $A_2$ are independent of W (i.e., the trace distance is zero). This is true for any adversarial user queries, and in particular it is true for the queries $\tilde{Q}_1$ and $\tilde{Q}_2$.
The next step is to examine the trace distance $\Delta(\rho^w_{UE}, \xi^w_{UE})$. We note that there are trace-preserving maps that can be applied to $\bar{Q}_1\bar{Q}_2S_1S_2S_3S_4S_5S_6W$ to obtain $A_1A_2C_{\bar{Q}_1}C_{\bar{Q}_2}C_{A_1}C_{A_2}$. With this simplification, together with the removal of the common terms $XR\bar{Q}_1\bar{Q}_2W$ and noting that $S_1S_2S_3S_4$ is generated by the same QKD protocols for both terms, we arrive at the trace distance between the two joint states of $S_5S_6E'$, where the side-information is $E' = S_1S_2S_3S_4E$. The terms in the trace distance correspond to the output states of a real and an ideal QKD protocol, respectively, conditioned on not aborting. Therefore, from the security definition, this is bounded by $\varepsilon_{cor} + \varepsilon_{sec}$. Combining the above results, we conclude that there exists an $x'$ such that, for $w_{x'} = w'_{x'}$, $\Delta(\rho^w_{UE}, \rho^{w'}_{UE}) \leq 2(\varepsilon_{cor} + \varepsilon_{sec})$.

The final condition of protocol secrecy requires the introduction of the view of the eavesdropper for two different scenarios: $\xi^{x,w,1}_E$ is Eve's view in a setup where $S_1S_2$ are ideal QKD keys, and $\xi^{x,w,2}_E$ is Eve's view where $S_1S_2S_3S_4$ are ideal QKD keys. Using arguments similar to those in the proof sketch of user privacy, one can show that each change from real to ideal QKD keys (from $\rho^{x,w}_E$ to $\xi^{x,w,1}_E$, and from $\xi^{x,w,1}_E$ to $\xi^{x,w,2}_E$) contributes a trace distance of at most $\varepsilon_{cor} + \varepsilon_{sec}$. The next step is to examine the trace distance $\Delta(\xi^{x,w,2}_E, \xi^{x',w',2}_E)$. We note that $\xi^{x,w,2}_E$ contains the same variables as $\rho^{x,w}_E$ in Equation (15), except that $S_1S_2S_3S_4$ are ideal QKD keys. Since $C_{Q_1}C_{Q_2}C_{A_1}C_{A_2}$ are ciphertexts generated using the ideal QKD keys $S_1S_2S_3S_4$, they are distributed uniformly over $\mathcal{C}_{Q_1}\mathcal{C}_{Q_2}\mathcal{C}_{A_1}\mathcal{C}_{A_2}$. Therefore, they are not dependent on x or w (neither is E), and the trace distance is $\Delta(\xi^{x,w,2}_E, \xi^{x',w',2}_E) = 0$. Using the triangle inequality to combine the results, we have $\Delta(\rho^{x,w}_E, \rho^{x',w'}_E) \leq 4(\varepsilon_{cor} + \varepsilon_{sec})$. The detailed proof is provided in Appendix A.
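The uniformity argument used above (a ciphertext produced with an ideal one-time-pad key carries no information about the message) can be checked by brute force for short bit strings. The sketch below is purely classical and illustrative: the function names are hypothetical, and the "trace distance" computed here is the total variation distance between classical distributions.

```python
from collections import Counter
from itertools import product

def ciphertext_distribution(message: tuple) -> Counter:
    """Count the ciphertexts obtained over all equally likely keys."""
    counts = Counter()
    for key in product((0, 1), repeat=len(message)):
        counts[tuple(m ^ k for m, k in zip(message, key))] += 1
    return counts

def trace_distance(p: Counter, q: Counter) -> float:
    """Total variation distance between two tables of counts."""
    total_p, total_q = sum(p.values()), sum(q.values())
    support = set(p) | set(q)
    return 0.5 * sum(abs(p[s] / total_p - q[s] / total_q) for s in support)
```

For any two messages of equal length, the two ciphertext distributions coincide, so the distance evaluates to zero; this is the classical core of the step in which the ciphertext registers are dropped from the trace distance.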

MDI-QKD
For simulation purposes, we look at MDI-QKD with decoy states [40] as the protocol of choice to generate the keys used in SPIR. In MDI-QKD, the security of the QKD key generated is guaranteed even if the eavesdropper is the one performing the measurement and announcing the result, as shown in Figure 2. Hence, in the setup depicted in Figure 1, the central node would hold the measurement device and the other parties would hold the QKD source. In this case, the MDI nature of the protocol ensures that the central node cannot gain any information about the messages communicated between the user and data centres.
The MDI-QKD protocol we use is detailed in Ref. [40], and we provide a summary here. We start with the communicating parties, Alice and Bob, each choosing a basis from $\{X, Z\}$, an intensity from $\{a_s, a_1, \ldots, a_n\}$ and $\{b_s, b_1, \ldots, b_m\}$, respectively, and a random bit from $\{0, 1\}$. They then prepare the corresponding quantum state and send it to the central node. If the central node is honest, it will perform a Bell state measurement and report the result, t. Alice and Bob can then reveal their basis and intensity settings and select only the rounds where they used the same basis. This sifted key can then be used for parameter estimation, error correction and privacy amplification. The final key length obtained is given by the sum of the key lengths for the different results reported by the central node, $l = \sum_t l_t$. In the expression for $l_t$ (see Ref. [40]), $h(x)$ is the binary entropy of x, $n_{t,0}$ is the number of events where either party sends zero photons, $n_{t,1}$ is the number of events where both parties send one photon each, $e_{t,1}$ is the error rate for these one-photon events, $\mathrm{leak}_{EC,t}$ is the number of leaked bits from error correction, and the ε values are various security and parameter estimation parameters.
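Two ingredients of the summary above are easy to make concrete: the binary entropy $h(x)$ and the basis-sifting step. The sketch below is illustrative only; the helper names are hypothetical, and the per-outcome key lengths passed to `total_key_length` are assumed to have been computed elsewhere from the finite-key formula of Ref. [40], which is not reproduced here.

```python
import math

def binary_entropy(x: float) -> float:
    """Binary entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def sift(alice_bases, bob_bases):
    """Indices of the rounds in which both parties chose the same basis."""
    return [i for i, (a, b) in enumerate(zip(alice_bases, bob_bases)) if a == b]

def total_key_length(key_length_by_outcome: dict) -> float:
    """l = sum over reported Bell-measurement outcomes t of l_t."""
    return sum(key_length_by_outcome.values())
```

For example, `sift(["X", "Z", "Z"], ["X", "X", "Z"])` keeps rounds 0 and 2, and `binary_entropy(0.5)` evaluates to one bit.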

SPIR Resource
We examine the performance of the SPIR protocol based on the type of database it can support for a fixed number of signals sent to establish the QKD key, N, and for fixed distances, d. A database is characterised by the number of entries it has, n, and the size of each entry, L. We use the two-database SPIR protocol $B_2$ [5] (see Appendix B for the protocol description), which requires communication of $[7L + 3\log n^{1/3} + (3 + 3L)n^{1/3}]$ bits between the user and each data centre, and $(9Ln^{1/3} + 10L)$ bits of shared key between the data centres for CDS. In a typical implementation, it is likely that the two data centres would be close together; the limiting factor would then be the user-data centre communication, since the user would tend to be far from the data centres. Hence, we focus only on the key rate from MDI-QKD between the user and the data centres.
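The two resource counts quoted above can be evaluated directly. The following is a minimal sketch, assuming that the logarithm is taken to base 2 and that $n^{1/3}$ may be treated as a real number (Appendix B takes $n = m^3$ exactly); the function names are hypothetical.

```python
import math

def b2_user_bits(n: int, L: int) -> float:
    """Bits exchanged between the user and each data centre in B_2."""
    m = n ** (1.0 / 3.0)  # cube root of the number of entries
    return 7 * L + 3 * math.log2(m) + (3 + 3 * L) * m

def b2_shared_key_bits(n: int, L: int) -> float:
    """Bits of shared key needed between the two data centres for CDS."""
    m = n ** (1.0 / 3.0)
    return 9 * L * m + 10 * L
```

Comparing `b2_user_bits(n, L)` against the QKD key length available for a given N and distance gives a quick feasibility check for a candidate database.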
In the analysis, we use parameters similar to those in Ref. [40], with a fibre channel loss of 0.2 dB km$^{-1}$, detection efficiency of 14.5%, and background count of $6.02 \times 10^{-6}$. We assume that the central node uses the measurement device shown in Figure 3, which allows it to perform Bell state measurements of the states $|\psi^-\rangle$ and $|\psi^+\rangle$. The polarisation misalignment error of this setup is modelled following Ref. [41], by introducing unitary rotations in the channels connecting Alice and Bob to the central node, and a unitary rotation in one arm of the measurement device after the beam splitter. The value of the total polarisation misalignment error is set at 1.5%. For simplicity, the protocol uses only two decoy states, with the weaker one having intensity $5 \times 10^{-4}$. We also assume that the error correction leakage is given by $\mathrm{leak}_{EC,t} = 1.16\, n_t\, h(e_t^{a_sb_s})$, where $n_t$ is the number of bits of the sifted key (runs that both Alice and Bob prepare in the Z-basis using the signal intensity) that is not used for error estimation, and $e_t^{a_sb_s}$ is the corresponding error rate of this sifted key. We fix the QKD security parameters $\varepsilon_{cor} = 10^{-15}$ and $\varepsilon = 10^{-10}$, which makes the SPIR $(3 \times 10^{-15}, 2 \times 10^{-10}, 2 \times 10^{-10}, 4 \times 10^{-10})$-secure. The key rate $l/N$ is optimised for a given number of signals sent in the QKD key generation, N, over all free parameters. These include the intensities, the probability distributions of intensity and basis choices, the number of bits used for error estimation, and the security parameters implicit in ε. We plot the database parameters for a few setups, with the number of signals sent, N, being $10^{12}$, $10^{13}$, and $10^{14}$, which correspond to 16.7 min, 2.8 h, and 28 h, respectively, for a 1 GHz signal rate. The distances used are metropolitan, at 5 km (fits Singapore's downtown core), 10 km (fits Geneva, London inner ring road), and 20 km (fits Washington DC).
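The numbers quoted in this paragraph follow from simple arithmetic, which the sketch below reproduces. It assumes that ε denotes the total QKD security parameter $\varepsilon_{cor} + \varepsilon_{sec}$ of each link and that the SPIR parameters scale as $(3\varepsilon_{cor}, 2\varepsilon, 2\varepsilon, 4\varepsilon)$, as in Appendix A; the function names are hypothetical.

```python
def spir_security(eps_cor: float, eps: float) -> tuple:
    """(eta_cor, eta_UP, eta_DP, eta_PS) for per-link QKD parameters."""
    return (3 * eps_cor, 2 * eps, 2 * eps, 4 * eps)

def generation_time_seconds(num_signals: float, rate_hz: float = 1e9) -> float:
    """Time needed to send num_signals signals at the given repetition rate."""
    return num_signals / rate_hz
```

With `eps_cor = 1e-15` and `eps = 1e-10`, the first function returns the tuple quoted in the text, and `generation_time_seconds(1e12)` gives 1000 s, i.e., about 16.7 min.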
We also included four scenarios of database query usage, the first of which is:
• iTunes: A consumer wants to purchase a song from the iTunes catalogue, which contains 60 million songs (assuming each music file is 10 MB) [$n = 6 \times 10^7$, $L = 8 \times 10^7$].
The database parameters of the scenarios are plotted in Figure 4.
The $B_2$ protocol with QKD has a scaling of $O(n^{1/3}L)$, which is reflected in the numerical analysis by the significantly higher number of database entries that one can perform SPIR for compared to the database entry size, which scales linearly with N. This means that the $B_2$ protocol is especially useful for databases with small file sizes and a large number of entries, such as querying the fingerprint of one person from a database containing the fingerprints of everyone in the world, which takes about 16.7 min of key generation for 10 km distances. For much larger database entries, such as video files and uncompressed music files, the use of the $B_2$ protocol with QKD channels does not appear feasible.

Figure 4. Four points are included that represent the database parameters of the usage scenarios described in the main text. The diagram also includes a plot for an alternative protocol that requires a more relaxed SPIR definition discussed in Section 7.

Discussion
Having a multi-database SPIR protocol with QKD provides information theoretic security, but a drawback in the setup is that the result obtained by the user,ŵ x , cannot be verified. This allows malicious data centres to send false information to the user simply by changing the answers sent to the user. This, however, does not affect the validity of the SPIR protocol. At the practical level, this act could be detectable for certain applications, such as music streaming, but could remain undetected for other applications such as medical test reports, where information cannot be independently verified by the user. One could overcome this by providing additional information, such as a hash of the desired entry, for the user to perform verification, but this requires a further analysis which is beyond the scope of the current work.
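The hash-based check suggested above could look as follows. This is only a sketch under strong assumptions that the text does not analyse: the user is assumed to obtain a trusted digest of the desired entry through some independent means, and SHA-256 is an illustrative choice of hash function. The helper names are hypothetical, and the effect of publishing digests on database privacy is not considered.

```python
import hashlib
import hmac

def entry_digest(entry: bytes) -> bytes:
    """Digest of a database entry (SHA-256 chosen for illustration)."""
    return hashlib.sha256(entry).digest()

def verify_entry(decoded_entry: bytes, trusted_digest: bytes) -> bool:
    """Compare the decoded entry against a digest obtained independently."""
    return hmac.compare_digest(entry_digest(decoded_entry), trusted_digest)
```

A user who decodes an entry and finds that `verify_entry` returns `False` would know that at least one answer was altered, although not which data centre altered it.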
In place of ideal keys, we have introduced the use of QKD keys in SPIR, but we require a few additional assumptions on the parties. In particular, we assume that (1) the data centres do not intentionally leak the QKD keys to other parties including Eve, (2) all messages sent through the channels $C_{Uj}$ must be encrypted with OTP, and (3) the data centres do not have access to the classical channels used to establish the QKD keys after the key exchange step. These additional assumptions are necessary to prevent the misuse of QKD, which assumes that both communicating parties act honestly. These assumptions can be enforced in practice via methods such as supervisory programs or a trusted third-party authority.
In our numerical analysis, we used the $B_2$ protocol, but there are other SPIR protocols that one could use. The $B_k$ protocol is a generalisation of the $B_2$ protocol that requires k databases instead of two, with a scaling of $O(n^{1/(2k-1)}L)$. This means that it outperforms the $B_2$ protocol for applications with a large number of database entries, but the user would have to communicate with more data centres.
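To get a feel for the trade-off, the leading-order scaling can be compared for different k. The sketch below ignores all constant factors and lower-order terms, so it indicates trends only and is not a communication cost estimate; the function name is hypothetical.

```python
def bk_scaling(n: int, L: int, k: int) -> float:
    """Leading-order scaling n^(1/(2k-1)) * L of the B_k protocol."""
    return n ** (1.0 / (2 * k - 1)) * L
```

For $n = 10^6$ and $L = 1$, moving from k = 2 to k = 3 reduces the leading term from about 100 to about 16, at the price of a third data centre.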
Alternatively, one could relax the SPIR definition to allow for other protocols to be used. In the current SPIR definition, the user is not allowed to learn the values of the XOR of database entries such as $w_x \oplus w_{x'}$. However, in certain scenarios the data centre might not mind the user learning such values, as long as the user only gains one bit of information, e.g., either $w_x$ or some $\bigoplus_x w_x$. Such a change would require further modification of Definition 6, for instance, to one that reads "there exists an $i^{(n)} = (i_1, \ldots, i_n)$ such that for all $w$ and $w'$ with $\bigoplus_x i_x w_x = \bigoplus_x i_x w'_x$", where $i_x = 1$ indicates that the user includes $w_x$ in the XOR the user learns and $i_x = 0$ otherwise.
The relaxation made to the SPIR definition would allow us to use another protocol, which serves as the foundation for Song et al.'s quantum SPIR protocol [26]. In this protocol, we label the user's desired bit as $w_{i^{(n)}} = \bigoplus_{x=1}^{n} i_x w_x$. The user then generates a random string $R^{(n)} \in \{0, 1\}^n$ and sends the queries $Q^{(n)}$. The data centres then reply with answers $A_1 = (\bigoplus_{x=1}^{n} Q_{1,x} w_x) \oplus K$ and $A_2 = (\bigoplus_{x=1}^{n} Q_{2,x} w_x) \oplus K$, where K is a shared random bit between the data centres. The user would then decode by computing $A_1 \oplus A_2$, and K ensures that the user can obtain at most a single bit. In this setup, the number of bits of communication between the user and each data centre is $n + L$, and the plot is shown in Figure 4 for $N = 10^{13}$ at 10 km. This protocol can be utilised for iTunes and EHR, which are not feasible with the $B_2$ protocol. The protocol can also achieve close to the communication limit of $L = l$ for small databases. This limit is that of the secure communication of a single string (entry) of length L, which requires one QKD secure key bit for each bit of the string. However, the number of entries that the database can have is limited in this case, and the protocol can no longer be used for the fingerprint database, which has 7.7 billion entries. Therefore, it can be useful to examine other SPIR protocols or relaxed versions of SPIR.
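The relaxed protocol can be sketched for one-bit entries as follows. The answer and decoding rules follow the text; the query construction ($Q_1 = R$ and $Q_2 = R \oplus i$) is an assumption, since the text does not spell it out, and is simply the standard choice that makes $A_1 \oplus A_2$ equal the selected XOR. All function names are hypothetical, and OTP encryption of the messages is omitted.

```python
import secrets

def xor_inner(query, database):
    """XOR over x of query[x] AND database[x]."""
    bit = 0
    for q_x, w_x in zip(query, database):
        bit ^= q_x & w_x
    return bit

def make_queries(selection):
    """Assumed construction: Q1 = R and Q2 = R XOR i for a random string R."""
    r = [secrets.randbelow(2) for _ in selection]
    return r, [r_x ^ i_x for r_x, i_x in zip(r, selection)]

def answer(query, database, shared_bit):
    """Data-centre answer, masked by the shared random bit K."""
    return xor_inner(query, database) ^ shared_bit

def decode(a1, a2):
    """The shared bit K cancels, leaving the XOR of the selected entries."""
    return a1 ^ a2
```

With database `[1, 0, 1, 1]` and selection `[0, 0, 1, 0]`, decoding returns the third entry; with selection `[1, 0, 1, 0]`, it returns the XOR of the first and third entries, which is exactly the kind of value that the relaxed definition permits.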
Here, we have shown how multi-database SPIR can work with QKD channels in place of secure channels. An interesting extension would be to demonstrate it experimentally, which would pave the way for practical implementation of the protocol in the future. For practical implementation, it is also useful to explore reasonable relaxations of the QKD protocol, such as the one described above, and other SPIR or relaxed SPIR protocols. By optimising the protocol choice for different applications of SPIR based on the number of entries and database entry size of the database, one could obtain better performance for the particular application of interest.
Another interesting extension would be to examine the performance of SPIR in the presence of a byzantine adversary who may corrupt transmission on some of the communication channels, and in the scenario where this adversary can collude with some data centres. This situation results in communication between the data centres, which could compromise user privacy, and in inaccurate answers being sent to the user due to corrupted transmission, which could affect the correctness of the protocol. The classical case was examined by Wang et al. [23], who also looked at the scenario where an eavesdropper can tap into the communication channels; this latter problem has been addressed in this paper with QKD. It is thus interesting to explore whether the quantum nature of the byzantine adversary and the colluding data centres could have an impact on SPIR implementation with QKD channels. A SPIR solution to this scenario would provide additional security for the user.

Conclusions
We have introduced the use of QKD in place of secure channels in SPIR, since classical secure channels are difficult to achieve in practice. To show that replacing the classical secure channels with QKD channels does not compromise security, we extended the original SPIR definition to include aspects of QKD that are not normally present in a secure channel. These include the presence of an external eavesdropper who may tap into the classical communication and eavesdrop on the quantum channel, security parameters that account for the possibility of an imperfect secret key, and the possibility that the QKD protocol aborts. Using the extended SPIR definition, we then show that the SPIR security parameters are related to the QKD security parameters, $\varepsilon_{sec}$ and $\varepsilon_{cor}$, which can be set arbitrarily close to zero. This implies that one could have a SPIR protocol using QKD keys with arbitrarily good security. Using MDI-QKD and the $B_2$ protocol as an example, we also show, by numerically simulating the QKD key rates, that such a SPIR protocol can be feasible.

Acknowledgments:
We thank Chao Wang, Ignatius William Primaatmaja, and Koon Tong Goh for their comments and useful suggestions. We also thank the referees from the Quantum Journal for their constructive comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

PIR: Private information retrieval
SPIR: Symmetric private information retrieval
QKD: Quantum key distribution
CPTP: Completely positive and trace preserving
POVM: Positive operator-valued measure
OTP: One-time pad
CDS: Conditional disclosure of secrets
MDI: Measurement-device independent

Appendix A. Detailed Security Proof

Theorem A1. A two-database one-round (0,0,0,0)-secure SPIR protocol that uses ε-secure QKD keys in place of ideal keys, where $\varepsilon = \varepsilon_{cor} + \varepsilon_{sec}$, is $3\varepsilon_{cor}$-correct.

Proof.
We start by noting that when all the QKD keys are correct ($S_1 = S_2$, $S_3 = S_4$, and $S_5 = S_6$), the answers generated by the data centres and the messages sent through the channels are correct. From the 0-correctness of the classical SPIR protocol, this means that the user is able to decode correctly, $\hat{w}_x = w_x$. Therefore, we have the result in Equation (16). Taking the complement of Equation (16) gives $\Pr[\hat{w}_x \neq w_x \mid \top] \leq \Pr[S_1 \neq S_2 \lor S_3 \neq S_4 \lor S_5 \neq S_6 \mid \top] \leq \Pr[S_1 \neq S_2 \mid \top] + \Pr[S_3 \neq S_4 \mid \top] + \Pr[S_5 \neq S_6 \mid \top]$, where $\top$ denotes conditioning on no QKD protocol aborting and the second inequality is an application of the union bound. This can be directly related to $\varepsilon_{cor}$ of each channel to give the correctness condition, where the second inequality is obtained by noting that the probability that the SPIR protocol aborts, $p_{\mathrm{fail}}$, is larger than the probability that any one QKD protocol aborts, $p_\perp$.
Theorem A2. A two-database one-round (0,0,0,0)-secure SPIR protocol that uses ε-secure QKD keys in place of ideal keys, where $\varepsilon = \varepsilon_{cor} + \varepsilon_{sec}$, is 2ε-user private.

Proof. Here, we only provide the security analysis with respect to data centre 1, which can act dishonestly; the same procedure holds for data centre 2. To compare the total view of $D_1$ and E for different user desired indices, $\rho^x_{D_1E}$ and $\rho^{x'}_{D_1E}$, we first introduce an intermediate state, $\xi^x_{D_1E}$. This state corresponds to a setup in which an ideal QKD key is generated from the QKD protocol for communication between $D_2$ and U. Using the triangle inequality property of the trace distance measure, we split the user privacy condition into three parts, $\Delta(\rho^x_{D_1E}, \rho^{x'}_{D_1E}) \leq \Delta(\rho^x_{D_1E}, \xi^x_{D_1E}) + \Delta(\xi^x_{D_1E}, \xi^{x'}_{D_1E}) + \Delta(\xi^{x'}_{D_1E}, \rho^{x'}_{D_1E})$ (A3). We start by examining the second term on the R.H.S., which is the trace distance between two views where the secret key pairs used are $(S_1, S_2)$ and $(S_5, S_6)$ from the actual QKD protocols, and $(S_3, S_4)$ from an ideal QKD protocol, but with differing user index choices x and x'. Following Equation (13), we have that $\xi^x_{D_1E} = (1 - p_{\mathrm{fail}})\,\xi^{x,\top}_{D_1E} + p_{\mathrm{fail}}\,\xi^{x,\perp}_{D_1E}$, where the label $\top$ indicates that the state is conditioned on the QKD not aborting (i.e., all QKD keys are not ⊥) and ⊥ indicates that the state is conditioned on QKD aborting. We note that the state conditioned on aborting would have all terms being ⊥ except possibly the QKD keys and W. Therefore, it is clear that this is independent of X, i.e., $\xi^{x,\perp}_{D_1E} = \xi^{x',\perp}_{D_1E}$. Then, by noting the trace-preserving mappings that generate the remaining variables of the view and using the jointly convex property of the trace distance, the trace distance reduces to that between the non-aborting parts. At this point, we note that $C_{Q_2}$ and $C_{\bar{A}_2}$ are encrypted with an ideal secret key and hence are uniformly distributed whenever the protocol does not abort. More specifically, $C_{Q_2}$ (resp. $C_{\bar{A}_2}$) is uniformly distributed over $\mathcal{C}_{Q_2}$ (resp. $\mathcal{C}_{\bar{A}_2}$) with probability $1 - p_{\mathrm{fail}}$. With this, we can expand the trace distance. Note that $Q_1$ and $S_1S_2S_5WC_{Q_2}C_{\bar{A}_2}E$ are independent of each other, and that $S_1S_2S_5WC_{Q_2}C_{\bar{A}_2}E$ is independent of X. In fact, $C_{Q_2}$ and $C_{\bar{A}_2}$ contain no information about $Q_2$ and $A_2$, and thus none about X either.
Thus, this gives us $\Delta(\xi^x_{D_1E}, \xi^{x'}_{D_1E}) = 0$ through a chain of bounds on the distribution of $Q_1$. The second inequality in this chain is due to the fact that $Q_1$ is diagonal, which means that the trace distance between the probability distributions of $Q_1$ coincides with that between the quantum states, and that $Q_1$ is part of the view $V^x_{D_1}$. Since $Q_1$ is generated by an honest user and is thus independent of the type of channel used in the protocol, the last equality holds due to 0-user privacy of the classical protocol.
Let us now examine the first term on the R.H.S. of Equation (A3). Here, we note that trace-preserving mappings are applied to $Q_2S_3S_4S_6W$ to get $C_{Q_2}C_{\bar{A}_2}$, so that the trace distance can be upper bounded by one that involves only the queries, the keys, W and E. We note that $Q_1Q_2$ are the only systems that depend on x and that they are independent of $S_1S_2S_3S_4S_5S_6E$; recall that $Q_1Q_2$ are created independently after the QKD steps. Moreover, $W = w$ is fixed and is common to both states. These arguments thus leave us with a trace distance over $S_1S_2S_3S_4S_5S_6E$ only. Here, we can further partition $S_1S_2S_3S_4S_5S_6E$ into two parts, $S_3S_4$ and $S_1S_2S_5S_6E$, and note that $S_1S_2S_5S_6$ is common to both setups (generated using the real QKD protocol). With this, we may view the latter as some extended side-information $E' = S_1S_2S_5S_6E$. Then, using the security definition of QKD (by replacing E by E'), we get that the trace distance is bounded by $\varepsilon_{cor} + \varepsilon_{sec}$, since $\xi_{S_3S_4E'}$ is an ideal QKD output state conditioned on not aborting. Combining the results, we obtain $\Delta(\rho^x_{D_1E}, \rho^{x'}_{D_1E}) \leq 2(\varepsilon_{cor} + \varepsilon_{sec})$.

Theorem A3. A two-database one-round (0,0,0,0)-secure SPIR protocol that uses ε-secure QKD keys in place of ideal keys, where $\varepsilon = \varepsilon_{cor} + \varepsilon_{sec}$, is 2ε-database private.
Proof. We start the proof by fixing an arbitrary x, since the adversarial queries $\bar{Q}_1$ and $\bar{Q}_2$ sent by the user need not depend on x in general. Similar to the analysis of user privacy, we first introduce an intermediate view, $\xi^w_{UE}$, that corresponds to a setup in which the QKD channel between the data centres generates an ideal output state. Using this state, we can then expand the trace distance in the database privacy condition using the triangle inequality, $\Delta(\rho^w_{UE}, \rho^{w'}_{UE}) \leq \Delta(\rho^w_{UE}, \xi^w_{UE}) + \Delta(\xi^w_{UE}, \xi^{w'}_{UE}) + \Delta(\xi^{w'}_{UE}, \rho^{w'}_{UE})$ (A14), where for some $x'$ we have that $w \neq w'$ but with $w_{x'} = w'_{x'}$. To start with, we examine the second term on the R.H.S. From Equation (12), the state splits into a part conditioned on not aborting and a part conditioned on aborting. Then, given the trace-preserving classical mappings that generate the remaining variables of the view, and using the jointly convex property of the trace distance, the trace distance reduces to that between the non-aborting parts. We note that in the definition of database privacy, the trace distance is examined for a fixed (but arbitrary) x, r and cryptographic keys $(s^{dec}_1, s^{enc}_2, s^{dec}_3, s^{enc}_4)$. Hence, the adversarial queries $\bar{q}_1$ and $\bar{q}_2$ are fixed by r and possibly x. Since $\tilde{Q}_1 = \bar{Q}_1 \oplus S^{enc}_2 \oplus S^{dec}_1$ and $\tilde{Q}_2 = \bar{Q}_2 \oplus S^{enc}_4 \oplus S^{dec}_3$, and given that the queries and keys are fixed, we can introduce $\Pi_{\tilde{Q}_1\tilde{Q}_2}(\tilde{q}_1\tilde{q}_2)$ into the state. Since the subsystem $XR\bar{Q}_1\bar{Q}_2S_1S_2S_3S_4E$ is independent of w and the subsystem $\tilde{Q}_1\tilde{Q}_2A_1A_2$ is independent of $S^{enc}_1S^{dec}_2S^{enc}_3S^{dec}_4$, we can remove $XR\bar{Q}_1\bar{Q}_2S_1S_2S_3S_4E$ using the fact that $\Delta(A \otimes B, A \otimes C) \leq \Delta(B, C)$. Since the answer functions are not dependent on the channel type (ideal or real QKD), we can equivalently view the resulting system as one where there are ideal keys. In this case, the user sends the adversarial queries $\tilde{Q}_1$ and $\tilde{Q}_2$, and receives the corresponding answers $A_1$ and $A_2$. Therefore, the trace distance is upper bounded by that between the corresponding user views in this setup, where the inequality is due to the fact that the state is diagonal in $\tilde{Q}_1\tilde{Q}_2A_1A_2$, and that $\tilde{Q}_1\tilde{Q}_2A_1A_2$ is part of the user's view for a setup with user query $\tilde{Q}_1\tilde{Q}_2$ and secure channels. By invoking the 0-database privacy of such a setup, there exists an $x'$ for which this trace distance is zero.
We can therefore conclude that for any x, r and keys $(s^{dec}_1, s^{enc}_2, s^{dec}_3, s^{enc}_4)$, there exists an $x'$ such that $\Delta(\xi^w_{UE}, \xi^{w'}_{UE}) = 0$. Let us now examine the first term on the R.H.S. of Equation (A14). Likewise, the state splits into aborting and non-aborting parts. We note that trace-preserving mappings are applied to $\bar{Q}_1\bar{Q}_2S_1S_2S_3S_4S_5S_6W$ to get $A_1A_2$, so that the trace distance can be upper bounded by one that involves only these input systems together with XR and E. We note that $XR\bar{Q}_1\bar{Q}_2W$ is independent of $S_1S_2S_3S_4S_5S_6E$ and is common to both states. This thus leaves us with a trace distance over $S_1S_2S_3S_4S_5S_6E$ only. We can further partition $S_1S_2S_3S_4S_5S_6E$ into two parts, $S_5S_6$ and $S_1S_2S_3S_4E$, and note that $S_1S_2S_3S_4$ for both states are generated using the real QKD protocol. With this, we may view the latter as some extended side-information $E' = S_1S_2S_3S_4E$. Then, using the security definition of QKD, we get that the trace distance is bounded by $\varepsilon_{cor} + \varepsilon_{sec}$, since $\xi_{S_5S_6E'}$ is an ideal QKD output state conditioned on not aborting. Note that this is true for any $x'$. Combining the results, we conclude that there exists an $x'$ such that $\Delta(\rho^w_{UE}, \rho^{w'}_{UE}) \leq 2(\varepsilon_{cor} + \varepsilon_{sec})$ for $w_{x'} = w'_{x'}$.

Theorem A4. A two-database one-round (0,0,0,0)-secure classical SPIR protocol that uses ε-secure QKD keys in place of ideal keys, where $\varepsilon = \varepsilon_{cor} + \varepsilon_{sec}$, is 4ε-protocol secret.
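Proof. Using the intermediate views $\xi^{x,w,1}_E$ and $\xi^{x,w,2}_E$ and the triangle inequality, we split the protocol secrecy condition as $\Delta(\rho^{x,w}_E, \rho^{x',w'}_E) \leq \Delta(\rho^{x,w}_E, \xi^{x,w,1}_E) + \Delta(\xi^{x,w,1}_E, \xi^{x,w,2}_E) + \Delta(\xi^{x,w,2}_E, \xi^{x',w',2}_E) + \Delta(\xi^{x',w',2}_E, \xi^{x',w',1}_E) + \Delta(\xi^{x',w',1}_E, \rho^{x',w'}_E)$ (A27). We begin by examining the third term on the R.H.S. From Equation (15), the state splits into aborting and non-aborting parts, and using the jointly convex property of the trace distance, the trace distance reduces to that between the non-aborting parts. Since ideal QKD keys are used between all parties, $C_{Q_1}$, $C_{Q_2}$, $C_{A_1}$, and $C_{A_2}$ are uniformly distributed over $\mathcal{C}_{Q_1}$, $\mathcal{C}_{Q_2}$, $\mathcal{C}_{A_1}$, and $\mathcal{C}_{A_2}$, respectively, conditioned on the protocol not failing. With this, we can expand the trace distance to get $\Delta(\xi^{x,w,2}_E, \xi^{x',w',2}_E) = 0$.

Let us now examine the second term on the R.H.S. of Equation (A27). Since ideal QKD keys $(S_1, S_2)$ are used, $C_{Q_1}$ and $C_{A_1}$ are uniformly distributed over $\mathcal{C}_{Q_1}$ and $\mathcal{C}_{A_1}$, respectively, conditioned on the protocol not failing. With this, we can expand the trace distance so that only $C_{Q_2}C_{A_2}E$ remains. We note that a trace-preserving map can be applied to $Q_2WS_3S_4S_6$ to obtain $C_{Q_2}C_{A_2}$, so that the trace distance can be upper bounded by one over $Q_2WS_3S_4S_6E$. We can further partition $S_3S_4S_6E$ into two parts, $S_3S_4$ and $S_6E$, and note that $S_6$ for both states is generated using the real QKD protocol. With this, we may view the latter as some extended side-information $E' = S_6E$. Then, using the security definition of QKD, we get that $\Delta(\xi^{x,w,1}_E, \xi^{x,w,2}_E) \leq \varepsilon_{cor} + \varepsilon_{sec}$.

We next examine the first term on the R.H.S. of Equation (A27). We first obtain $\Delta(\rho^{x,w}_E, \xi^{x,w,1}_E) \leq (1 - p_{\mathrm{fail}})\,\Delta(\sigma^{x,w,\top}_{C_{Q_1}C_{Q_2}C_{A_1}C_{A_2}E}, \xi^{x,w,1,\top}_{C_{Q_1}C_{Q_2}C_{A_1}C_{A_2}E})$. We note that a trace-preserving map can be applied to $Q_1Q_2WS_1S_2S_3S_4S_5S_6$ to obtain $C_{Q_1}C_{Q_2}C_{A_1}C_{A_2}$, so that the trace distance can be upper bounded by one over $Q_1Q_2WS_1S_2S_3S_4S_5S_6E$. Since $Q_1Q_2W$ is independent of $S_1S_2S_3S_4S_5S_6E$, and is common to both terms (with the same x and w), we are left with a trace distance over $S_1S_2S_3S_4S_5S_6E$ only. We can further partition $S_1S_2S_3S_4S_5S_6E$ into two parts, $S_1S_2$ and $S_3S_4S_5S_6E$, and note that $S_3S_4S_5S_6$ is common to both states. With this, we may view the latter as some extended side-information $E' = S_3S_4S_5S_6E$. Then, we get that $\Delta(\rho^{x,w}_E, \xi^{x,w,1}_E) \leq \varepsilon_{cor} + \varepsilon_{sec}$. The fourth and fifth terms are bounded in the same way. Combining the results, we obtain $\Delta(\rho^{x,w}_E, \rho^{x',w'}_E) \leq 4(\varepsilon_{cor} + \varepsilon_{sec})$.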

Appendix B. B 2 Protocol
For simplicity, we consider a database with size $n = m^3$, with one-bit database entries, $W = (w_1, \ldots, w_n) \in \{0, 1\}^n$. We label the entries with index $X = (X_1, X_2, X_3)$, where $X_i \in \{1, \ldots, m\}$ for $i = 1, 2, 3$. The user has a source of local randomness labelled by $R = (R_s, R_d)$. $R_s$ consists of three random subsets, $R^i_s \subseteq \{1, \ldots, m\}$ (which can be expressed as a random m-bit vector as well), and $R_d$ is a set of three values, $R^i_d \in \{1, \ldots, m\}$. Furthermore, we label the pre-shared keys between the two data centres, $K_3K_4$, by (U, T, Y, Z), which are used for CDS. We also define the notation for a set S.

We first define the query used in the $B_2$ protocol. The user first selects a desired index $x = (x_1, x_2, x_3)$ and generates the local random values $R_s$ and $R_d$. The query to data centre 1 is simply $Q_1 = (Q_{1,s}, Q_{1,d})$, where $Q_{1,s} = R_s$ and $Q_{1,d} = R_d$. For the query to data centre 2, the user has to compute $Q^i_{2,d} \equiv x_i - R^i_d \pmod{m}$ and $Q^i_{2,s} = R^i_s \,\triangle\, \{x_i\}$, i.e., $R^i_s$ with the membership of $x_i$ toggled. The query is thus $Q_2 = (Q_{2,s}, Q_{2,d})$. Essentially, the user encodes his desired index both in the set query, as the only element that is contained exclusively in $Q_{1,s}$ or $Q_{2,s}$, and in the index query, as the sum of $Q_{1,d}$ and $Q_{2,d}$ modulo m.
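The query construction above can be sketched as follows. This covers the queries only (the answers and CDS parts are not reproduced). The sketch reads the set query as a symmetric difference, consistent with the description that $x_i$ is the only element contained exclusively in one of the two set queries, and it picks the representative of $x_i - R^i_d$ modulo m in $\{1, \ldots, m\}$; both are interpretive assumptions, and the function names are hypothetical.

```python
import random

def b2_queries(x, m, rng=None):
    """Generate (Q1, Q2) for a desired index x = (x1, x2, x3), entries in 1..m."""
    rng = rng or random.SystemRandom()
    # R_s: three random subsets of {1, ..., m}; R_d: three random values in 1..m.
    r_s = [{j for j in range(1, m + 1) if rng.random() < 0.5} for _ in range(3)]
    r_d = [rng.randint(1, m) for _ in range(3)]
    q2_s = [r_s[i] ^ {x[i]} for i in range(3)]  # toggle membership of x_i
    q2_d = [((x[i] - r_d[i] - 1) % m) + 1 for i in range(3)]  # = x_i - R_d^i mod m
    return (r_s, r_d), (q2_s, q2_d)

def recover_index(q1, q2, m):
    """Consistency check: both encodings of the index, recovered from Q1 and Q2."""
    (q1_s, q1_d), (q2_s, q2_d) = q1, q2
    from_sets = [next(iter(q1_s[i] ^ q2_s[i])) for i in range(3)]
    from_sums = [((q1_d[i] + q2_d[i] - 1) % m) + 1 for i in range(3)]
    return from_sets, from_sums
```

Each query on its own is a uniformly random subset and value, so a single data centre learns nothing about x; `recover_index` is only a sanity check available to someone who holds both queries.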
The data centre answers consist of 8 portions, which are labelled by the index $\sigma \in \{0, 1\}^3$, and one portion responsible for CDS to ensure that the user provides valid queries. The keys used for masking the responses are U and T. U consists of 3 random bits, $U_i$. T consists of 8 bits, $T_\sigma$, of which 7 are random, and the final bit is chosen to ensure $\bigoplus_\sigma T_\sigma = 0$.
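The constraint on T can be met by drawing seven bits at random and fixing the eighth to their XOR. The sketch below assumes that the constraint is an XOR (sum modulo 2) over the eight bits; the function name is hypothetical, and in the protocol these bits would come from the key shared between the data centres rather than from a local random source.

```python
import secrets
from functools import reduce
from operator import xor

def masking_key_t() -> list:
    """Eight bits T_sigma whose XOR is zero: seven random bits plus a parity bit."""
    bits = [secrets.randbelow(2) for _ in range(7)]
    bits.append(reduce(xor, bits))  # final bit forces the total XOR to 0
    return bits
```

Because the eight masks XOR to zero, they cancel when the user combines all eight answer portions, while any proper subset of portions remains masked.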
The keys used for CDS are Y and Z. Y is a set of 6 vectors of length m, $Y_\sigma$, for $\sigma \in \{001, 010, 100, 011, 101, 110\}$, and Z is a set of 3 vectors of length m, $Z_i$. Data centre 1 then computes the answers for $j \in \{1, \ldots, m\}$ and $i = 1, 2, 3$, where $I_S$ is the indicator function of the set S (i.e., $I_S(j) = 1$ if $j \in S$ and $I_S(j) = 0$ if $j \notin S$).
The computed values, together with three additional bits $Y_{011}$, form the answer $A_2$.
The decoding function is obtained by simply performing an XOR on some of the answer bits received by the user. If the user is honest, the correct value of $\hat{w}_x$ can be obtained from the decoding function. Firstly, by taking the sum of the CDS answers, we can retrieve the value of $Z_x$. Since $Q^i_{1,d} + Q^i_{2,d} \equiv x_i \pmod{m}$, the dependency of $A_\sigma$ on $Y_\sigma$ can be removed by choosing $j = x_i$ for the appropriate i. The final decoding is thus an XOR of the selected answer bits.