Article

PVkNN: A Publicly Verifiable and Privacy-Preserving Exact kNN Query Scheme for Cloud-Based Location Services

1
College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
2
College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
*
Authors to whom correspondence should be addressed.
Modelling 2025, 6(2), 44; https://doi.org/10.3390/modelling6020044
Submission received: 11 April 2025 / Revised: 19 May 2025 / Accepted: 30 May 2025 / Published: 3 June 2025

Abstract

The k-nearest-neighbor (kNN) algorithm is crucial in data mining and machine learning, yet its deployment on large-scale datasets within cloud environments presents significant security and efficiency challenges. This paper addresses these challenges by presenting an efficient and secure exact kNN query scheme tailored for spatial datasets in cloud-based location services. Our approach focuses on accelerating query processing while ensuring robust privacy preservation and public verifiability. Key contributions include the establishment of a formal framework underpinned by stringent security definitions, providing a solid groundwork for future advancements. Leveraging Paillier’s homomorphic cryptosystem and public-key signature techniques, our design protects the database, the query access patterns, and the integrity of the results while enabling public verification. Additionally, our scheme improves computational efficiency through optimized data-packing techniques and a simplified Voronoi diagram-based ciphertext index, leading to substantial savings in computational and communication overheads. Theoretical analysis substantiates the correctness, security, and efficiency of our design, and experimental evaluations confirm its practical applicability and scalability across datasets of varying scales.

1. Introduction

The k-nearest-neighbor (kNN) algorithm is a fundamental data mining and machine learning tool used for classification and regression tasks. It identifies the k closest points in a dataset to a given query point and makes decisions based on the properties of these points. In large-scale datasets, particularly those hosted in cloud environments, kNN queries present unique challenges and opportunities in terms of efficiency and privacy. When dealing with large-scale datasets, the computational and memory requirements for performing kNN queries can be substantial. The sheer volume of data can overwhelm local processing capabilities, making cloud-based solutions attractive due to their scalability and resource availability. However, distributing this process across a cloud infrastructure introduces latency and potential bottlenecks in data transmission and processing [1].
Privacy is a critical concern when sensitive data is processed or stored outside local premises, such as in cloud environments. Traditional kNN implementations may expose private data to cloud providers, increasing the risk of unauthorized access and breaches. The verifiability of query results is another significant concern. During information processing, a malicious cloud provider could conceal or alter information, leaving the user unsure about the accuracy of the results and unable to assess them with certainty. Therefore, it is crucial to design privacy-preserving and verifiable mechanisms that protect data and ensure the integrity of query results while still leveraging the cloud’s computational power. To address these challenges, many privacy-preserving and verifiable designs for kNN have been developed [2,3,4,5,6].
To ensure the confidentiality of data during the query process, a range of techniques are employed, including encryption, data anonymization, and secure multi-party computation. For instance, homomorphic encryption allows computations to operate on encrypted data, yielding encrypted results that are decipherable solely by the data owner [3]. Another method involves distributing data across multiple independent servers, each processing segments of the kNN computation without access to the underlying data or final query outcomes [7]. To ensure the verifiability of query results, existing approaches incorporate cryptographic hash functions and signature-based authentication technologies [3,8]. Moreover, to enhance query efficiency, protocols often integrate cryptographic methods with data-efficient structures and algorithms such as tree-based dimensionality reduction. This technique simplifies data while preserving essential characteristics relevant to kNN queries. Additionally, indexing techniques can be tailored to function with encrypted data, thereby reducing query response times without compromising security [9].

1.1. Related Works

In recent years, researchers have proposed various schemes to enable privacy-preserving and verifiable kNN queries on data outsourced to untrusted cloud servers. Depending on their security functionalities, cloud-assisted kNN schemes can be categorized as privacy-preserving or verifiable schemes.
Privacy-preserving kNN schemes without verifiability: Privacy-preserving cloud-assisted kNN schemes aim to protect user data while leveraging cloud resources for query processing. These schemes are essential in scenarios where sensitive data is stored in an honest-but-curious cloud and needs to be analyzed without compromising individuals’ privacy. Traditionally, a secret distance-preserving transformation (DPT) was used to ensure data privacy [10]. However, Wong et al. [11] raised concerns about the vulnerability of DPT-based privacy-preserving kNN query schemes to known-sample attacks (KSA) and known-plaintext attacks (KPA). In response, they proposed an asymmetric scalar product preservation encryption (ASPE) method. Hu et al. [12] introduced a kNN query processing scheme utilizing lightweight order-preserving encryption (OPE) and deterministic random encryption (DRE) to effectively support kNN queries over encrypted data. Unfortunately, subsequent analyses revealed vulnerabilities to chosen-plaintext attacks [13]. Choi et al. [14] and Wang et al. [15] adopted mutable order-preserving encryption (OPE) [16] to improve security. However, these schemes assumed mutual trust between the data owner (DO) and query user (QU), which introduced a serious risk of privacy disclosure of the DO’s key. Recognizing potential mutual distrust among the DO, QU, and cloud server (CS), Zhu et al. [17,18] proposed solutions for securely outsourcing kNN queries by using random blind techniques and Paillier’s additively homomorphic cryptosystem. However, their schemes were susceptible to known-plaintext attacks and failed to conceal data access patterns. Lei et al. [2] addressed outsourced kNN queries in spatial databases, employing vector-based projection technology for data preprocessing but sacrificing query accuracy. Subsequent improvements [19] aimed at enhancing accuracy but led to increased communication complexity.
Further advancements highlighted insecurities in the ASPE method against ciphertext-only attacks [20], prompting the proposal of a 1NN scheme against adaptive chosen-keyword attacks, but it lacked support for secure kNN queries. Recently, Zheng et al. [21] introduced a privacy-preserving kNN query scheme based on asymmetric matrix encryption (AME), offering practicality and security, and Qi et al. [5] presented a privacy-aware kNN protocol under the dual cloud-server model, supporting dynamic data update operations. Despite these advancements, most methods do not consider the privacy of access patterns. To protect access patterns, works exploring the two (or more)-non-colluding cloud servers model have been pursued [22,23]. Elmehdwi et al. [22] designed a secure kNN query outsourcing scheme with Paillier’s homomorphic cryptosystem, achieving data, query, and result privacy at the cost of significant overhead on the QU side. Guan et al. [23] proposed a more efficient oblivious location-based kNN query scheme over the cloud, which is resistant to rainbow-location attacks.
Privacy-preserving kNN schemes with verifiability: Verifiable cloud-assisted kNN schemes focus on ensuring the correctness and integrity of query results obtained from the cloud. These schemes are crucial in scenarios where the cloud provider may be malicious or prone to errors. Compared with privacy-preserving designs for data in kNN queries, few works have focused on the verifiability of the query results. Yiu et al. [24] proposed a framework based on Voronoi MR trees, a variant of Merkle hash trees (MHT), to verify kNN query results and security zones. However, their design does not address data privacy, and most existing approaches relying on tree-based indexes leak access patterns. Rong et al. [25] proposed the Verifiable Secure kNN (VSkNN) query scheme. However, their verification framework is probabilistic, and they employed ASPE as the encryption scheme, which has previously been shown to compromise security [13] by exposing data and access pattern privacy to the cloud. Jiang et al. [26] proposed a verifiable dynamic searchable symmetric encryption scheme based on additive trees, but it only supports Boolean queries, not kNN queries. Subsequently, Wu et al. [27] introduced ServeDB for secure and verifiable range queries, but it is not applicable to kNN queries, does not support access pattern privacy, and reveals data privacy in the verification results. Cui et al. [9] proposed the first secure and verifiable kNN query scheme, called SVkNN, which uses Paillier’s additively homomorphic cryptosystem for privacy, Voronoi graphs for fast queries, and a hash-based message authentication code for verification. To address the efficiency issues identified in [9], Liu et al. [3] proposed SecVKQ, a two-stage framework integrating edge servers to optimize query performance. Later, Cui et al. [8] further considered secure and verifiable kNN queries in multi-user settings and designed an effective scheme called MSVkNN, which employs a two-trapdoor public-key cryptosystem (DT-PKC) [28] for privacy and a hash-based message authentication code for verification. Recently, Zhang et al. [4] developed a secure kNN query scheme that integrates DT-PKC with random projection forests. This innovation not only enhances query efficiency but also ensures the privacy of data, queries, results, and access patterns while allowing for the verification of result correctness.
For transparency, we provide a comparison of the aforementioned works in terms of privacy, verifiability, the type of kNN employed, and the system models, as presented in Table 1.

1.2. Challenges and Contributions

As shown in Table 1, most existing schemes focus solely on data privacy, with only a few considering verifiability. Schemes that address both security properties simultaneously are relatively rare. Among these, most have one or more of the following limitations:
  • Inability to protect the privacy of data access patterns [3].
  • Use of probabilistic verification methods [7,25].
  • Verification is limited to confirming the authenticity of the data source for the query results returned by the cloud.
  • Support is limited to private verification, where the verification algorithm requires the involvement of a private key. This restricts verification to designated verifiers and leads to the inability to achieve accurate arbitration when disputes arise between the querier and the cloud server.
Consequently, designing a privacy-preserving, publicly verifiable scheme that simultaneously ensures data and access pattern privacy while achieving verifiable query results—including both the authenticity of the data source and the completeness of the results—remains a significant challenge.
Aiming at the above challenges, this paper builds on prior work to propose an efficient cloud-assisted kNN query scheme that guarantees data privacy while fulfilling the requirements of public verifiability. Specifically, compared to previous approaches, our contributions are notable in the following key aspects:
  • Formal Framework Establishment and Rigorous Theoretical Analysis: Although numerous privacy-preserving or verifiable kNN query schemes exist, to the best of our knowledge, no strict formal definition of privacy-preserving, publicly verifiable kNN queries has been proposed. Our first contribution is the establishment of the first formal framework for this purpose, clearly identifying its fundamental components and security definitions. Additionally, we conduct a rigorous and detailed theoretical analysis of our proposed scheme designed within this framework, laying a robust foundation for future research and practical advancements in the field.
  • Robust Privacy: By adopting the two-non-colluding cloud servers model and leveraging the semantic security of Paillier’s homomorphic cryptosystem along with the Voronoi diagram-based index structure, we design and integrate a series of secure computation algorithms. These include Secure Division Computation (SDC), Secure Grid Computation (SGC), Secure Nearest-Neighbor (SNN) search, and Secure Voronoi Cell Read (SCR). Together, these algorithms enable secure kNN queries, ensuring the privacy of the original dataset, query data, and query results while also preserving the privacy of data access patterns.
  • Public Verifiability: By integrating Paillier’s homomorphic cryptosystem with public-key signature techniques and a Voronoi diagram-based index structure, our scheme enables public verification of the authenticity and completeness of query results. Unlike previous designs, which restricted verification to designated verifiers, our approach removes this limitation, allowing for accurate arbitration in disputes between the querier and the cloud server.
  • Optimized Computational Efficiency: To ensure the correctness of encryption and decryption, the proposed scheme refines the parameters of a series of secure protocols integrated with data packing techniques, addressing the ambiguous and inaccurate descriptions in Cui et al.’s design [8]. Additionally, it optimizes the Voronoi diagram-based ciphertext index structure. These improvements result in significant computational and communication savings compared to prior schemes, thereby enhancing overall performance and contributing to scalability and practical applicability in real-world scenarios.

1.3. Layout of This Paper

The remainder of this paper is organized as follows. Section 2 describes the system model and the associated security definitions. Section 3 covers the necessary preliminaries. Our main computation outsourcing protocol is detailed in Section 4. In Section 5, we analyze the correctness and security of the proposed protocol. Section 6 presents a theoretical efficiency analysis and a practical performance evaluation. Finally, we summarize our findings in Section 7.

2. System Architecture, Threat Models, and Design Goals

2.1. System Architecture and Threat Models

Following the setup of prior verifiable designs, our kNN query system over a cloud dataset, as illustrated in Figure 1, involves four entities: the data owner (DO), the query user (QU), and two cloud servers (CS1 and CS2). The DO possesses a large-scale spatial dataset $\mathcal{D} = \{D[i] = (i, P_i) \mid i \in [n]\}$, and the QU issues a kNN query to the DO’s dataset. Due to limited storage and computing resources, the DO offloads the storage and associated computation tasks to the cloud servers while maintaining the privacy of the dataset. CS1 and CS2 are two non-colluding, resource-abundant, yet potentially untrusted cloud servers. To protect the privacy of the spatial dataset and support efficient kNN search operations, the DO must determine an appropriate blinding method and an efficient data structure to encrypt and organize the dataset $\mathcal{D}$. The design should also ensure that the QU can verify the integrity and correctness of the query result returned by the cloud. Formally, the framework of the privacy-preserving publicly verifiable kNN query scheme, denoted as PVkNN = (Setup, DSEnc, QuEnc, Search, Verify, ResDec), consists of the following six algorithms:
  • $\{PK, SK\} \leftarrow \mathsf{Setup}(1^{\lambda}, n)$: Given a security parameter $\lambda$ and dataset size $n$, this algorithm generates the public key $PK$ and the secret key $SK$.
  • $\{\mathsf{ED}\} \leftarrow \mathsf{DSEnc}(\mathcal{D}, PK, SK)$: On inputting the dataset $\mathcal{D}$, the public key $PK$, and the secret key $SK$, the DO encrypts the dataset $\mathcal{D}$ into a ciphertext dataset $\mathsf{ED} = \{ED[i] \mid i \in \mathcal{I}\}$, where $\mathcal{I}$ denotes the label set.
  • $\{\widetilde{Q}\} \leftarrow \mathsf{QuEnc}(Q, PK)$: On inputting the query data point $Q$ and the public key $PK$, this algorithm generates the encrypted query point $\widetilde{Q}$.
  • $\{\widetilde{R}, \mathsf{VO}\} \leftarrow \mathsf{Search}(\widetilde{Q}, \mathsf{ED}, PK)$: On inputting the encrypted query $\widetilde{Q}$, the encrypted dataset $\mathsf{ED}$, and the public key $PK$, the search algorithm finds the encrypted query result $\widetilde{R}$ and the associated proof $\mathsf{VO}$.
  • $\{\delta\} \leftarrow \mathsf{Verify}(\widetilde{Q}, \widetilde{R}, \mathsf{VO}, PK)$: On inputting the encrypted query $\widetilde{Q}$, the query result $\widetilde{R}$, the proof $\mathsf{VO}$, and the public key $PK$, the verification algorithm checks whether $\widetilde{R}$ is authentic and complete, producing an output $\delta \in \{True, False\}$.
  • $\{\gamma\} \leftarrow \mathsf{ResDec}(Q, \widetilde{R}, SK, \delta)$: Given the query $Q$, the query result $\widetilde{R}$, the secret key $SK$, and the indicator $\delta$, if $\delta = True$, this algorithm decrypts the returned result $\widetilde{R}$ and recovers the set $\gamma = R$ of k-nearest neighbors to $Q$. Otherwise, it rejects $\widetilde{R}$ and outputs $\gamma = \bot$.
In this kNN query system, the security threat mainly comes from untrustworthy cloud servers or malicious queriers. In our design, we consider the following three threat behaviors: (1) the cloud servers are curious and may attempt to infer confidential information, such as the dataset, the query access pattern, and the query result; (2) the cloud servers are malicious and may deliberately tamper with the query result for financial incentives; and (3) the querier is malicious and may forge a false query result and attribute it to the cloud servers.
Remark 1.
It should be noted that the two-non-colluding cloud servers framework is reasonable in practice. For well-established cloud service providers such as Amazon AWS, Microsoft Azure, or Huawei Cloud, it is highly unlikely that any two of these companies would collude to damage their invaluable reputations. Also, this setup has been extensively employed in secure kNN query scenarios [3,4,5,7,8,9,22,25].

2.2. Design Goals

Considering our system architecture and threat models, we aim to design a secure and efficient kNN query protocol. Specifically, our design should meet the four requirements outlined below.

2.2.1. Correctness

Generally, correctness means that if the cloud servers perform the specified computation task, the design should ensure that an honest querier can finally obtain the k-nearest neighbors to his/her query data point. The formal definition is presented below.
Definition 1 (Correctness).
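A PVkNN scheme is correct if, for any given security parameter $\lambda$, whenever $\{PK, SK\} \leftarrow \mathsf{Setup}(1^{\lambda}, n)$, $\{\mathsf{ED}\} \leftarrow \mathsf{DSEnc}(\mathcal{D}, PK, SK)$, $\{\widetilde{Q}\} \leftarrow \mathsf{QuEnc}(Q, PK)$, and $\{\widetilde{R}, \mathsf{VO}\} \leftarrow \mathsf{Search}(\widetilde{Q}, \mathsf{ED}, PK)$ are honestly performed, the probability
$$\Pr\left[\{\delta = True\} \leftarrow \mathsf{Verify}(\widetilde{Q}, \widetilde{R}, \mathsf{VO}, PK) \,\wedge\, \{\gamma = R\} \leftarrow \mathsf{ResDec}(Q, \widetilde{R}, SK, \delta)\right] \ge 1 - \mathrm{negl}(\lambda),$$
where $\mathrm{negl}(\lambda)$ is a negligible function of $\lambda$.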

2.2.2. Public Verifiability

In the context of verifiable kNN queries, verifiability refers to the ability to verify the authenticity and completeness of the query results returned by the cloud server. Specifically, this involves confirming whether the data in the query result was indeed uploaded by the DO and whether it corresponds precisely to the ciphertext of the k-nearest neighbors to the query point. Verification is considered private if it requires private keys, meaning that only the designated user who possesses the private keys can perform the verification [30,31]. Conversely, verification is public if it relies on public keys and can be performed by any entity [32,33,34]. Clearly, compared to private verifiability, public verifiability provides a more transparent verification mechanism. When disputes arise between the querier and the cloud server regarding the query results, public verifiability ensures the prompt resolution of such conflicts. Here, we define public verifiability using a private-key-independent verification algorithm called Verify, which guarantees that the probability of a cloud server successfully deceiving a verifier into accepting an incorrect result—either inauthentic or incomplete—is negligible. The formal definition is presented below.
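$\mathbf{Exp}_{\mathcal{A}}^{PV}[\Pi, \lambda, n]$:
$\{PK, SK\} \leftarrow \mathsf{Setup}(1^{\lambda}, n)$;
$\{\mathsf{ED}\} \leftarrow \mathsf{DSEnc}(\mathcal{D}, PK, SK)$;
For $a = 0$ to $t - 1$:
     $Q_a \leftarrow \mathcal{A}(PK, \mathsf{ED}, Q_0, \ldots, Q_{a-1})$;
     $\widetilde{Q}_a \leftarrow \mathsf{QuEnc}(Q_a, PK)$;
     $\{\widetilde{R}_a, \mathsf{VO}_a\} \leftarrow \mathsf{Search}(\widetilde{Q}_a, \mathsf{ED}, PK)$;
$Q \leftarrow \mathcal{A}(PK, \mathsf{ED}, (Q_a, \widetilde{Q}_a, \widetilde{R}_a, \mathsf{VO}_a)_{0 \le a \le t-1})$;
$\widetilde{Q} \leftarrow \mathsf{QuEnc}(Q, PK)$;
$\{\widetilde{R}, \mathsf{VO}\} \leftarrow \mathcal{A}(PK, \mathsf{ED}, Q, \widetilde{Q}, (Q_a, \widetilde{Q}_a, \widetilde{R}_a, \mathsf{VO}_a)_{0 \le a \le t-1})$;
$\{\delta\} \leftarrow \mathsf{Verify}(\widetilde{Q}, \widetilde{R}, \mathsf{VO}, PK)$;
$\{\gamma\} \leftarrow \mathsf{ResDec}(Q, \widetilde{R}, SK, \delta)$;
If $\delta = True$ and $\gamma \ne R$:
     output 1;
else
     output 0;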
Definition 2 (Public Verifiability).
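Let $\Pi$ be a PVkNN protocol, and let $\mathcal{A}(\cdot)$ be a probabilistic polynomial-time (PPT) machine. We say that protocol $\Pi$ is publicly verifiable if
$$\mathbf{Adv}_{\mathcal{A}}^{PV}(\Pi, \lambda, n) \le \mathrm{negl}(\lambda),$$
where $\mathbf{Adv}_{\mathcal{A}}^{PV}(\Pi, \lambda, n) = \Pr\left[\mathbf{Exp}_{\mathcal{A}}^{PV}[\Pi, \lambda, n] = 1\right]$.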

2.2.3. Privacy

Our design aims to protect the privacy of the following information: the dataset, the query data, the query result, and the query access pattern. The following privacy aspects are considered:
  • Dataset privacy. Cloud servers cannot obtain any valuable information about the plaintext data in the dataset $\mathcal{D}$.
  • Query data privacy. The QU’s query data should not be revealed to cloud servers.
  • Query result privacy. Apart from the QU, other participants cannot learn the plaintext query result.
  • Access pattern privacy. The identifiers corresponding to the k-nearest neighbors of the query point should not be revealed to the QU and the cloud servers (to prevent any inference attacks) [8,9,22].
In our analysis, we establish the privacy of our design by demonstrating the indistinguishability between the simulation model and the real model.

2.2.4. Efficiency

High efficiency is a generic requirement for any practical protocol. Under the premise of ensuring security, the design should reduce each participant’s computational and communication overheads as much as possible. Specifically, we should design an efficient index structure to support frequent query operations.

3. Preliminaries

3.1. Notations

For ease of description, we describe the frequently used notations in Table 2.

3.2. Permutation

Given a natural number $m > 0$, a permutation $\rho$ is a bijection over the set $\{0, 1, \ldots, m-1\}$. Generally, it can be represented as
$$\begin{pmatrix} 0 & 1 & \cdots & m-1 \\ \rho(0) & \rho(1) & \cdots & \rho(m-1) \end{pmatrix},$$
where $(\rho(0), \rho(1), \ldots, \rho(m-1))$ is some rearrangement of $(0, 1, \ldots, m-1)$. For an m-tuple $v = (v_0, v_1, \ldots, v_{m-1})$, we define $\rho(v) = (v_{\rho(0)}, v_{\rho(1)}, \ldots, v_{\rho(m-1)})$.
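As a small illustration of this notation, the following Python sketch applies a permutation to a tuple. The function name `apply_permutation` and the tuple representation of ρ are assumptions made only for this example.

```python
def apply_permutation(rho, v):
    """Return rho(v) = (v[rho(0)], v[rho(1)], ..., v[rho(m-1)])."""
    m = len(v)
    # rho must be a bijection over {0, 1, ..., m-1}.
    if sorted(rho) != list(range(m)):
        raise ValueError("rho is not a permutation of 0..m-1")
    return tuple(v[rho[i]] for i in range(m))


rho = (2, 0, 1)                      # rho(0)=2, rho(1)=0, rho(2)=1
v = ("a", "b", "c")
print(apply_permutation(rho, v))     # ('c', 'a', 'b')
```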

3.3. Paillier’s Additively Homomorphic Cryptosystem

The well-known Paillier additively homomorphic cryptosystem [35] was proposed by Paillier in 1999. For a message $m \in \mathbb{Z}_N$, its encryption function $E$ is denoted as $c = E(m, r) = (1 + N)^m \cdot r^N \bmod N^2$, where $N$, which is public, is the product of two large prime numbers $p$ and $q$, and $r \in \mathbb{Z}_N^*$ is randomly chosen. The decryption function $D$ with the secret key $sk = \lambda(N) = \mathrm{lcm}(p - 1, q - 1)$ is
$$m = \frac{(c^{\lambda(N)} \bmod N^2) - 1}{N} \cdot \lambda(N)^{-1} \bmod N.$$
The Paillier cryptographic system exhibits the following properties:
  • Homomorphic addition: The decrypted product of two ciphertexts equals the sum of their corresponding plaintexts, and the decrypted kth power of a ciphertext equals the product of k and its corresponding plaintext.
    $D(E(x_0) \cdot E(x_1) \bmod N^2) = x_0 + x_1 \bmod N$, $\quad D(E(x)^k \bmod N^2) = k x \bmod N$.
  • Semantic security: If the decisional composite residuosity problem is hard, then the Paillier cryptosystem is CPA-secure.
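The encryption, decryption, and homomorphic identities above can be checked numerically. The following is a minimal sketch with toy primes (1009 and 1013) that are far too small to be secure; the function names `keygen`, `encrypt`, and `decrypt` are illustrative and not taken from the scheme.

```python
import math
import secrets


def keygen(p, q):
    """Toy key generation from two given primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lambda(N) = lcm(p-1, q-1)
    return n, lam


def encrypt(n, m, r=None):
    """c = (1 + N)^m * r^N mod N^2, with r drawn from Z_N^*."""
    n2 = n * n
    while r is None or math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, lam, c):
    """m = ((c^lam mod N^2) - 1) / N * lam^{-1} mod N."""
    n2 = n * n
    u = pow(c, lam, n2)                  # equals 1 + m*lam*N (mod N^2)
    return ((u - 1) // n) * pow(lam, -1, n) % n


n, lam = keygen(1009, 1013)
c0, c1 = encrypt(n, 20), encrypt(n, 22)
print(decrypt(n, lam, c0 * c1 % (n * n)))      # 42  (20 + 22)
print(decrypt(n, lam, pow(c0, 5, n * n)))      # 100 (5 * 20)
```

The modular inverse via `pow(lam, -1, n)` requires Python 3.8 or later.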

3.4. Digital Signature Algorithm (DSA)

The Digital Signature Algorithm (DSA) is a standard public-key signature scheme based on the discrete logarithm problem [36]. For two large prime numbers $p$ and $q$ satisfying $q \mid (p - 1)$, let $g \in \mathbb{Z}_p^*$ be an element of order $q$, and let $y = g^x \bmod p$ for some randomly chosen $x \in \mathbb{Z}_q^*$. Then, the signature key is $sk = \{x\}$, and the verification key is $pk = \{y\}$. For a message $m$, its signature is $\mathsf{Sig}(m) = (r, s)$, where, for some randomly chosen $k \in \mathbb{Z}_q^*$,
$$r = (g^k \bmod p) \bmod q, \qquad s = k^{-1}(H(m) + x r) \bmod q,$$
and $H(\cdot)$ is a public hash function. Given $(m, r, s)$, the verification algorithm $\{0, 1\} \leftarrow \mathsf{Ver}(m, r, s)$ proceeds as follows: (1) Verify that $r, s \in \mathbb{Z}_q^*$. (2) Calculate $w = s^{-1} \bmod q$, $u_1 = w H(m) \bmod q$, $u_2 = w r \bmod q$, and $v = (g^{u_1} y^{u_2} \bmod p) \bmod q$. (3) The algorithm outputs “1” (the signature is valid) if $v = r$. Otherwise, it outputs “0”.
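The signing and verification equations can be exercised with a toy instance. The sketch below assumes illustrative parameters p = 607, q = 101, g = 64 (q divides p − 1 and g has order q modulo p), which are far too small for real use, and it reduces a SHA-256 digest modulo q as a stand-in for H; these choices are assumptions of the example, not part of the scheme.

```python
import hashlib
import secrets

P, Q, G = 607, 101, 64      # toy parameters: Q | (P - 1), G has order Q mod P


def H(message: bytes) -> int:
    """Hash to an integer mod Q (a simplification of the standard's truncation)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % Q


def keygen():
    x = secrets.randbelow(Q - 1) + 1          # signing key x in [1, Q-1]
    return x, pow(G, x, P)                    # verification key y = g^x mod p


def sign(x, message):
    while True:
        k = secrets.randbelow(Q - 1) + 1      # fresh nonce k in [1, Q-1]
        r = pow(G, k, P) % Q
        s = pow(k, -1, Q) * (H(message) + x * r) % Q
        if r != 0 and s != 0:                 # retry in the rare degenerate case
            return r, s


def verify(y, message, r, s):
    if not (0 < r < Q and 0 < s < Q):         # step (1): range check
        return False
    w = pow(s, -1, Q)                         # step (2)
    u1 = w * H(message) % Q
    u2 = w * r % Q
    v = pow(G, u1, P) * pow(y, u2, P) % P % Q
    return v == r                             # step (3)


x, y = keygen()
r, s = sign(x, b"record")
print(verify(y, b"record", r, s))             # True
```

Because q is tiny here, a forged or altered message can verify by chance; with standard parameter sizes that probability is negligible.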

3.5. Voronoi Diagram

Given a dataset $\mathcal{D}$ consisting of $n$ points $\{P_0, \ldots, P_{n-1}\}$ in the plane, the Voronoi region $Vor(P_i)$ of the point $P_i \in \mathcal{D}$ refers to the set of all points in the plane that are at least as close to $P_i$ as to any other point in $\mathcal{D}$. Precisely,
$$Vor(P_i) = \{x \in \mathbb{R}^2 \mid \|x - P_i\| \le \|x - P_j\|, \ \forall j \ne i\}.$$
Equivalently, for any $j \ne i$, let $H_{ij} = \{x \in \mathbb{R}^2 \mid \|x - P_i\| \le \|x - P_j\|\}$ denote the half-plane, and we have $Vor(P_i) = \bigcap_{j \ne i} H_{ij}$. The point $P_j$ is called a Voronoi-relevant vector of $P_i$ if and only if $Vor(P_j)$ and $Vor(P_i)$ share a boundary line. The set of these Voronoi-relevant vectors is generally denoted as $VRV(P_i)$.
Lemma 1 ([37]). Given a dataset $\mathcal{D} = \{P_0, \ldots, P_{n-1}\}$ and a query point $Q$, the nearest neighbor of the query point $Q$ is $P$ if and only if $Q \in Vor(P)$.
Lemma 2 ([38]). Let $P_1, P_2, \ldots, P_k$ be the $k$ ($k \ge 2$) nearest neighbors of a given query point $Q$. Then, $P_k \in VRV(P_1) \cup VRV(P_2) \cup \cdots \cup VRV(P_{k-1})$.
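Lemma 2 suggests an incremental plaintext search: once the nearest neighbor is known, each further neighbor can be picked from the Voronoi-relevant vectors of those already found. The following sketch illustrates this on unencrypted points. It assumes points in general position and derives the neighbor sets by a brute-force empty-circumcircle (Delaunay) test, which is a substitute chosen for brevity and is not the index construction used by the scheme; all function names are hypothetical.

```python
import itertools
import math
import random


def circumcircle(a, b, c):
    """Center and radius of the circle through a, b, c (None if collinear)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.dist((ux, uy), a)


def voronoi_neighbors(pts):
    """vrv[i] = indices whose Voronoi cells share an edge with cell i."""
    n = len(pts)
    vrv = {i: set() for i in range(n)}
    for i, j, k in itertools.combinations(range(n), 3):
        cc = circumcircle(pts[i], pts[j], pts[k])
        if cc is None:
            continue
        center, rad = cc
        # A triangle is Delaunay iff no other point lies inside its circumcircle.
        if all(math.dist(center, pts[l]) >= rad - 1e-9
               for l in range(n) if l not in (i, j, k)):
            for u, v in ((i, j), (j, k), (i, k)):
                vrv[u].add(v)
                vrv[v].add(u)
    return vrv


def knn_by_voronoi(pts, q, k):
    """Find the k nearest neighbors of q by expanding Voronoi neighbors."""
    vrv = voronoi_neighbors(pts)
    dist = lambda i: math.dist(pts[i], q)
    first = min(range(len(pts)), key=dist)        # Lemma 1: q lies in Vor(P_first)
    result, candidates = [first], set(vrv[first])
    while len(result) < k and candidates:
        nxt = min(candidates, key=dist)           # Lemma 2: next NN is a candidate
        result.append(nxt)
        candidates |= vrv[nxt]
        candidates -= set(result)
    return result


random.seed(1)
points = [(random.random(), random.random()) for _ in range(25)]
print(knn_by_voronoi(points, (0.4, 0.6), 5))
```

The brute-force neighbor computation is quartic in the number of points and is only meant for small illustrations.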

3.6. Data Packing with Paillier’s Cryptosystem

For a large-scale dataset, encrypting and decrypting each data point item by item is time-consuming. To improve encryption/decryption efficiency, Liu et al. [39] introduced a data-packing encryption/decryption technique. Here, we present their design with Paillier’s cryptosystem. For $\lambda$ $\sigma$-bit integers $x_1, x_2, \ldots, x_\lambda$ with $2^{\sigma\lambda} < N$, unlike traditional methods that encrypt them into $E(x_1), E(x_2), \ldots, E(x_\lambda)$ item by item, the data-packing technique encrypts them into a single ciphertext. It comprises the following four algorithms:
  • Data packing $x_1 | x_2 | \cdots | x_\lambda \leftarrow \mathsf{Pack}(x_1, \ldots, x_\lambda)$:
    $x_1 | x_2 | \cdots | x_\lambda = \sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)}$.
  • Data encryption $c \leftarrow \mathsf{DataEnc}(x_1 | x_2 | \cdots | x_\lambda)$:
    $c = E(x_1 | x_2 | \cdots | x_\lambda) = E\left(\sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)}\right)$.
  • Data decryption $x \leftarrow \mathsf{DataDec}(c)$:
    $x = D(c)$.
  • Data unpacking $(x_1, x_2, \ldots, x_\lambda) \leftarrow \mathsf{Unpack}(x)$: $x_\lambda = \mathrm{lsb}_{\sigma}(x)$, $x_{\lambda - i} = \mathrm{lsb}_{(i\sigma + 1) \sim (i + 1)\sigma}(x)$ for $i = 1, \ldots, \lambda - 1$, where $\mathrm{lsb}_{(i\sigma + 1) \sim (i + 1)\sigma}(x)$ denotes the binary bits between the $(i\sigma + 1)$th position and the $(i + 1)\sigma$th position of $x$ counting from the least significant bit.
The correctness of the data decryption follows from the fact that, since $2^{\lambda\sigma} < N$,
$$D(c) = D\left(E\left(\sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)}\right)\right) = \sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)} \bmod N = \sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)} = x.$$
The condition $2^{\lambda\sigma} < N$ is critical to guarantee the equivalence
$$\sum_{i=1}^{\lambda} x_i \cdot 2^{\sigma(\lambda - i)} \bmod N = \sum_{i=1}^{\lambda} x_i \cdot 2^{\sigma(\lambda - i)},$$
as it ensures that the summation remains strictly smaller than the modulus $N$, thereby avoiding modular reduction. Also, if we know the ciphertexts $E(x_i)$, we can obtain the ciphertext $E(x_1 | x_2 | \cdots | x_\lambda)$ by using the homomorphic property, i.e.,
$$E(x_1 | \cdots | x_\lambda) = E\left(\sum_{i=1}^{\lambda} x_i 2^{\sigma(\lambda - i)}\right) = \prod_{i=1}^{\lambda} E(x_i)^{2^{\sigma(\lambda - i)}} \bmod N^2.$$
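The packing identities can be checked with a short script. The sketch below assumes σ = 8, three packed values, and a toy Paillier modulus N = 10007 · 10009, chosen only so that 2^{σλ} < N holds; the helper names (`pack`, `unpack`, `enc`, `dec`, `pack_ciphertexts`) are illustrative.

```python
import math
import secrets

SIGMA = 8                                   # bit width of each packed value


def pack(xs, sigma=SIGMA):
    """x_1|...|x_lam = sum_i x_i * 2^(sigma*(lam - i)); x_1 is most significant."""
    lam = len(xs)
    return sum(x << (sigma * (lam - i)) for i, x in enumerate(xs, start=1))


def unpack(x, lam, sigma=SIGMA):
    """Recover (x_1, ..., x_lam) by slicing sigma-bit fields out of x."""
    mask = (1 << sigma) - 1
    return [(x >> (sigma * (lam - i))) & mask for i in range(1, lam + 1)]


# Toy Paillier instance (insecure sizes), only to exercise the homomorphism.
p, q = 10007, 10009
N, N2 = p * q, (p * q) ** 2
LAM = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)


def enc(m):
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return pow(1 + N, m, N2) * pow(r, N, N2) % N2


def dec(c):
    return (pow(c, LAM, N2) - 1) // N * pow(LAM, -1, N) % N


def pack_ciphertexts(cs, sigma=SIGMA):
    """E(x_1|...|x_lam) = prod_i E(x_i)^(2^(sigma*(lam - i))) mod N^2."""
    lam, out = len(cs), 1
    for i, c in enumerate(cs, start=1):
        out = out * pow(c, 1 << (sigma * (lam - i)), N2) % N2
    return out


xs = [17, 200, 5]
assert 2 ** (SIGMA * len(xs)) < N           # the condition 2^(sigma*lam) < N
packed = pack_ciphertexts([enc(x) for x in xs])
print(unpack(dec(packed), len(xs)))         # [17, 200, 5]
```

If the condition 2^{σλ} < N were violated, the packed plaintext would wrap modulo N and unpacking would return unrelated values.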

4. Our Main Design

4.1. Design Intuition and Basic Idea

Inspired by [8,9], we adopt Paillier’s homomorphic cryptosystem to preserve the privacy of the dataset and the query data, together with a “baby-step–giant-step” strategy to construct an efficient index data structure. Specifically, the spatial dataset is initially divided into several large rectangular regions, and each rectangle is further represented using a refined Voronoi diagram-based index. During the kNN search, we first perform a giant step to locate the target rectangular region, followed by a baby step to traverse the Voronoi diagram-based index and identify the individual points. To enable efficient search operations over the ciphertext dataset, we develop a series of secure protocols by integrating Paillier’s homomorphic encryption algorithm with the data-packing techniques, building upon previous designs. These protocols include the Secure Division Computation (SDC), Secure Grid Computation (SGC), Secure Nearest-Neighbor (SNN) search, and Secure Voronoi Cell Read (SCR) protocols, collectively enabling secure kNN queries. Since Paillier’s cryptosystem operates in a ring modulo a large integer and the data-packing technique is sensitive to parameter selection, we refine and tighten the previous designs to ensure correctness and optimize computational efficiency. This involves optimizing the index structure, improving parameter selection strategies, and addressing ambiguous or inaccurate descriptions. More importantly, previous designs lack support for public verifiability. In their approaches, the DO needs to send a secret verification key to the QU through a secure channel, or an additional Certificate Authority (CA) is needed to generate and distribute keys through secure channels. To avoid this strong security assumption, as well as to prevent a malicious querier from forging a false query result and attributing it to the cloud servers, we integrate a public-key signature scheme with the Voronoi diagram-based index. This ensures both the authenticity and completeness of the query results while supporting public verifiability.

4.2. Our Main Outsourcing Protocol

4.2.1. DO Dataset Preprocessing Stage

This stage, carried out by the DO, is a one-time process. Assume that the spatial dataset is $\mathcal{D} = \{(i, P_i) \mid i \in [n]\}$, where each element comprises an identifier (id) $i$ and its corresponding point $P_i = (x_i, y_i) \in \mathbb{R}^2$. Without loss of generality, we bound $x_i \in [0, X]$ and $y_i \in [0, Y]$ for any $i \in [n]$, with $\max\{X, Y, n - 1\} < 2^{\sigma}$. The aim of this stage is to construct two indexes ($I_1$ and $I_2$) with the following steps:
  • Given a natural number $m$ of appropriate size, the DO divides the region $[0, X] \times [0, Y]$ into $m^2$ small rectangular regions $S_{st} = \left[\frac{X}{m} s, \frac{X}{m}(s + 1)\right] \times \left[\frac{Y}{m} t, \frac{Y}{m}(t + 1)\right]$ for $s, t \in [m]$.
  • For each point $P_i$, the DO finds its $\ell_i$ Voronoi-relevant vectors
    $VRV(P_i) = \{(i_1, P_{i_1}), (i_2, P_{i_2}), \ldots, (i_{\ell_i}, P_{i_{\ell_i}})\}$.
  • The DO constructs the index $I_1 = \{G_{st} \mid s, t \in [m]\}$, where $G_{st} = \{(st_1, P_{st_1}), (st_2, P_{st_2}), \ldots, (st_{\lambda_{st}}, P_{st_{\lambda_{st}}})\}$ refers to a subset of $\mathcal{D}$ and, for any point $P_{st_u} \in G_{st}$, $Vor(P_{st_u}) \cap S_{st} \ne \emptyset$.
  • The DO constructs the index $I_2 = \{(i, P_i, VRV(P_i)) \mid i \in [n]\}$.
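To make the grid partition concrete, the sketch below maps a plaintext point to the indices (s, t) of the rectangle S_st that contains it. This is only a plaintext analogue of locating the target rectangle; the scheme performs the corresponding step over ciphertexts, which this example does not model. The function name `grid_cell` is hypothetical.

```python
def grid_cell(point, X, Y, m):
    """Return (s, t) such that the point lies in S_st of the m x m grid."""
    x, y = point
    if not (0 <= x <= X and 0 <= y <= Y):
        raise ValueError("point outside [0, X] x [0, Y]")
    # Each cell is X/m wide and Y/m high. Clamp so that x = X (or y = Y)
    # falls into the last cell instead of the out-of-range index m.
    s = min(int(x * m // X), m - 1)
    t = min(int(y * m // Y), m - 1)
    return s, t


print(grid_cell((25, 99), X=100, Y=100, m=4))    # (1, 3)
print(grid_cell((100, 0), X=100, Y=100, m=4))    # (3, 0)
```

Points on an interior cell boundary belong to two rectangles under the closed-interval definition; this sketch assigns them to the cell with the larger index.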

4.2.2. System Setup Stage: Setup

This stage is also performed by the DO and is a one-time task. Given a security parameter $\kappa$, the DO first invokes the key-generation algorithm in Paillier’s homomorphic cryptosystem to generate two random large prime numbers $p$ and $q$ with the same bit length $\kappa$ and calculates $N = pq$ and $\lambda(N) = \mathrm{lcm}(p - 1, q - 1)$. Then, the DO generates the signature-verification key pair $(sigk, verk)$ for the DSA. Finally, the DO publishes the public key $PK = \{pk_1 = N, pk_2 = verk\}$ and keeps the private key $SK = \{sk_1 = \lambda(N), sk_2 = sigk\}$ secret.

4.2.3. Dataset Encryption Stage: DSEnc

This stage is performed by the DO and is a one-time task. As shown in the example illustrated in Figure 2, with the public key $pk_1$ for encryption and $sk_2$ for signing, the DO takes $w = \lceil \sqrt{n} \rceil$ and encrypts the dataset $D$ into $ED = (E(I_1), E(I_2))$ with
$$E(I_1) = \{E(G_{st}) \mid s, t \in [m]\}, \qquad E(I_2) = \{B_j \mid j = 0, 1, \ldots, \lceil n/w \rceil - 1\}.$$
Here, for $E(I_1)$,
$$E(G_{st}) = E\big(i_1^{(st)} \mid P_{i_1}^{(st)} \mid i_2^{(st)} \mid P_{i_2}^{(st)} \mid \cdots \mid i_\lambda^{(st)} \mid P_{i_\lambda}^{(st)}\big) = E\big(i_1^{(st)} \mid x_{i_1}^{(st)} \mid y_{i_1}^{(st)} \mid i_2^{(st)} \mid x_{i_2}^{(st)} \mid y_{i_2}^{(st)} \mid \cdots \mid i_\lambda^{(st)} \mid x_{i_\lambda}^{(st)} \mid y_{i_\lambda}^{(st)}\big)$$
with $P_{i_u}^{(st)} = (x_{i_u}^{(st)}, y_{i_u}^{(st)})$ for $u = 1, \ldots, \lambda$ and $\lambda = \max_{0 \le s, t \le m-1}\{\lambda_{st}\}$. Specifically, if the number of points contained in some grid is less than $\lambda$, we pad it with random points chosen from $D$. For $E(I_2)$,
$$B_j = \{(E(jw), E(VRV(P_{jw})), E(Sig(P_{jw}))), \ldots, (E((j+1)w-1), E(VRV(P_{(j+1)w-1})), E(Sig(P_{(j+1)w-1})))\},$$
$$B_{\lceil n/w \rceil - 1} = \{(E((\lceil n/w \rceil - 1)w), E(VRV(P_{(\lceil n/w \rceil - 1)w})), E(Sig(P_{(\lceil n/w \rceil - 1)w}))), \ldots, (E(\lceil n/w \rceil w - 1), E(VRV(P_{\lceil n/w \rceil w - 1})), E(Sig(P_{\lceil n/w \rceil w - 1})))\},$$
are $\lceil n/w \rceil$ buckets, each consisting of $w$ three-tuples. Specifically, for the last bucket $B_{\lceil n/w \rceil - 1}$, we pad the subscripts $n, n+1, \ldots, \lceil n/w \rceil w - 1$ with random data points if it contains fewer than $w$ three-tuples. Moreover, for any $i \in [n]$, $E(VRV(P_i)) = E(i_1 \mid P_{i_1} \mid i_2 \mid P_{i_2} \mid \cdots \mid i_L \mid P_{i_L})$ and $E(Sig(P_i)) = E(\mathsf{Sig}(P_i \,\|\, P_{i_1} \,\|\, \cdots \,\|\, P_{i_L}))$. Similarly, if the number of points contained in some $VRV(P_i)$ is less than $L = \max\{\ell_0, \ldots, \ell_{n-1}\}$, we pad it with random points chosen from $D$. Finally, the DO uploads $(ED, E(X/m), E(Y/m))$ to CS1 and the private key $sk_1$ to CS2.
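The packing operator "$\mid$" used above can be pictured as placing each value in its own fixed-width slot of one large integer. The sketch below is a plaintext illustration only, assuming a slot width of `sigma` bits with the first value in the most significant slot; the helper names `pack` and `unpack` are hypothetical and do not come from the paper.

```python
def pack(values, sigma):
    """Concatenate values into one integer, one sigma-bit slot each (first value most significant)."""
    packed = 0
    for v in values:
        if not 0 <= v < (1 << sigma):
            raise ValueError("value does not fit in a sigma-bit slot")
        packed = (packed << sigma) | v
    return packed

def unpack(packed, sigma, count):
    """Recover `count` slot values from a packed integer, in their original order."""
    mask = (1 << sigma) - 1
    slots = [(packed >> (sigma * k)) & mask for k in range(count)]
    return slots[::-1]

# One grid entry list (id | x | y | id | x | y) packed into a single plaintext.
entry = [3, 10, 20, 7, 30, 40]
plaintext = pack(entry, 8)
recovered = unpack(plaintext, 8, len(entry))
```

As long as no slot overflows, adding two packed integers adds them slot by slot, which is why a packed vector of random numbers can blind a packed ciphertext with a single homomorphic addition. This is also why the slot width must leave headroom for the blinded values.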

4.2.4. QU Query Encryption Stage: QuEnc

This stage is performed by the QU. When the QU issues a kNN query request $Q = (x_q, y_q)$, he/she encrypts it into $E(Q) = (E(x_q), E(y_q))$ with the public key $pk_1$ and then queries CS1 with $E(Q)$.

4.2.5. CS Search Stage: Search

This stage is cooperatively performed by the two non-colluding cloud servers CS1 and CS2 and is a query-based one-time operation. Algorithm 1 presents the details. In general, this stage proceeds as follows:
Algorithm 1 kNN$(E_{pk}(I), E_{pk}(X/m), E_{pk}(Y/m), E(Q), pk_1, sk_1)$
Input: CS1: $E_{pk}(I) = (E(I_1), E(I_2))$, $(E_{pk}(X/m), E_{pk}(Y/m))$, $E(Q)$, $pk_1$; CS2: the public–private key pair $(pk_1, sk_1)$
Output: Result sets $R_1, R_2$ and verification sets $VO_1, VO_2$
   1: CS1 initializes two empty sets $R_1, VO_1$; CS2 initializes two empty sets $R_2, VO_2$
   2: CS1 and CS2 jointly perform Algorithm 2: $(E(\lfloor x_q/(X/m) \rfloor), E(\lfloor y_q/(Y/m) \rfloor)) \leftarrow \mathrm{SDC}(E(Q), (E(X/m), E(Y/m)), pk_1, sk_1)$
   3: CS1 and CS2 jointly perform Algorithm 3: $E(G_{\hat{s}\hat{t}}) \leftarrow \mathrm{SGC}(E(I_1), E(\lfloor x_q/(X/m) \rfloor), E(\lfloor y_q/(Y/m) \rfloor), pk_1, sk_1)$
   4: CS1 and CS2 jointly perform Algorithm 4: $(E(id_{\min}^{(1)}), E(x_{id_{\min}}^{(1)}), E(y_{id_{\min}}^{(1)})) \leftarrow \mathrm{SNN}(E(G_{\hat{s}\hat{t}}), E(Q), \emptyset, pk_1, sk_1)$
   5: CS1 generates three random numbers $r_1^{(1)}, r_2^{(1)}, r_3^{(1)} \in \mathbb{Z}_N$ and updates $R_1 \leftarrow R_1 \cup \{(r_2^{(1)}, r_3^{(1)})\}$
   6: CS1 calculates $(E({id'}_{\min}^{(1)}), E({x'}_{id_{\min}}^{(1)}), E({y'}_{id_{\min}}^{(1)})) \leftarrow (E(id_{\min}^{(1)}) \times E(r_1^{(1)}), E(x_{id_{\min}}^{(1)}) \times E(r_2^{(1)}), E(y_{id_{\min}}^{(1)}) \times E(r_3^{(1)}))$ and sends it to CS2
   7: CS2 decrypts $({id'}_{\min}^{(1)}, {x'}_{id_{\min}}^{(1)}, {y'}_{id_{\min}}^{(1)}) \leftarrow (D(E({id'}_{\min}^{(1)})), D(E({x'}_{id_{\min}}^{(1)})), D(E({y'}_{id_{\min}}^{(1)})))$ and updates $R_2 \leftarrow R_2 \cup \{({x'}_{id_{\min}}^{(1)}, {y'}_{id_{\min}}^{(1)})\}$
   8: CS1 and CS2 jointly perform Algorithm 5: $(E(VRV(P_{id_{\min}}^{(1)})), E(Sig(P_{id_{\min}}^{(1)}))) \leftarrow \mathrm{SCR}(E(I_2), E(id_{\min}^{(1)}), pk_1, sk_1)$
   9: CS1 generates $3L+1$ random numbers $r_{4,1}^{(1)}, r_{4,2}^{(1)}, \ldots, r_{4,3L}^{(1)}, r_5^{(1)} \in \mathbb{Z}_N$, packs $r_{4,1}^{(1)}, \ldots, r_{4,3L}^{(1)}$ into $r_0^{(1)} = r_{4,1}^{(1)} \mid r_{4,2}^{(1)} \mid \cdots \mid r_{4,3L}^{(1)}$, and updates $VO_1 \leftarrow VO_1 \cup \{((r_{4,2}^{(1)}, r_{4,3}^{(1)}), \ldots, (r_{4,3L-1}^{(1)}, r_{4,3L}^{(1)}), r_5^{(1)})\}$
  10: CS1 calculates $E(VRV'(P_{id_{\min}}^{(1)})) \leftarrow E(VRV(P_{id_{\min}}^{(1)})) \times E(r_0^{(1)})$, $E(Sig'(P_{id_{\min}}^{(1)})) \leftarrow E(Sig(P_{id_{\min}}^{(1)})) \times E(r_5^{(1)})$ and sends them to CS2
  11: CS2 decrypts $(VRV'(P_{id_{\min}}^{(1)}), Sig'(P_{id_{\min}}^{(1)})) \leftarrow (D(E(VRV'(P_{id_{\min}}^{(1)}))), D(E(Sig'(P_{id_{\min}}^{(1)}))))$
  12: CS2 unpacks $\{{id'}_1^{(1)}, {P'}_{id_1}^{(1)}, \ldots, {id'}_L^{(1)}, {P'}_{id_L}^{(1)}\} \leftarrow \mathrm{Unpack}(VRV'(P_{id_{\min}}^{(1)}))$, where ${P'}_{id_1}^{(1)} = ({x'}_{id_1}^{(1)}, {y'}_{id_1}^{(1)}), \ldots, {P'}_{id_L}^{(1)} = ({x'}_{id_L}^{(1)}, {y'}_{id_L}^{(1)})$
  13: CS2 updates $VO_2 \leftarrow VO_2 \cup \{({P'}_{id_1}^{(1)}, \ldots, {P'}_{id_L}^{(1)}, Sig'(P_{id_{\min}}^{(1)}))\}$
  14: for $j = 2$ to $k$ do
  15:     CS1 and CS2 jointly perform Algorithm 4: $(E(id_{\min}^{(j)}), E(x_{id_{\min}}^{(j)}), E(y_{id_{\min}}^{(j)})) \leftarrow \mathrm{SNN}(\{E(VRV(P_{id_{\min}}^{(1)})), \ldots, E(VRV(P_{id_{\min}}^{(j-1)}))\}, E(Q), \{E(id_{\min}^{(1)}), \ldots, E(id_{\min}^{(j-1)})\}, pk_1, sk_1)$
  16:     CS1 generates three random numbers $r_1^{(j)}, r_2^{(j)}, r_3^{(j)} \in \mathbb{Z}_N$ and updates $R_1 \leftarrow R_1 \cup \{(r_2^{(j)}, r_3^{(j)})\}$
  17:     CS1 calculates $(E({id'}_{\min}^{(j)}), E({x'}_{id_{\min}}^{(j)}), E({y'}_{id_{\min}}^{(j)})) \leftarrow (E(id_{\min}^{(j)}) \times E(r_1^{(j)}), E(x_{id_{\min}}^{(j)}) \times E(r_2^{(j)}), E(y_{id_{\min}}^{(j)}) \times E(r_3^{(j)}))$ and sends it to CS2
  18:     CS2 decrypts $({id'}_{\min}^{(j)}, {x'}_{id_{\min}}^{(j)}, {y'}_{id_{\min}}^{(j)}) \leftarrow (D(E({id'}_{\min}^{(j)})), D(E({x'}_{id_{\min}}^{(j)})), D(E({y'}_{id_{\min}}^{(j)})))$ and updates $R_2 \leftarrow R_2 \cup \{({x'}_{id_{\min}}^{(j)}, {y'}_{id_{\min}}^{(j)})\}$
  19:     CS1 and CS2 jointly perform Algorithm 5: $(E(VRV(P_{id_{\min}}^{(j)})), E(Sig(P_{id_{\min}}^{(j)}))) \leftarrow \mathrm{SCR}(E(I_2), E(id_{\min}^{(j)}), pk_1, sk_1)$
  20:     CS1 generates $3L+1$ random numbers $r_{4,1}^{(j)}, r_{4,2}^{(j)}, \ldots, r_{4,3L}^{(j)}, r_5^{(j)} \in \mathbb{Z}_N$, packs $r_{4,1}^{(j)}, \ldots, r_{4,3L}^{(j)}$ into $r_0^{(j)} = r_{4,1}^{(j)} \mid r_{4,2}^{(j)} \mid \cdots \mid r_{4,3L}^{(j)}$, and updates $VO_1 \leftarrow VO_1 \cup \{((r_{4,2}^{(j)}, r_{4,3}^{(j)}), \ldots, (r_{4,3L-1}^{(j)}, r_{4,3L}^{(j)}), r_5^{(j)})\}$
  21:     CS1 calculates $E(VRV'(P_{id_{\min}}^{(j)})) \leftarrow E(VRV(P_{id_{\min}}^{(j)})) \times E(r_0^{(j)})$, $E(Sig'(P_{id_{\min}}^{(j)})) \leftarrow E(Sig(P_{id_{\min}}^{(j)})) \times E(r_5^{(j)})$ and sends them to CS2
  22:     CS2 decrypts $(VRV'(P_{id_{\min}}^{(j)}), Sig'(P_{id_{\min}}^{(j)})) \leftarrow (D(E(VRV'(P_{id_{\min}}^{(j)}))), D(E(Sig'(P_{id_{\min}}^{(j)}))))$
  23:     CS2 unpacks $\{{id'}_1^{(j)}, {P'}_{id_1}^{(j)}, \ldots, {id'}_L^{(j)}, {P'}_{id_L}^{(j)}\} \leftarrow \mathrm{Unpack}(VRV'(P_{id_{\min}}^{(j)}))$, where ${P'}_{id_1}^{(j)} = ({x'}_{id_1}^{(j)}, {y'}_{id_1}^{(j)}), \ldots, {P'}_{id_L}^{(j)} = ({x'}_{id_L}^{(j)}, {y'}_{id_L}^{(j)})$
  24:     CS2 updates $VO_2 \leftarrow VO_2 \cup \{({P'}_{id_1}^{(j)}, \ldots, {P'}_{id_L}^{(j)}, Sig'(P_{id_{\min}}^{(j)}))\}$
  25: CS1 sends $R_1 = \{(r_2^{(j)}, r_3^{(j)}) \mid j = 1, \ldots, k\}$ and $VO_1 = \{((r_{4,2}^{(j)}, r_{4,3}^{(j)}), \ldots, (r_{4,3L-1}^{(j)}, r_{4,3L}^{(j)}), r_5^{(j)}) \mid j = 1, \ldots, k\}$ to the QU
  26: CS2 sends $R_2 = \{({x'}_{id_{\min}}^{(j)}, {y'}_{id_{\min}}^{(j)}) \mid j = 1, \ldots, k\}$ and $VO_2 = \{({P'}_{id_1}^{(j)}, \ldots, {P'}_{id_L}^{(j)}, Sig'(P_{id_{\min}}^{(j)})) \mid j = 1, \ldots, k\}$ to the QU
Here, primed symbols denote values blinded by the random numbers that CS1 adds homomorphically.
(1)
With the inputs $E(Q) = (E(x_q), E(y_q))$ and $(E(X/m), E(Y/m))$, and the public–private key pair $(pk_1, sk_1)$, cloud servers CS1 and CS2 interact with each other to calculate the ciphertexts $E(\lfloor x_q/(X/m) \rfloor)$ and $E(\lfloor y_q/(Y/m) \rfloor)$ using Algorithm 2.
(2)
With the inputs $E(I_1)$, $E(\lfloor x_q/(X/m) \rfloor)$, $E(\lfloor y_q/(Y/m) \rfloor)$, $pk_1$, and $sk_1$, cloud servers CS1 and CS2 collaborate to execute Algorithm 3, traversing $E(I_1)$ to locate the target grid $E(G_{\hat{s}\hat{t}})$ with $\hat{s} = \lfloor x_q/(X/m) \rfloor$ and $\hat{t} = \lfloor y_q/(Y/m) \rfloor$, which means that $Q \in G_{\hat{s}\hat{t}}$. The core idea of this algorithm is to determine $(\hat{s}, \hat{t})$ by checking whether both $E(\lfloor x_q/(X/m) \rfloor) \times (E(s))^{N-1} = E(\lfloor x_q/(X/m) \rfloor - s)$ and $E(\lfloor y_q/(Y/m) \rfloor) \times (E(t))^{N-1} = E(\lfloor y_q/(Y/m) \rfloor - t)$ are ciphertexts of 0 for some $s, t \in \{0, \ldots, m-1\}$. Note that $\lfloor x_q/(X/m) \rfloor - s$ may be negative; the parameter $T$ ensures that $E(\lfloor x_q/(X/m) \rfloor) \times (E(s))^{N-1} \times E(T) = E(\lfloor x_q/(X/m) \rfloor - s + T)$ is a ciphertext of a positive number.
Algorithm 2 SDC$(E(Q), (E(X/m), E(Y/m)), pk_1, sk_1)$
Input: CS1: $E(Q) = (E(x_q), E(y_q))$, $(E(X/m), E(Y/m))$ and a security parameter $\tau < \kappa - \frac{\sigma}{2} - 1$; CS2: $pk_1, sk_1$.
Output: CS1 obtains the quotients $E(\lfloor x_q/(X/m) \rfloor)$ and $E(\lfloor y_q/(Y/m) \rfloor)$
   1: CS1 generates two random numbers $r_1, r_2 \in \mathbb{Z}_N$ with $\tau$ bits
   2: CS1 calculates $x' \leftarrow E(x_q)^{r_1} \times E(X/m)^{r_1 r_2}$, $u' \leftarrow E(X/m)^{r_1}$ and $y' \leftarrow E(y_q)^{r_1} \times E(Y/m)^{r_1 r_2}$, $v' \leftarrow E(Y/m)^{r_1}$
   3: CS1 sends $(x', y')$ and $(u', v')$ to CS2
   4: CS2 decrypts $(x, y) \leftarrow (D(x'), D(y'))$ and $(u, v) \leftarrow (D(u'), D(v'))$
   5: CS2 calculates the quotients $d_x = \lfloor x/u \rfloor$ and $d_y = \lfloor y/v \rfloor$
   6: CS2 encrypts $d'_x \leftarrow E(d_x)$ and $d'_y \leftarrow E(d_y)$ with $pk_1$
   7: CS2 sends $(d'_x, d'_y)$ to CS1
   8: CS1 calculates the quotients $E(\lfloor x_q/(X/m) \rfloor) = d'_x \times E(r_2)^{N-1}$ and $E(\lfloor y_q/(Y/m) \rfloor) = d'_y \times E(r_2)^{N-1}$
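The arithmetic behind Algorithm 2 can be checked at the plaintext level. The sketch below leaves out encryption entirely and only mirrors what each server computes on the underlying integers; the function name `sdc_plain`, the parameter `tau`, and the assumption that the cell width $X/m$ is a positive integer are illustrative choices, not part of the paper.

```python
import random

def sdc_plain(x_q, cell_width, tau=16):
    """Plaintext-level walk-through of the SDC blinding (no encryption involved)."""
    r1 = random.randrange(1, 1 << tau)
    r2 = random.randrange(1, 1 << tau)
    # Values CS2 would see after decryption: blinded dividend and blinded divisor.
    x = r1 * x_q + r1 * r2 * cell_width
    u = r1 * cell_width
    # CS2's integer quotient equals floor(x_q / cell_width) + r2.
    d_x = x // u
    # CS1 removes r2 (done homomorphically in the protocol via E(r2)^(N-1)).
    return d_x - r2

# The recovered value is the grid index of the query coordinate.
grid_index = sdc_plain(137, 25)
```

Because $r_1$ cancels in the quotient and $r_2$ is an integer offset, CS2 learns neither $x_q$ nor the true quotient, while CS1 recovers $\lfloor x_q/(X/m) \rfloor$ exactly.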
Algorithm 3 SGC$(E(I_1), E(\lfloor x_q/(X/m) \rfloor), E(\lfloor y_q/(Y/m) \rfloor), pk_1, sk_1)$
Input: CS1: $E(I_1) = \{E(G_{st}) \mid s, t \in [m]\}$, $E(\lfloor x_q/(X/m) \rfloor)$, $E(\lfloor y_q/(Y/m) \rfloor)$, $pk_1$ and security parameters $\tau, \sigma'$ satisfying $2^{\tau+\sigma} < 2^{\sigma'}$, $2^{\sigma' m} < N$; CS2: $pk_1, sk_1, m$
Output: the grid $E(G_{\hat{s}\hat{t}})$ with $\hat{s} = \lfloor x_q/(X/m) \rfloor$ and $\hat{t} = \lfloor y_q/(Y/m) \rfloor$
   1: $\Delta x, \Delta y, \Gamma, M \leftarrow \emptyset$
   2: for $j = 0$ to $m-1$ do
   3:     CS1 generates random numbers $r_{xj}, r_{yj} \in \mathbb{Z}_N$ with $\tau$ bits and a random number $T \ge m$
   4:     CS1 calculates $\Delta x_j \leftarrow E(\lfloor x_q/(X/m) \rfloor) \times E(j)^{N-1} \times E(T)^{r_{xj}}$, $\Delta y_j \leftarrow E(\lfloor y_q/(Y/m) \rfloor) \times E(j)^{N-1} \times E(T)^{r_{yj}}$
   5: $\Delta x \leftarrow (\Delta x_0, \Delta x_1, \ldots, \Delta x_{m-1})$, $\Delta y \leftarrow (\Delta y_0, \Delta y_1, \ldots, \Delta y_{m-1})$
   6: for $s = 0$ to $m-1$ do
   7:     for $t = 0$ to $m-1$ do
   8:         CS1 generates a random number $r_{st}$ and calculates $E(r_{st})$
   9:         CS1 computes $E(G'_{st}) \leftarrow E(G_{st}) \times E(r_{st})$
  10: $E(G') \leftarrow (E(G'_{st}))_{0 \le s, t \le m-1}$
  11: CS1 permutes the vectors $\Delta x, \Delta y$ and the grid $E(G')$ with two random permutations $\rho_1, \rho_2$:
  12: $\Delta x' = \rho_1(\Delta x) = (\Delta x_{\rho_1(0)}, \Delta x_{\rho_1(1)}, \ldots, \Delta x_{\rho_1(m-1)})$, $\Delta y' = \rho_2(\Delta y) = (\Delta y_{\rho_2(0)}, \Delta y_{\rho_2(1)}, \ldots, \Delta y_{\rho_2(m-1)})$,
  13: $\Gamma' = \rho_2(\rho_1(E(G'))) = (E(G'_{\rho_1(s)\rho_2(t)}))_{0 \le s, t \le m-1}$
  14: CS1 packs $\Delta x', \Delta y'$, i.e., calculates $v'_x \leftarrow \prod_{i=0}^{m-1} (\Delta x_{\rho_1(i)})^{2^{\sigma'(m-(i+1))}}$, $v'_y \leftarrow \prod_{i=0}^{m-1} (\Delta y_{\rho_2(i)})^{2^{\sigma'(m-(i+1))}}$ and sends $v'_x, v'_y, \Gamma', T$ to CS2
  15: CS2 decrypts and unpacks $v'_x$ and $v'_y$: $(v_{x\rho_1(0)}, \ldots, v_{x\rho_1(m-1)}) \leftarrow D(v'_x)$, $(v_{y\rho_2(0)}, \ldots, v_{y\rho_2(m-1)}) \leftarrow D(v'_y)$
  16: for $s = 0$ to $m-1$ do
  17:     for $t = 0$ to $m-1$ do
  18:         if $v_{x\rho_1(s)} \bmod T = 0$ and $v_{y\rho_2(t)} \bmod T = 0$ then
  20:             $\Gamma \leftarrow \Gamma'_{\rho_1(s)\rho_2(t)} = E(G'_{\rho_1(s)\rho_2(t)})$, $M'_{st} \leftarrow E(1)$
  22:         else $M'_{st} \leftarrow E(0)$
  23: CS2 sends $\Gamma$ and $M' = (M'_{st})_{0 \le s, t \le m-1}$ to CS1
  24: CS1 permutes the matrix $M'$ with $\rho_1^{-1}, \rho_2^{-1}$: $M \leftarrow \rho_1^{-1}(\rho_2^{-1}(M'))$
  25: CS1 calculates $E(r) \leftarrow \prod_{s=0}^{m-1} \prod_{t=0}^{m-1} (M_{st})^{r_{st}}$
  26: CS1 gets the target grid $E(G_{\hat{s}\hat{t}})$ containing the query point: $E(G_{\hat{s}\hat{t}}) \leftarrow \Gamma \times E(r)^{N-1}$
(3)
After identifying the correct grid $E(G_{\hat{s}\hat{t}})$, cloud servers CS1 and CS2 jointly execute Algorithm 4 to search within the grid $G_{\hat{s}\hat{t}}$ in ciphertext form and locate the ciphertext $(E(id_{\min}^{(1)}), E(x_{id_{\min}}^{(1)}), E(y_{id_{\min}}^{(1)}))$ corresponding to the nearest neighbor to $Q$, where $(id_{\min}^{(1)}, P_{id_{\min}}^{(1)} = (x_{id_{\min}}^{(1)}, y_{id_{\min}}^{(1)})) \in G_{\hat{s}\hat{t}}$. It is worth noting that Algorithm 4 can easily be adapted to handle inputs that include multiple packed ciphertext datasets and multiple ciphertext $id$ values. Specifically, the input format is $(\{E(G^{(1)}), \ldots, E(G^{(\alpha)})\}, E(Q), \{E(id_1), E(id_2), \ldots, E(id_\beta)\}, pk_1, sk_1)$, and the output is the ciphertext $(E(id_{\min}), E(x_{id_{\min}}), E(y_{id_{\min}}))$, where $P_{id_{\min}} = (x_{id_{\min}}, y_{id_{\min}}) \in G^{(1)} \cup \cdots \cup G^{(\alpha)}$ is the nearest-neighbor point to $Q$, subject to the constraint $id_{\min} \notin \{id_1, id_2, \ldots, id_\beta\}$.
Algorithm 4 SNN$(E(G_{\hat{s}\hat{t}}), E(Q), E(id), pk_1, sk_1)$
Input: CS1: $E(Q) = (E(x_q), E(y_q))$, $E(G_{\hat{s}\hat{t}})$ and $E(id)$; CS2: the private key $sk_1$
Output: $(E(id_{\min}), E(x_{id_{\min}}), E(y_{id_{\min}}))$: the ciphertext of the nearest neighbor $(id_{\min}, P_{id_{\min}} = (x_{id_{\min}}, y_{id_{\min}})) \in G_{\hat{s}\hat{t}}$ to $Q$ with $id_{\min} \neq id$
   1: CS1 generates $3\lambda + 1$ random numbers $r_0, r_1, \ldots, r_{3\lambda} \in \mathbb{Z}_N$
   2: CS1 packs $r_1, \ldots, r_{3\lambda}$ into $\Phi_0 = r_1 \mid r_2 \mid \cdots \mid r_{3\lambda}$, packs $r_2, r_3, r_5, r_6, \ldots, r_{3\lambda-1}, r_{3\lambda}$ into $\Phi_1 = r_2 \mid r_3 \mid \cdots \mid r_{3\lambda-1} \mid r_{3\lambda}$, and packs $r_1, r_4, r_7, \ldots, r_{3\lambda-2}$ into $\Phi_2 = r_1 \mid r_4 \mid \cdots \mid r_{3\lambda-2}$
   3: CS1 calculates $v'_0 \leftarrow (E(G_{\hat{s}\hat{t}}) \times E(\Phi_0))^{r_0}$
   4: CS1 calculates $E(Q') \leftarrow E(x_q \mid y_q \mid x_q \mid y_q \mid \cdots \mid x_q \mid y_q) = E(x_q)^{2^{\sigma(2\lambda-1)}} \times E(y_q)^{2^{\sigma(2\lambda-2)}} \times \cdots \times E(x_q)^{2^{\sigma}} \times E(y_q)^{2^{0}}$
   5: CS1 calculates $v'_1 \leftarrow (E(Q') \times E(\Phi_1))^{r_0}$
   6: CS1 calculates $E(id') \leftarrow E(id \mid id \mid \cdots \mid id) = E(id)^{2^{\sigma(\lambda-1)}} \times E(id)^{2^{\sigma(\lambda-2)}} \times \cdots \times E(id)^{2^{\sigma}} \times E(id)^{2^{0}}$
   7: CS1 calculates $v'_2 \leftarrow (E(id') \times E(\Phi_2))^{r_0}$
   8: CS1 sends $v'_0, v'_1, v'_2$ to CS2
   9: CS2 decrypts $v_0 \leftarrow D(v'_0)$, $v_1 \leftarrow D(v'_1)$ and $v_2 \leftarrow D(v'_2)$ with
  10:     $v_0 = r_0(i_1^{(st)} + r_1)2^{\sigma(3\lambda-1)} + r_0(x_{i_1}^{(st)} + r_2)2^{\sigma(3\lambda-2)} + r_0(y_{i_1}^{(st)} + r_3)2^{\sigma(3\lambda-3)} + \cdots + r_0(y_{i_\lambda}^{(st)} + r_{3\lambda})2^{0}$,
  11:     $v_1 = r_0(r_2 + x_q)2^{\sigma(2\lambda-1)} + r_0(r_3 + y_q)2^{\sigma(2\lambda-2)} + \cdots + r_0(y_q + r_{3\lambda})2^{0}$,
  12:     $v_2 = r_0(id + r_1)2^{\sigma(\lambda-1)} + r_0(id + r_4)2^{\sigma(\lambda-2)} + \cdots + r_0(id + r_{3\lambda-2})2^{0}$
  13: CS2 unpacks $v_0, v_1, v_2$ to get $(\mathbf{ID}, \mathbf{P})$, $\mathbf{Q}$ and $\mathbf{id}$, i.e.,
  14:     $\mathbf{ID} = \big(r_0(i_1^{(st)} + r_1), r_0(i_2^{(st)} + r_4), r_0(i_3^{(st)} + r_7), \ldots, r_0(i_\lambda^{(st)} + r_{3\lambda-2})\big)$,
  15:     $\mathbf{P} = \big(r_0(x_{i_1}^{(st)} + r_2), r_0(y_{i_1}^{(st)} + r_3), r_0(x_{i_2}^{(st)} + r_5), r_0(y_{i_2}^{(st)} + r_6), \ldots,$
  16:              $r_0(x_{i_\lambda}^{(st)} + r_{3\lambda-1}), r_0(y_{i_\lambda}^{(st)} + r_{3\lambda})\big)$,
  17:     $\mathbf{Q} = \big(r_0(x_q + r_2), r_0(y_q + r_3), r_0(x_q + r_5), r_0(y_q + r_6), \ldots, r_0(x_q + r_{3\lambda-1}), r_0(y_q + r_{3\lambda})\big)$,
  18:     $\mathbf{id} = \big(r_0(id + r_1), r_0(id + r_4), r_0(id + r_7), \ldots, r_0(id + r_{3\lambda-2})\big)$
  19: CS2 initializes $d_{\min} \leftarrow +\infty$ and the empty variables $d[\,]$, $\delta[\,]$, $pos$
  20: for $\ell = 0$ to $\lambda - 1$ do
  21:     $d[\ell] \leftarrow (\mathbf{P}[\ell].x - \mathbf{Q}[\ell].x)^2 + (\mathbf{P}[\ell].y - \mathbf{Q}[\ell].y)^2$
  22:     if $d[\ell] < d_{\min}$ and $\mathbf{ID}[\ell] \neq \mathbf{id}[\ell]$ then
  24:         $pos \leftarrow \ell$
  25:         $d_{\min} \leftarrow d[pos]$
  26: $\delta[pos] \leftarrow E(1)$ and $\forall j \neq pos$, $\delta[j] \leftarrow E(0)$
  27: CS2 sends $\delta, E(\mathbf{ID}[pos]), E(\mathbf{P}[pos].x), E(\mathbf{P}[pos].y)$ to CS1
  28: CS1 calculates $E(r_{id}) \leftarrow \prod_{i=0}^{\lambda-1} \delta[i]^{r_{3i+1}}$, $E(r_x) \leftarrow \prod_{i=0}^{\lambda-1} \delta[i]^{r_{3i+2}}$ and $E(r_y) \leftarrow \prod_{i=0}^{\lambda-1} \delta[i]^{r_{3i+3}}$
  29: CS1 calculates $E(id_{\min}) \leftarrow E(\mathbf{ID}[pos])^{r_0^{-1}} \times E(r_{id})^{N-1}$, $E(x_{id_{\min}}) \leftarrow E(\mathbf{P}[pos].x)^{r_0^{-1}} \times E(r_x)^{N-1}$, $E(y_{id_{\min}}) \leftarrow E(\mathbf{P}[pos].y)^{r_0^{-1}} \times E(r_y)^{N-1}$
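The reason CS2 can pick the nearest point from blinded values alone is that the additive masks cancel in the coordinate differences, leaving every squared distance scaled by the same factor $r_0^2$. The following plaintext-level sketch illustrates this for the multi-identifier variant described above; it performs no encryption or packing, and the function name `blinded_nearest` and the parameter `bits` are hypothetical.

```python
import random

def blinded_nearest(points, q, excluded_ids, bits=16):
    """points: list of (id, x, y). Returns the list position CS2 would select,
    using only the blinded values it would see after decryption and unpacking."""
    r0 = random.randrange(1, 1 << bits)            # common multiplicative mask
    best_pos, best_d = None, None
    for pos, (pid, x, y) in enumerate(points):
        r_id, r_x, r_y = (random.randrange(1 << bits) for _ in range(3))
        # Blinded slots: the same additive mask is applied to the point and to the query.
        ID, Px, Py = r0 * (pid + r_id), r0 * (x + r_x), r0 * (y + r_y)
        Qx, Qy = r0 * (q[0] + r_x), r0 * (q[1] + r_y)
        blinded_excluded = [r0 * (e + r_id) for e in excluded_ids]
        d = (Px - Qx) ** 2 + (Py - Qy) ** 2        # equals r0^2 * true squared distance
        if ID in blinded_excluded:                 # blinded ids match iff the true ids match
            continue
        if best_d is None or d < best_d:
            best_pos, best_d = pos, d
    return best_pos

grid = [(0, 1, 1), (1, 5, 5), (2, 2, 2)]
first = blinded_nearest(grid, (2, 3), [])          # nearest neighbor overall
second = blinded_nearest(grid, (2, 3), [2])        # nearest neighbor excluding id 2
```

Since the scaling factor is identical for all slots, the order of the blinded distances matches the order of the true distances, while the individual coordinates stay hidden from CS2.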
(4)
According to Lemma 2, the second-nearest neighbor to $Q$ resides in the set $VRV(P_{id_{\min}}^{(1)})$. Therefore, using $E(id_{\min}^{(1)})$, cloud servers CS1 and CS2 collaboratively execute Algorithm 5 to search $E(I_2)$ and retrieve $(E(VRV(P_{id_{\min}}^{(1)})), Sig(P_{id_{\min}}^{(1)}))$. These results are used to find the second-nearest neighbor and to verify the correctness of $P_{id_{\min}}^{(1)}$.
Algorithm 5 SCR$(E(I_2), E(id_{\min}), pk_1, sk_1)$
Input: CS1: $E(I_2)$, $E(id_{\min})$ and $pk_1$; CS2: the public–private key pair $(pk_1, sk_1)$
Output: $E(VRV(P_{id_{\min}}))$ and $E(Sig(P_{id_{\min}}))$ in $E(I_2)$
   1: for each $B_j \in I_2$ (i.e., $j = 0$ to $\lceil n/w \rceil - 1$) do
   2:     CS1 generates random numbers $r_{0j}, r_{1j} \in \mathbb{Z}_N$ with $\tau$ bits
   3:     CS1 calculates $E(\eta_{Mj}) \leftarrow (E((j+1)w - 1) \times E(r_{0j}))^{r_{1j}}$, $E(\eta_j) \leftarrow (E(id_{\min}) \times E(r_{0j}))^{r_{1j}}$, $E(\eta_{mj}) \leftarrow (E(jw) \times E(r_{0j}))^{r_{1j}}$
   4:     for $k = 0$ to $w - 1$ do
   5:         CS1 generates three packing random numbers $r_{(jw+k)0}, r_{(jw+k)1}, r_{(jw+k)2} \in \mathbb{Z}_N$
   6:         CS1 calculates $\Psi_{(jw+k)0} = (E(jw+k) \times E(id_{\min})^{N-1})^{r_{(jw+k)0}}$,
   7:         $\Psi_{(jw+k)1} = E(VRV(P_{jw+k})) \times E(r_{(jw+k)1})$, $\Psi_{(jw+k)2} = E(Sig(P_{jw+k})) \times E(r_{(jw+k)2})$
   8:     $\Psi_{j0} \leftarrow (\Psi_{(jw+k)0})_{0 \le k \le w-1}$, $\Psi_{j1} \leftarrow (\Psi_{(jw+k)1})_{0 \le k \le w-1}$, $\Psi_{j2} \leftarrow (\Psi_{(jw+k)2})_{0 \le k \le w-1}$
   9: CS1 calculates $\eta_M \leftarrow (E(\eta_{M0}), \ldots, E(\eta_{M(\lceil n/w \rceil - 1)}))$, $\eta_m \leftarrow (E(\eta_{m0}), \ldots, E(\eta_{m(\lceil n/w \rceil - 1)}))$, $\eta \leftarrow (E(\eta_0), \ldots, E(\eta_{\lceil n/w \rceil - 1}))$
  10: CS1 calculates $\Psi_0 \leftarrow (\Psi_{00}, \ldots, \Psi_{\lceil n/w \rceil - 1, 0}) = (\Psi_{(jw+k)0})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1}$,
  11:     $\Psi_1 \leftarrow (\Psi_{01}, \ldots, \Psi_{\lceil n/w \rceil - 1, 1}) = (\Psi_{(jw+k)1})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1}$,
  12:     $\Psi_2 \leftarrow (\Psi_{02}, \ldots, \Psi_{\lceil n/w \rceil - 1, 2}) = (\Psi_{(jw+k)2})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1}$, $B \leftarrow (\Psi_0, \Psi_1, \Psi_2)$
  13: CS1 permutes $\eta_M, \eta_m, \eta$ and the buckets $B$ with two random permutations $\rho_1, \rho_2$:
  14: $\eta'_M \leftarrow \rho_1(\eta_M)$, $\eta'_m \leftarrow \rho_1(\eta_m)$, $\eta' = \rho_1(\eta)$,
  15: $B' \leftarrow \rho_2(\rho_1(B)) = (\Psi'_0, \Psi'_1, \Psi'_2) = \big((\Psi_{\rho_1(0)0}, \ldots, \Psi_{\rho_1(\lceil n/w \rceil - 1), 0}), (\Psi_{\rho_1(0)1}, \ldots, \Psi_{\rho_1(\lceil n/w \rceil - 1), 1}),$
  16: $(\Psi_{\rho_1(0)2}, \ldots, \Psi_{\rho_1(\lceil n/w \rceil - 1), 2})\big) = \big((\Psi_{(\rho_1(j)w + \rho_2(k))0})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1},$
  17: $(\Psi_{(\rho_1(j)w + \rho_2(k))1})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1}, (\Psi_{(\rho_1(j)w + \rho_2(k))2})_{0 \le j \le \lceil n/w \rceil - 1,\, 0 \le k \le w-1}\big)$
  18: CS1 sends $\eta'_M, \eta'_m, \eta', B'$ to CS2
  19: CS2 decrypts $\eta'_M, \eta'_m, \eta'$: $(\eta_{M\rho_1(0)}, \ldots, \eta_{M\rho_1(\lceil n/w \rceil - 1)}) \leftarrow D(\eta'_M)$,
  20: $(\eta_{m\rho_1(0)}, \ldots, \eta_{m\rho_1(\lceil n/w \rceil - 1)}) \leftarrow D(\eta'_m)$, $(\eta_{\rho_1(0)}, \ldots, \eta_{\rho_1(\lceil n/w \rceil - 1)}) \leftarrow D(\eta')$
  21: for $j = 0$ to $\lceil n/w \rceil - 1$ do
  22:     if $(\eta_{M\rho_1(j)} - \eta_{\rho_1(j)}) \ge 0$ and $(\eta_{m\rho_1(j)} - \eta_{\rho_1(j)}) \le 0$ then
  23:         CS2 decrypts $\Psi_{\rho_1(j)0}$: $(\psi_{(\rho_1(j)w + \rho_2(0))0}, \ldots, \psi_{(\rho_1(j)w + \rho_2(w-1))0}) \leftarrow D(\Psi_{\rho_1(j)0})$
  24:         for $k = 0$ to $w - 1$ do
  25:             if $\psi_{(\rho_1(j)w + \rho_2(k))0} = 0$ then
  26:                 $\Theta \leftarrow (\Psi_{(\rho_1(j)w + \rho_2(k))1}, \Psi_{(\rho_1(j)w + \rho_2(k))2})$, $M'_{jk} \leftarrow E(1)$
  27:             else $M'_{jk} \leftarrow E(0)$
  28: CS2 sends $\Theta, M'$ to CS1
  29: CS1 permutes the matrix $M'$ with $\rho_1^{-1}, \rho_2^{-1}$: $M \leftarrow \rho_1^{-1}(\rho_2^{-1}(M'))$
  30: CS1 calculates $E(\psi^{(1)}) \leftarrow \prod_{j=0}^{\lceil n/w \rceil - 1} \prod_{k=0}^{w-1} (M_{jk})^{r_{(jw+k)1}}$ and $E(\psi^{(2)}) \leftarrow \prod_{j=0}^{\lceil n/w \rceil - 1} \prod_{k=0}^{w-1} (M_{jk})^{r_{(jw+k)2}}$
  31: CS1 calculates $E(VRV(P_{id_{\min}})) \leftarrow \Theta_1 \times E(\psi^{(1)})^{N-1}$ and $E(Sig(P_{id_{\min}})) \leftarrow \Theta_2 \times E(\psi^{(2)})^{N-1}$
(5)
Similarly, the $j$th ($j \ge 2$) nearest neighbor to $Q$ is in the set $VRV(P_{id_{\min}}^{(1)}) \cup VRV(P_{id_{\min}}^{(2)}) \cup \cdots \cup VRV(P_{id_{\min}}^{(j-1)})$. Thus, for $j = 2$ to $k$, cloud servers CS1 and CS2 iteratively perform the following operations. First, they jointly invoke Algorithm 4 to traverse $E(VRV(P_{id_{\min}}^{(1)})) \cup \cdots \cup E(VRV(P_{id_{\min}}^{(j-1)}))$ and find the ciphertext $(E(id_{\min}^{(j)}), E(x_{id_{\min}}^{(j)}), E(y_{id_{\min}}^{(j)}))$ of the $j$th nearest neighbor to $Q$. Then, they jointly perform Algorithm 5 to search $E(I_2)$ in ciphertext form and find $(E(VRV(P_{id_{\min}}^{(j)})), Sig(P_{id_{\min}}^{(j)}))$, which is used to find the $(j+1)$th nearest neighbor and to verify the correctness of $P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)})$.
(6)
Through the above five steps, CS1 obtains the encrypted query result and the encrypted verification information
$$\{(E(id_{\min}^{(j)}), E(x_{id_{\min}}^{(j)}), E(y_{id_{\min}}^{(j)})) \mid j = 1, \ldots, k\}, \qquad \{(E(VRV(P_{id_{\min}}^{(j)})), Sig(P_{id_{\min}}^{(j)})) \mid j = 1, \ldots, k\}.$$
However, without the secret key $sk_1$, the QU cannot recover the plaintext of the query result and the verification information. Therefore, CS1 must rely on the private key held by CS2 to help the QU obtain the plaintext. To this end, using the homomorphic property, CS1 adds random numbers to blind the encrypted query result and the verification information and sends them to CS2. Finally, CS1 returns the sets $R_1$ and $VO_1$ of random numbers to the QU, while CS2 decrypts the blinded results sent from CS1 and returns the decrypted sets $R_2$ and $VO_2$ to the QU.
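The iterative expansion in steps (4) and (5) has a simple plaintext analogue, sketched below without any encryption. It assumes the Voronoi-relevant neighbor lists are already available as a dictionary and that the first nearest neighbor has been found; the function name `knn_via_neighbors` and the toy collinear dataset, whose neighbor lists are written by hand, are illustrative assumptions.

```python
def knn_via_neighbors(points, vrv, q, first_id, k):
    """points: {id: (x, y)}; vrv: {id: [neighbor ids]};
    first_id: id of the nearest neighbor of q. Returns the ids of the k nearest points."""
    def dist2(i):
        x, y = points[i]
        return (x - q[0]) ** 2 + (y - q[1]) ** 2

    result = [first_id]
    candidates = set(vrv[first_id])
    while len(result) < k:
        remaining = candidates - set(result)   # exclude identifiers already returned
        if not remaining:
            break
        nxt = min(remaining, key=lambda i: (dist2(i), i))
        result.append(nxt)
        candidates |= set(vrv[nxt])            # add the neighbors of the new result
    return result

# Collinear toy data: each point's Voronoi neighbors are its adjacent points.
points = {0: (0, 0), 1: (1, 0), 2: (3, 0), 3: (6, 0), 4: (10, 0)}
vrv = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nearest_three = knn_via_neighbors(points, vrv, (2.6, 0), first_id=2, k=3)
```

In the protocol, the `min` over the candidate set corresponds to one invocation of Algorithm 4, and the lookup `vrv[nxt]` corresponds to one invocation of Algorithm 5.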

4.2.6. QU Verification and Decryption Stage: Verify and ResDec

After receiving $R_1, VO_1$ from CS1 and $R_2, VO_2$ from CS2, the QU performs Algorithm 6 to verify the correctness of the query result. If the verification algorithm returns $True$, the QU treats the set $R$ in Equation (2) as the final query result. Otherwise, the QU rejects the result.
Algorithm 6 Verify$(R_1, VO_1, R_2, VO_2, pk_2)$
Input: $R_1, VO_1, R_2, VO_2$.
Output: $True$ or $False$
   1: for $j = 1$ to $k$
   2:     Calculate the difference between the $j$th element in $R_2$ and that in $R_1$ and let
          $R = \{({x'}_{id_{\min}}^{(j)}, {y'}_{id_{\min}}^{(j)}) - (r_2^{(j)}, r_3^{(j)}) \mid j = 1, \ldots, k\} = \{P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)}) \mid j = 1, \ldots, k\}$
   3:     Calculate the difference between the $j$th element in $VO_2$ and that in $VO_1$ and let
          $VO = \{({P'}_{id_1}^{(j)}, \ldots, {P'}_{id_L}^{(j)}, Sig'(P_{id_{\min}}^{(j)})) - ((r_{4,2}^{(j)}, r_{4,3}^{(j)}), \ldots, (r_{4,3L-1}^{(j)}, r_{4,3L}^{(j)}), r_5^{(j)}) \mid j = 1, \ldots, k\} = \{(P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}, Sig(P_{id_{\min}}^{(j)})) \mid j = 1, \ldots, k\}$
   4:     For the message $P_{id_{\min}}^{(j)} \,\|\, P_{id_1}^{(j)} \,\|\, \cdots \,\|\, P_{id_L}^{(j)}$ and its signature $Sig(P_{id_{\min}}^{(j)}) = \mathsf{Sig}(P_{id_{\min}}^{(j)} \,\|\, P_{id_1}^{(j)} \,\|\, \cdots \,\|\, P_{id_L}^{(j)})$, invoke the verification algorithm $\mathsf{Ver}$ of DSA to check the signature.
   5:     if $\{0\} \leftarrow \mathsf{Ver}(P_{id_{\min}}^{(j)} \,\|\, P_{id_1}^{(j)} \,\|\, \cdots \,\|\, P_{id_L}^{(j)}, Sig(P_{id_{\min}}^{(j)}))$, return $False$
   6: For the point $P_{id_{\min}}^{(1)}$, calculate the distance $dist(P_{id_{\min}}^{(1)}, Q)^2 = (x_{id_{\min}}^{(1)} - x_q)^2 + (y_{id_{\min}}^{(1)} - y_q)^2$
   7: Calculate $\mathrm{MIN}_1 = \min\{dist(P_{id_1}^{(1)}, Q)^2, \ldots, dist(P_{id_L}^{(1)}, Q)^2\}$
   8: if $dist(P_{id_{\min}}^{(1)}, Q)^2 > \mathrm{MIN}_1$, return $False$
   9: for $j = 2$ to $k$
  10:     Initialize $\mathrm{MIN}_j \leftarrow +\infty$
  11:     for $v = 1$ to $j - 1$
  12:         for $u = 1$ to $L$
  13:             if $P_{id_u}^{(v)} \notin \{P_{id_{\min}}^{(1)}, \ldots, P_{id_{\min}}^{(j-1)}\}$ and $dist(P_{id_u}^{(v)}, Q)^2 < \mathrm{MIN}_j$ then
  14:                 $\mathrm{MIN}_j \leftarrow dist(P_{id_u}^{(v)}, Q)^2$
  15:     if $dist(P_{id_{\min}}^{(j)}, Q)^2 > \mathrm{MIN}_j$, return $False$
  16: return $True$
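The checks in Algorithm 6 can be sketched on already unblinded plaintext values as follows. This is a simplified illustration rather than the paper's implementation: the function name `verify_knn` is hypothetical, padding points are ignored, and the DSA verification is replaced by a caller-supplied callable `verify_sig`, an assumed stand-in, so that the example stays self-contained.

```python
import math

def verify_knn(q, results, vos, verify_sig):
    """results: list of k points (x, y), nearest first.
    vos: list of (neighbor_points, signature), one entry per result.
    verify_sig(message_points, signature) -> bool stands in for DSA verification."""
    def dist2(p):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Signature check: binds each returned point to its signed Voronoi neighbors.
    for p, (neighbors, sig) in zip(results, vos):
        if not verify_sig([p] + list(neighbors), sig):
            return False
    # The first result must be at least as close to q as each of its own neighbors.
    if dist2(results[0]) > min(dist2(p) for p in vos[0][0]):
        return False
    # The j-th result must be at least as close as every not-yet-returned
    # neighbor of the earlier results.
    for j in range(1, len(results)):
        seen = set(results[:j])
        pool = [p for neighbors, _ in vos[:j] for p in neighbors if p not in seen]
        min_j = min((dist2(p) for p in pool), default=math.inf)
        if dist2(results[j]) > min_j:
            return False
    return True

accept_all = lambda message, sig: True   # placeholder; a real QU would verify with verk
honest = [(3, 0), (1, 0), (0, 0)]
honest_vo = [([(1, 0), (6, 0)], None), ([(0, 0), (3, 0)], None), ([(1, 0)], None)]
ok = verify_knn((2.6, 0), honest, honest_vo, accept_all)
```

A result list that skips a closer neighbor fails the distance check, and a list containing a point that was never signed by the DO fails the signature check when a real verifier is supplied.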

5. Correctness and Security Analysis

5.1. Correctness Analysis

In this section, we analyze the correctness of our proposed protocol. First, we prove the correctness of each algorithm.
Lemma 3.
Algorithm 2 is correct. That is, for any valid input $E(Q) = (E(x_q), E(y_q))$, $(E(X/m), E(Y/m))$, and $sk_1$, CS1 can indeed obtain the ciphertexts $E(\lfloor x_q/(X/m) \rfloor)$ and $E(\lfloor y_q/(Y/m) \rfloor)$.
Proof. 
See Appendix B.1. □
Lemma 4.
Algorithm 3 is correct. That is, for any valid input $E(G_{st})$, $E(\lfloor x_q/(X/m) \rfloor)$, $E(\lfloor y_q/(Y/m) \rfloor)$, $pk_1$, and $sk_1$, CS1 can indeed obtain the target grid $E(G_{st})$ with $(s, t) = (\lfloor x_q/(X/m) \rfloor, \lfloor y_q/(Y/m) \rfloor)$.
Proof. 
See Appendix B.2. □
Lemma 5.
Algorithm 4 is correct. That is, for any valid input $E(Q) = (E(x_q), E(y_q))$, $E(G_{st})$, $pk_1$, and $sk_1$, CS1 can indeed obtain the ciphertext $(E(id_{\min}), E(x_{id_{\min}}), E(y_{id_{\min}}))$ such that $P_{id_{\min}} = (x_{id_{\min}}, y_{id_{\min}}) \in G_{st}$ is the nearest-neighbor point to $Q$ with $id_{\min} \neq id$.
Proof. 
See Appendix B.3. □
Lemma 6.
Algorithm 5 is correct. That is, for any valid input $E(I_2)$, $E(id_{\min})$, and $sk_1$, CS1 can indeed obtain $E(VRV(P_{id_{\min}}))$ and $Sig(P_{id_{\min}})$ in $E(I_2)$.
Proof. 
See Appendix B.4. □
Lemma 7.
If CS1 and CS2 faithfully execute Algorithm 1, then, for any $1 \le j \le k$, the difference between the $j$th element in $R_2$ (resp. $VO_2$) and that in $R_1$ (resp. $VO_1$) satisfies
$$({x'}_{id_{\min}}^{(j)}, {y'}_{id_{\min}}^{(j)}) - (r_2^{(j)}, r_3^{(j)}) = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)}),$$
$$\text{resp.}\quad ({P'}_{id_1}^{(j)}, \ldots, {P'}_{id_L}^{(j)}, Sig'(P_{id_{\min}}^{(j)})) - ((r_{4,2}^{(j)}, r_{4,3}^{(j)}), \ldots, (r_{4,3L-1}^{(j)}, r_{4,3L}^{(j)}), r_5^{(j)}) = (P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}, Sig(P_{id_{\min}}^{(j)})),$$
where primed symbols denote the blinded values returned by CS2, $P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)})$ is exactly the $j$th nearest neighbor to $Q$, $\{P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}\}$ is the set of Voronoi-relevant vectors of $P_{id_{\min}}^{(j)}$, and $Sig(P_{id_{\min}}^{(j)})$ is the signature of $P_{id_{\min}}^{(j)}$.
Building on the foundation provided by the aforementioned lemmas, we are able to establish the correctness of our protocol.
Theorem 1.
According to Definition 1, our protocol is correct. That is, if the cloud servers are honest, an honest QU can get the exact k-nearest neighbors to the query point Q.
Proof. 
As outlined in Definition 1, establishing correctness merely requires demonstrating that Algorithm 6 returns $True$ and that $P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)}) \in R$ is exactly the $j$th nearest neighbor to $Q$. In fact, by Lemma 7, the set $R$ (resp. $VO$) in Algorithm 6 is
$$R = \{(x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)}) \mid j = 1, \ldots, k\} \quad (\text{resp. } VO = \{(P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}, Sig(P_{id_{\min}}^{(j)})) \mid j = 1, \ldots, k\}),$$
where $P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)})$ is indeed the $j$th nearest neighbor to $Q$, $\{P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}\}$ is the set of Voronoi-relevant vectors of $P_{id_{\min}}^{(j)}$, and $Sig(P_{id_{\min}}^{(j)})$ is the signature of $P_{id_{\min}}^{(j)}$. Therefore, in Step 5, the signature $Sig(P_{id_{\min}}^{(j)})$ of the message $P_{id_{\min}}^{(j)} \,\|\, P_{id_1}^{(j)} \,\|\, \cdots \,\|\, P_{id_L}^{(j)}$ will pass the verification algorithm. Moreover, in conjunction with Lemma 2, we know that, after the for-loop in Step 9, $P_{id_{\min}}^{(j)} = (x_{id_{\min}}^{(j)}, y_{id_{\min}}^{(j)})$ will pass the distance verification. That is, Algorithm 6 will return $True$ and $R$ is correct. □

5.2. Public Verifiability

Theorem 2.
If the hash functions $H(\cdot)$ and the signature algorithm are secure, then our protocol achieves public verifiability, as defined in Definition 2. That is, the advantage
$$\mathrm{Adv}_{\mathcal{A}}^{PV}[\Pi, \lambda, n] = \Pr\left[\mathrm{Exp}_{\mathcal{A}}^{PV}[\Pi, \lambda, n] = 1\right]$$
that the adversary $\mathcal{A}$ obtains in the experiment $\mathrm{Exp}_{\mathcal{A}}^{PV}[\Pi, \lambda, n]$ is negligible.
Proof. 
According to Definition 2, we only need to analyze the probability of the event that the experiment $\mathrm{Exp}_{\mathcal{A}}^{PV}[\Pi, \lambda, n]$ outputs 1, which means that $\{True\} \leftarrow \mathrm{Verify}(Q, R, VO, PK)$ in Algorithm 6 and $\{\gamma \neq R\} \leftarrow \mathrm{ResDec}(Q, R, SK, \delta)$.
In essence, the event $\{True\} \leftarrow \mathrm{Verify}(Q, R, VO, PK)$ entails two conditions:
(1)
In Step 4, $Sig(P_{id_{\min}}^{(j)})$ is the correct signature of the message $P_{id_{\min}}^{(j)} \,\|\, P_{id_1}^{(j)} \,\|\, \cdots \,\|\, P_{id_L}^{(j)}$.
(2)
In Step 15, $dist(P_{id_{\min}}^{(j)}, Q) \le \mathrm{MIN}_j$.
Furthermore, condition (1) implies that $P_{id_{\min}}^{(j)}, P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}$ are data points from the DO and that $\{P_{id_1}^{(j)}, \ldots, P_{id_L}^{(j)}\}$ are the Voronoi-relevant vectors of $P_{id_{\min}}^{(j)}$. Condition (2) implies that $P_{id_{\min}}^{(j)}$ is the $j$th nearest neighbor of $Q$. Consequently, the output of the decryption algorithm yields $\gamma = R$. In simpler terms, $\mathrm{Adv}_{\mathcal{A}}^{PV}[\Pi, \lambda, n] = 0$. □

5.3. Privacy

For privacy analysis, we leverage the formal definition of multiparty computation, as outlined in [39,40], within the framework of the simulation paradigm [41]. The overarching concept is described below.
Theorem 3.
(Composition Theorem [41]) Given a protocol Ω composed of several sub-protocols, if all sub-protocols are secure and all intermediate results are either random or pseudorandom, then the protocol Ω is secure.
In the simulation paradigm, it is essential that the perspective of each participating party in a protocol can be replicated based solely on its input and output. This requirement ensures that parties do not gain any additional information from the protocol beyond what their inputs and outputs imply. In other words, the simulated view of each sub-protocol must be computationally indistinguishable from the actual execution view. To illustrate this concept, we formally demonstrate the security of the SDC protocol (Algorithm 2). Although we focus on the SDC protocol here, other protocols can be elucidated in a similar manner.
Theorem 4.
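If the hash functions $H(\cdot)$ and Paillier's homomorphic cryptosystem are secure, then the SDC protocol is secure. That is, for any probabilistic polynomial-time adversary $\mathcal{A}$, there exists a simulator $\mathcal{S}$ such that the difference between $\Pr(\mathrm{Real}_{\mathrm{SDC}}^{\mathcal{A}})$ and $\Pr(\mathrm{Sim}_{\mathrm{SDC}}^{\mathcal{A}})$ is negligible, i.e.,
$$\left|\Pr(\mathrm{Real}_{\mathrm{SDC}}^{\mathcal{A}}) - \Pr(\mathrm{Sim}_{\mathrm{SDC}}^{\mathcal{A}})\right| \le \mathrm{negl}(\lambda).$$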
Proof. 
We first define the real view $\mathrm{Real}_{\mathrm{SDC}}^{\mathcal{A}}$ and the simulated view $\mathrm{Sim}_{\mathrm{SDC}}^{\mathcal{A}}$.
$\mathrm{Real}_{\mathrm{SDC}}^{\mathcal{A}}$: The inputs are $E(Q) = (E(x_q), E(y_q))$, $(E(X/m), E(Y/m))$, and a security parameter $\tau < \kappa - \frac{\sigma}{2} - 1$. CS1 first generates two random numbers $r_1, r_2 \in \mathbb{Z}_N$ and then calculates $x' = E(x_q)^{r_1} \times E(X/m)^{r_1 r_2}$, $u' = E(X/m)^{r_1}$, $y' = E(y_q)^{r_1} \times E(Y/m)^{r_1 r_2}$, $v' = E(Y/m)^{r_1}$. Then, it sends $(x', y')$ and $(u', v')$ to CS2. CS2 first decrypts them as $(x, y) = (D(x'), D(y'))$ and $(u, v) = (D(u'), D(v'))$ and calculates the quotients $d_x = \lfloor x/u \rfloor$ and $d_y = \lfloor y/v \rfloor$. Then, CS2 encrypts $d'_x = E(d_x)$, $d'_y = E(d_y)$ and sends $(d'_x, d'_y)$ to CS1. Finally, CS1 calculates the quotients $E(\lfloor x_q/(X/m) \rfloor) = d'_x \times E(r_2)^{N-1}$ and $E(\lfloor y_q/(Y/m) \rfloor) = d'_y \times E(r_2)^{N-1}$.
$\mathrm{Sim}_{\mathrm{SDC}}^{\mathcal{A}}$: The simulation contains two simulators $\{\mathcal{S}_1, \mathcal{S}_2\}$. $\mathcal{S}_1$ first chooses four random numbers $\bar{x}_q, \bar{y}_q, \overline{X/m}, \overline{Y/m} \in \mathbb{Z}_N$ and calculates $\bar{E}(Q) = (E(\bar{x}_q), E(\bar{y}_q))$ and $(E(\overline{X/m}), E(\overline{Y/m}))$. Then, it generates two random numbers $r_1, r_2 \in \mathbb{Z}_N$ and calculates $(\bar{x}'_1, \bar{u}'_1, \bar{y}'_1, \bar{v}'_1) = (E(\bar{x}_q)^{r_1} \times E(\overline{X/m})^{r_1 r_2}, E(\overline{X/m})^{r_1}, E(\bar{y}_q)^{r_1} \times E(\overline{Y/m})^{r_1 r_2}, E(\overline{Y/m})^{r_1}) \in \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2}$. Finally, it sends $(\bar{x}'_1, \bar{u}'_1, \bar{y}'_1, \bar{v}'_1)$ to $\mathcal{S}_2$. $\mathcal{S}_2$ first chooses four random numbers $(\bar{x}'_2, \bar{u}'_2, \bar{y}'_2, \bar{v}'_2) \in \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2} \times \mathbb{Z}_{N^2}$ and decrypts them as $(\bar{x}_2, \bar{y}_2, \bar{u}_2, \bar{v}_2) = (D(\bar{x}'_2), D(\bar{y}'_2), D(\bar{u}'_2), D(\bar{v}'_2)) \in \mathbb{Z}_N \times \mathbb{Z}_N \times \mathbb{Z}_N \times \mathbb{Z}_N$. Then, $\mathcal{S}_2$ calculates the quotients $\bar{d}_x = \lfloor \bar{x}_2/\bar{u}_2 \rfloor$ and $\bar{d}_y = \lfloor \bar{y}_2/\bar{v}_2 \rfloor$ and sends $(\bar{d}'_x, \bar{d}'_y) = (E(\bar{d}_x), E(\bar{d}_y))$ to $\mathcal{S}_1$. Finally, $\mathcal{S}_1$ calculates the quotients $E(\lfloor \bar{x}_q/(X/m) \rfloor) = \bar{d}'_x \times E(r_2)^{N-1}$ and $E(\lfloor \bar{y}_q/(Y/m) \rfloor) = \bar{d}'_y \times E(r_2)^{N-1}$.
Since Paillier's homomorphic cryptosystem is semantically secure, for any two valid plaintexts $x^{(0)}$ and $x^{(1)}$, no probabilistic polynomial-time (PPT) adversary can distinguish their ciphertexts $E(x^{(0)})$ and $E(x^{(1)})$. That is, $(\bar{x}_2, \bar{y}_2, \bar{u}_2, \bar{v}_2) = (D(\bar{x}'_2), D(\bar{y}'_2), D(\bar{u}'_2), D(\bar{v}'_2))$ in the simulated view is computationally indistinguishable from $(x, y, u, v)$ in the actual view. Similarly, $(\bar{x}'_2, \bar{y}'_2, \bar{u}'_2, \bar{v}'_2)$ is computationally indistinguishable from $(x', y', u', v')$. Therefore, the output distribution of the simulated view and that of the real view are computationally indistinguishable. In other words, the adversary cannot trace back to the corresponding data records, which preserves the privacy of the dataset and access patterns. To sum up, the SDC protocol is secure. □
Similarly, we can prove that the kNN (Algorithm 1), SGC (Algorithm 3), SNN (Algorithm 4) and SCR (Algorithm 5) protocols are secure under our security model; thus, according to the composition theorem, we can obtain the following theorem.
Theorem 5.
If the hash functions H ( · ) and Paillier’s homomorphic cryptosystem are secure, our protocol achieves dataset privacy, query data privacy, query result privacy, and access pattern privacy.
Remark 2.
For clarity, we use access pattern privacy as an example and provide an intuitive explanation. That is, the identifiers corresponding to the k-nearest neighbors of the query point are kept confidential from both the cloud servers and the querier QU. In fact, during the kNN search process (as described in Algorithm 1), cloud server $\mathrm{CS}_1$ receives the encrypted data $\{(E(id_{\min}^{(j)}), E(x_{id_{\min}^{(j)}}), E(y_{id_{\min}^{(j)}})) \mid j = 1, \ldots, k\}$, while cloud server $\mathrm{CS}_2$ receives the blinded data $\{(id_{\min}^{\prime(j)}, x'_{id_{\min}^{(j)}}, y'_{id_{\min}^{(j)}}) \mid j = 1, \ldots, k\}$. Here, $(id_{\min}^{\prime(j)}, x'_{id_{\min}^{(j)}}, y'_{id_{\min}^{(j)}}) = (id_{\min}^{(j)}, x_{id_{\min}^{(j)}}, y_{id_{\min}^{(j)}}) + (r_1^{(j)}, r_2^{(j)}, r_3^{(j)}) \bmod N$, where $(r_1^{(j)}, r_2^{(j)}, r_3^{(j)}) \in \mathbb{Z}_N \times \mathbb{Z}_N \times \mathbb{Z}_N$ are uniformly random numbers owned by $\mathrm{CS}_1$. The decryption key $sk_1$ is owned by $\mathrm{CS}_2$, and since $\mathrm{CS}_1$ and $\mathrm{CS}_2$ are assumed to be non-collusive, the identifier $id^{(j)}$ remains concealed from both cloud servers. The privacy of $id^{(j)}$ against the querier QU is clear. As shown in Steps 25 and 26 of Algorithm 1, the query results $R_1, \mathrm{VO}_1$ and $R_2, \mathrm{VO}_2$ received by QU contain no information about $id^{(j)}$.
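The additive blinding described in Remark 2 can be illustrated with a minimal sketch. It works on plain integers only and assumes a toy modulus; the function names `blind` and `unblind` are hypothetical and do not appear in the paper.

```python
import secrets

def blind(values, n):
    """Add an independent uniform mask from Z_n to each value (mod n)."""
    masks = [secrets.randbelow(n) for _ in values]
    blinded = [(v + r) % n for v, r in zip(values, masks)]
    return blinded, masks

def unblind(blinded, masks, n):
    """Remove the masks again; only the party holding the masks can do this."""
    return [(b - r) % n for b, r in zip(blinded, masks)]

# Toy modulus (assumption); a real deployment would use the Paillier modulus N.
N = 2**61 - 1
record = (17, 730, 415)            # hypothetical (id, x, y) of one neighbor
blinded, masks = blind(record, N)  # what the decrypting server would see
assert tuple(unblind(blinded, masks, N)) == record
```

Because each mask is uniform in $\mathbb{Z}_N$, the blinded tuple on its own is uniformly distributed, which is the intuition behind the claim that the party seeing only blinded values learns nothing about the identifier.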

6. Efficiency Analysis and Performance Evaluation

In this section, we present a comprehensive efficiency evaluation of our scheme from both theoretical and practical perspectives.

6.1. Evaluation Methodology

A widely recognized methodology for assessing the efficiency of a new scheme is to compare it against prior designs. However, it is unfair and meaningless to compare the efficiency of two schemes without considering the different system models and security intentions. Ideally, a well-constructed scheme should satisfy two conditions: (1) it outperforms earlier designs that offer the same or fewer security guarantees, and (2) any previous designs delivering higher security levels should exhibit significantly lower efficiency.
Since our new scheme simultaneously considers two security functionalities, privacy and verifiability, Table 1 highlights existing schemes that also address these functionalities concurrently. These schemes include those proposed by Rong et al. [25], Sundarapandi et al. [7], Liu et al. [3], Zhang et al. [4], Cui et al. [9], and Cui et al. [8]. It is worth noting that Rong et al.’s and Liu et al.’s schemes fail to preserve the privacy of data access patterns. Furthermore, the verification approaches in Rong et al.’s scheme [25] and Sundarapandi et al.’s scheme [7] are probabilistic. Zhang et al.’s scheme [4], on the other hand, only supports verifying the authenticity of the query results. As for Cui et al.’s scheme [9], the authors themselves evaluated it in [8] and found it to have lower efficiency and unsatisfactory performance. Additionally, Cui et al. [8] comprehensively evaluated their proposed scheme, called MSVkNN, demonstrating its efficiency advantages over previous designs. Therefore, based on our two comparison principles, we only need to evaluate the efficiency of our scheme relative to the currently most efficient scheme [8].

6.2. Theoretical Analysis

For ease of description, we first introduce some necessary notations. Let Mul and Div denote the time cost of one multiplication in $\mathbb{Z}_N$ or $\mathbb{Z}_{N^2}$ and the time cost of one division of two integers less than N, respectively. Since the time complexities of encryption and decryption in Paillier’s HE cryptosystem and DT-PKC [28] are nearly identical, both $O(\log N)$ Muls, we denote the time cost of a single encryption operation as Enc and the time cost of a single decryption operation as Dec in either cryptosystem. Moreover, Sig and Ver refer to the time cost of one signature operation and one verification operation in the DSA, respectively. With these notations, we first analyze the theoretical computational and communication costs of each algorithm in Table 3 and Table 4, respectively. We then compare the theoretical computational cost of our protocol with that of Cui et al.’s protocol [8] in Table 5, and the communication cost in Table 6.

6.3. Experimental Analysis

To comprehensively evaluate the practical performance of the proposed scheme, we experimentally compared the time and communication costs of our design and Cui et al.’s protocol [8] across multiple dimensions, including the dataset size n, the grid granularity m, the query parameter k in kNN, and the modulus size K.
All experiments were conducted on a laptop featuring an Intel® Core™ i5-8250U CPU (1.60 GHz, with eight logical cores, Hewlett-Packard, Palo Alto, CA, USA) with 8 GB of RAM, running Windows 10. The implementations were developed in Java using the JCA Library. Furthermore, we adopted the NIST-recommended parameters for the Digital Signature Algorithm (DSA), where the prime modulus p and subgroup order q were configured with bit lengths of 1024 and 160, respectively. Subsequently, we analyzed the impact of the following parameters:
(1)
Impact of varying n: With fixed parameters $m = 16$, $k = 5$, and $K = 1024$, we varied the dataset size n from 1000 to 20,000 to evaluate scalability. Table 7 presents the stage-wise execution times for both Cui et al.’s protocol [8] and our proposed protocol, and Figure 3 illustrates the comparative trends in the total cost of the two protocols as n increases. The results demonstrate that our protocol achieved a 58.5–65.5% reduction in time cost compared to the baseline, with the performance gap widening significantly for larger n.
(2)
Impact of varying m: Under fixed parameters $n = 2000$, $k = 5$, and $K = 1024$, we systematically evaluated the grid granularity $m \in \{4, 8, 12, 16, 32, 64\}$ to analyze algorithmic scalability. Table 8 presents a comparative analysis of computational latency (in seconds) between Cui et al.’s protocol [8] and our proposed method across these configurations. As shown in the table, the total cost of our design was about 32.7–35.4% of that of the baseline. Furthermore, as the grid granularity m primarily influences the search stages, Figure 4 visualizes the combined latencies of these phases. Notably, minimal computational overhead was achieved at $m = 8$, aligning closely with the theoretical optimum derived for uniform random datasets:
$$m^2 \approx \sqrt{n} \;\Rightarrow\; m \approx n^{1/4} = 2000^{1/4} \approx 6.68.$$
The empirically observed optimum ( m = 8 ) reflects practical implementation constraints while remaining consistent with this theoretical boundary.
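As a quick numerical check of the relation $m \approx n^{1/4}$ stated above, the following sketch evaluates the theoretical value and picks the closest granularity among those tested in the experiment. The function names are hypothetical, and the nearest-candidate rule is only an illustrative assumption, not a selection method given in the paper.

```python
def theoretical_optimum(n: int) -> float:
    """Grid granularity suggested by m ~ n^(1/4) for uniform random data."""
    return n ** 0.25

def nearest_tested(n: int, candidates=(4, 8, 12, 16, 32, 64)) -> int:
    """Closest granularity to the theoretical value among the tested ones."""
    target = theoretical_optimum(n)
    return min(candidates, key=lambda m: abs(m - target))

if __name__ == "__main__":
    n = 2000
    print(f"n = {n}: n^(1/4) = {theoretical_optimum(n):.3f}, "
          f"nearest tested m = {nearest_tested(n)}")
```

For $n = 2000$ the theoretical value is approximately 6.687, and the nearest tested granularity is $m = 8$, which matches the empirically observed optimum.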
(3)
Impact of varying k: As shown in Table 9, under fixed parameters $n = 2000$, $m = 16$, and $K = 1024$, we systematically evaluated the computational efficiency of each stage of our protocol against Cui et al.’s baseline [8] by varying the query parameter k in k-nearest-neighbor (kNN) searches from 1 to 10. Since the search, verification, and decryption stages inherently depend on k, whereas the setup, dataset encryption, and query encryption stages are independent of k, Figure 5 illustrates how the time cost of the k-dependent stages varies as k increases, demonstrating that the efficiency gains of our design became more pronounced as k increased.
(4)
Impact of varying K: Given that the modulus size K of N determines the security strength of both the DT-PKC and Paillier cryptosystems employed in our scheme, we conducted a comprehensive performance comparison between our proposed scheme and Cui et al.’s protocol [8] under varying security levels ( K = 512 , 1024 , 2048 ) for fixed parameters m = 16 , k = 5 , and n = 2000 . Table 10 presents the stage-wise computational latencies (e.g., setup, encryption, search, and verification) for both schemes, explicitly quantifying the trade-off between cryptographic robustness and operational efficiency. Also, Figure 6 illustrates the total execution time scaling with increasing K, showing that the total cost of our design was about 33.8–38.1% of that of Cui et al.’s protocol.
(5)
Communication cost: We conducted an experimental evaluation of communication costs with fixed parameters $m = 16$, $k = 5$, and $K = 1024$, while systematically varying the dataset size n. As shown in our theoretical analysis (Table 6), the primary difference in communication cost between our protocol and Cui et al.’s protocol occurred during the Search stage. Figure 7 illustrates the difference in communication overhead between $\mathrm{CS}_1$ and $\mathrm{CS}_2$ in our protocol compared to Cui et al.’s protocol across various dataset sizes. The polyline in Figure 8 compares the data transfer times between $\mathrm{CS}_1$ and $\mathrm{CS}_2$ during the Search stage across varying dataset sizes. The data transfer time (communication latency) was calculated as the communication volume divided by the transfer rate, with the transfer rate simulated as 390 Mbps ≈ 48.75 MB/s. Our findings demonstrated that our protocol incurred lower costs, consistent with our theoretical analysis.
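The latency model used in item (5), volume divided by transfer rate, can be sketched as follows. The 390 Mbps rate is taken from the text; the function names and the 97.5 MB example volume are hypothetical and serve only to illustrate the arithmetic.

```python
def rate_mb_per_s(mbps: float) -> float:
    """Convert a link rate in megabits per second to megabytes per second."""
    return mbps / 8.0

def transfer_time_s(volume_mb: float, mbps: float = 390.0) -> float:
    """Communication latency = communication volume / transfer rate."""
    return volume_mb / rate_mb_per_s(mbps)

if __name__ == "__main__":
    print(rate_mb_per_s(390.0))    # 48.75 MB/s, as stated in the text
    print(transfer_time_s(97.5))   # hypothetical 97.5 MB volume
```

Under this model the latency scales linearly with the communication volume, so any reduction in the volume exchanged between the two servers translates directly into a proportional reduction in transfer time.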

7. Conclusions

This paper presents our investigation and proposal of a faster, privacy-preserving, and publicly verifiable protocol for exact k-nearest-neighbor queries, termed PPVkNN. Leveraging Paillier’s homomorphic encryption and a series of meticulously designed secure protocols, PPVkNN not only supports exact kNN query functionality but also preserves the privacy of data, queries, results, and query access patterns. Furthermore, it guarantees result correctness and enables public verification. Theoretical analysis confirms the correctness and security of our proposed protocols. Additionally, efficiency analysis and performance evaluation demonstrate significant computational and communication savings compared to prior works, enhancing the practicality of our scheme.

Author Contributions

J.L. and C.T.: Conceptualization, methodology, validation, investigation, resources, supervision, project administration, visualization, and writing—original draft preparation; Y.S. and W.T.: Formal analysis, data curation, writing—review and editing, and visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the Natural Science Foundation of Shandong Province (ZR2022MF250), the National Natural Science Foundation of China (61702294), and the Natural Science Foundation of Top Talent of SZTU (GDRC202214).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to thank the editor and the three anonymous referees for their careful reading of this article and their constructive suggestions, which considerably improved this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Notations.
| Parameter | Description |
|---|---|
| $\rho$ | a permutation function |
| $D$ | the plaintext spatial dataset |
| $ED$ | the ciphertext spatial dataset |
| $\mathrm{LCM}(a, b)$ | the least common multiple of two integers a and b |
| $Q$ | $Q = (x_q, y_q)$ is the query data point |
| $id_{\min}^{(j)}$ | the index of the jth nearest neighbor to Q |
| $P_{id_{\min}^{(j)}}$ | $P_{id_{\min}^{(j)}} = (x_{id_{\min}^{(j)}}, y_{id_{\min}^{(j)}})$ denotes the jth nearest neighbor to Q |
| $VRV(P_{id_{\min}^{(j)}})$ | the set of Voronoi-relevant vectors of $P_{id_{\min}^{(j)}}$ |
| $H(\cdot)$ | a cryptographic hash function |
| $\mathrm{Sig}(\cdot)$ | the signature algorithm in DSA |
| $N$ | a large integer that is the product of two prime numbers p and q |
| $K$ | the size of N |
| $\mathbb{Z}_{N^2}^{*}$ | the multiplicative group of residue classes modulo $N^2$ |
| $\mathbb{Z}_N$ | the residue class ring modulo N |
| $x \parallel y$ | the concatenation of two numbers x and y |
| $\lfloor x \rfloor$ | the greatest integer no larger than x |
| $[n]$ | the set $\{0, \ldots, n-1\}$ |
| $[m]$ | the set $\{0, \ldots, m-1\}$ |
| $\tau$ | the security parameter |
| $\mathrm{negl}(\cdot)$ | a negligible function of the security parameter |
| $n$ | the dataset size |
| $m$ | the grid granularity |
| $k$ | the query parameter |
| $\omega$ | the number of lines in one bucket |
| $\lambda$ | the number of packed points |

Appendix B

Appendix B.1. The Proof of Lemma 3

Proof. 
In Algorithm 2, due to the homomorphic property of Paillier’s cryptosystem, we know that
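$$x = E(x_q)^{r_1} \times E(X/m)^{r_1 r_2} = E(r_1 x_q + r_1 r_2 X/m), \quad u = E(X/m)^{r_1} = E(r_1 X/m),$$
$$y = E(y_q)^{r_1} \times E(Y/m)^{r_1 r_2} = E(r_1 y_q + r_1 r_2 Y/m), \quad v = E(Y/m)^{r_1} = E(r_1 Y/m).$$
Thus, $x' = D(x) = r_1 x_q + r_1 r_2 X/m \bmod N$, $y' = D(y) = r_1 y_q + r_1 r_2 Y/m \bmod N$, $u' = D(u) = r_1 X/m \bmod N$, and $v' = D(v) = r_1 Y/m \bmod N$. Since $r_1$ and $r_2$ are at most $\tau$ bits, and $x_q$ and $X/m$ are at most $\sigma$ bits, while N is $2\kappa$ bits, with the security parameter satisfying $\tau < \kappa - \frac{\sigma + 1}{2}$, no reduction modulo N occurs and we have $x' = r_1 x_q + r_1 r_2 X/m$, $u' = r_1 X/m$, $y' = r_1 y_q + r_1 r_2 Y/m$, and $v' = r_1 Y/m$. Then,
$$d_x = \left\lfloor \frac{x'}{u'} \right\rfloor = \left\lfloor \frac{r_1 x_q + r_1 r_2 X/m}{r_1 X/m} \right\rfloor = \left\lfloor \frac{x_q}{X/m} \right\rfloor + r_2, \quad d_y = \left\lfloor \frac{y'}{v'} \right\rfloor = \left\lfloor \frac{r_1 y_q + r_1 r_2 Y/m}{r_1 Y/m} \right\rfloor = \left\lfloor \frac{y_q}{Y/m} \right\rfloor + r_2.$$
Consequently, the output is
$$E(d_x) \times E(r_2)^{N-1} = E\!\left(\left\lfloor \frac{x_q}{X/m} \right\rfloor + r_2\right) \times E(r_2)^{N-1} = E\!\left(\left\lfloor \frac{x_q}{X/m} \right\rfloor + r_2 - r_2\right) = E\!\left(\left\lfloor \frac{x_q}{X/m} \right\rfloor\right),$$
$$E(d_y) \times E(r_2)^{N-1} = E\!\left(\left\lfloor \frac{y_q}{Y/m} \right\rfloor + r_2\right) \times E(r_2)^{N-1} = E\!\left(\left\lfloor \frac{y_q}{Y/m} \right\rfloor + r_2 - r_2\right) = E\!\left(\left\lfloor \frac{y_q}{Y/m} \right\rfloor\right).$$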
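The blinding and unblinding steps in this proof can be traced end to end with a toy sketch. It assumes a textbook Paillier instance with generator $g = N + 1$ and small, insecure primes; the helper names (`keygen`, `enc`, `dec`, `sdc`), the primes, and the 16-bit mask length are illustrative assumptions, not the authors' Java implementation. Both server roles run in one function purely for readability.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier keys from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)          # valid because the generator is g = n + 1
    return n, (lam, mu)

def enc(n, m):
    """E(m) = (1 + n)^m * r^n mod n^2, using (1 + n)^m = 1 + m*n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) * pow(r, n, n2) % n2

def dec(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def sdc(n, sk, enc_xq, enc_cell, tau=16):
    """Return E(floor(x_q / cell)) given E(x_q) and E(cell), cell = X/m."""
    n2 = n * n
    # CS1: blind with random r1, r2 of at most tau bits (nonzero).
    r1 = secrets.randbelow(2**tau - 1) + 1
    r2 = secrets.randbelow(2**tau - 1) + 1
    x = pow(enc_xq, r1, n2) * pow(enc_cell, r1 * r2, n2) % n2  # E(r1*x_q + r1*r2*cell)
    u = pow(enc_cell, r1, n2)                                  # E(r1*cell)
    # CS2: decrypt, take the integer quotient, re-encrypt.
    d = dec(n, sk, x) // dec(n, sk, u)                         # floor(x_q/cell) + r2
    enc_d = enc(n, d)
    # CS1: remove r2 homomorphically, since E(r2)^(n-1) = E(-r2).
    return enc_d * pow(enc(n, r2), n - 1, n2) % n2

n, sk = keygen(2**31 - 1, 2**61 - 1)       # two Mersenne primes (toy choice)
c = sdc(n, sk, enc(n, 730), enc(n, 125))   # x_q = 730, X/m = 125
print(dec(n, sk, c))                       # 5
```

The mask length and input sizes are chosen so that $r_1 x_q + r_1 r_2 X/m$ stays below the modulus, mirroring the no-wraparound condition used in the proof.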

Appendix B.2. The Proof of Lemma 4

Proof. 
According to the homomorphic property of Paillier’s cryptosystem and the data-packing technique, from Step 2 of Algorithm 3, we know that
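$$\Delta x_j = \left(E\!\left(\left\lfloor \frac{x_q}{X/m} \right\rfloor\right) \times E(j)^{N-1} \times E(T)\right)^{r_{x_j}} = E\!\left(r_{x_j}\left(\left\lfloor \frac{x_q}{X/m} \right\rfloor - j + T\right)\right),$$
$$\Delta y_j = \left(E\!\left(\left\lfloor \frac{y_q}{Y/m} \right\rfloor\right) \times E(j)^{N-1} \times E(T)\right)^{r_{y_j}} = E\!\left(r_{y_j}\left(\left\lfloor \frac{y_q}{Y/m} \right\rfloor - j + T\right)\right),$$
and, in Step 9 of Algorithm 3, we have $E(G'_{st}) = E(G_{st}) \times E(r_{st}) = E(G_{st} + r_{st})$. Thus, in Step 11,
$$\Delta x' = \rho_1(\Delta x) = (\Delta x_{\rho_1(0)}, \ldots, \Delta x_{\rho_1(m-1)}) = \left(E\!\left(r_{x_{\rho_1(0)}}\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(0) + T\right)\right), \ldots, E\!\left(r_{x_{\rho_1(m-1)}}\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(m-1) + T\right)\right)\right),$$
$$\Delta y' = \rho_2(\Delta y) = (\Delta y_{\rho_2(0)}, \ldots, \Delta y_{\rho_2(m-1)}) = \left(E\!\left(r_{y_{\rho_2(0)}}\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(0) + T\right)\right), \ldots, E\!\left(r_{y_{\rho_2(m-1)}}\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(m-1) + T\right)\right)\right),$$
$$\Gamma = \rho_2(\rho_1(E(G'))) = \left(E(G'_{\rho_1(s)\rho_2(t)})\right)_{0 \le s, t \le m-1} = \left(E(G'_{\rho_1(0)\rho_2(0)}), \ldots, E(G'_{\rho_1(m-1)\rho_2(m-1)})\right),$$
and, in Step 14,
$$v_x = \prod_{i=0}^{m-1} \left(\Delta x_{\rho_1(i)}\right)^{2^{\sigma(m-(i+1))}} = \left(\Delta x_{\rho_1(0)}\right)^{2^{\sigma(m-1)}} \times \cdots \times \left(\Delta x_{\rho_1(m-1)}\right) = E\!\left(\sum_{i=0}^{m-1} r_{x_{\rho_1(i)}}\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(i) + T\right) 2^{\sigma(m-(i+1))}\right),$$
$$v_y = \prod_{i=0}^{m-1} \left(\Delta y_{\rho_2(i)}\right)^{2^{\sigma(m-(i+1))}} = \left(\Delta y_{\rho_2(0)}\right)^{2^{\sigma(m-1)}} \times \cdots \times \left(\Delta y_{\rho_2(m-1)}\right) = E\!\left(\sum_{i=0}^{m-1} r_{y_{\rho_2(i)}}\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(i) + T\right) 2^{\sigma(m-(i+1))}\right).$$
Also, in Step 15, according to the property of the data-packing technique, as long as
$$0 < r_{x_{\rho_1(i)}}\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(i) + T\right) < 2^{\sigma}, \tag{A1}$$
$$0 < r_{y_{\rho_2(i)}}\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(i) + T\right) < 2^{\sigma}, \tag{A2}$$
$$\sum_{i=0}^{m-1} r_{x_{\rho_1(i)}}\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(i) + T\right) 2^{\sigma(m-(i+1))} < N, \tag{A3}$$
$$\sum_{i=0}^{m-1} r_{y_{\rho_2(i)}}\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(i) + T\right) 2^{\sigma(m-(i+1))} < N, \tag{A4}$$
we have
$$(v_{x_{\rho_1(0)}}, \ldots, v_{x_{\rho_1(m-1)}}) = D(v_x) = \left(\left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(0) + T\right) r_{x_{\rho_1(0)}}, \ldots, \left(\left\lfloor \tfrac{x_q}{X/m} \right\rfloor - \rho_1(m-1) + T\right) r_{x_{\rho_1(m-1)}}\right),$$
$$(v_{y_{\rho_2(0)}}, \ldots, v_{y_{\rho_2(m-1)}}) = D(v_y) = \left(\left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(0) + T\right) r_{y_{\rho_2(0)}}, \ldots, \left(\left\lfloor \tfrac{y_q}{Y/m} \right\rfloor - \rho_2(m-1) + T\right) r_{y_{\rho_2(m-1)}}\right).$$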
Since r x j and r y j ( 0 j m 1 ) are at most τ bits, and x q , y q , X / m , Y / m and m are at most σ bits, with T m and 2 τ + σ < 2 σ , 2 σ m < N , the conditions (Equations (A1)–(A4)) hold. Subsequently, in Step 18, the decisional conditions v x ρ 1 ( s ) mod T = 0 and v y ρ 2 ( t ) mod T = 0 mean that x q X / m = ρ 1 ( s ) and y q Y / m = ρ 2 ( t ) . Consequently, in Step 25,
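$$E(r) = \prod_{s=0}^{m-1} \prod_{t=0}^{m-1} (M'_{st})^{r_{st}} = \prod_{s=0}^{m-1} \prod_{t=0}^{m-1} \left(M_{\rho_1^{-1}(s)\rho_2^{-1}(t)}\right)^{r_{st}} = \prod_{s=0}^{m-1} \prod_{t=0}^{m-1} (M_{st})^{r_{\rho_1(s)\rho_2(t)}} = E\!\left(\sum_{s=0}^{m-1} \sum_{t=0}^{m-1} r_{\rho_1(s)\rho_2(t)}\, \delta_{st}\right) = E(r_{\hat{s}\hat{t}})$$
with
$\hat{s} = \left\lfloor \frac{x_q}{X/m} \right\rfloor$, $\hat{t} = \left\lfloor \frac{y_q}{Y/m} \right\rfloor$ and
$$\delta_{st} = \begin{cases} 1 & (\rho_1(s), \rho_2(t)) = (\hat{s}, \hat{t}) = \left(\left\lfloor \frac{x_q}{X/m} \right\rfloor, \left\lfloor \frac{y_q}{Y/m} \right\rfloor\right) \\ 0 & \text{otherwise.} \end{cases}$$
In the last step (Step 26),
$$\Gamma \times E(r)^{N-1} = E(G'_{\rho_1(s)\rho_2(t)}) \times E(r_{\hat{s}\hat{t}})^{N-1} = E\!\left(G_{\lfloor x_q/(X/m) \rfloor, \lfloor y_q/(Y/m) \rfloor} + r_{\lfloor x_q/(X/m) \rfloor, \lfloor y_q/(Y/m) \rfloor} - r_{\hat{s}\hat{t}}\right) = E(G_{\hat{s}\hat{t}}).$$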
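The data-packing step that this proof relies on places $m$ values into $\sigma$-bit slots of a single integer, with slot $i$ weighted by $2^{\sigma(m-(i+1))}$. The sketch below shows the packing and unpacking on plain integers; in the protocol the same weighted sum is formed homomorphically by raising ciphertexts to these powers of two. The function names `pack` and `unpack` are hypothetical.

```python
def pack(values, sigma):
    """Pack values into one integer: sum of v_i * 2^(sigma * (m - (i + 1)))."""
    m = len(values)
    packed = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << sigma):
            raise ValueError("each value must fit into sigma bits")
        packed += v << (sigma * (m - (i + 1)))
    return packed

def unpack(packed, m, sigma):
    """Recover the m slot values from the packed integer."""
    mask = (1 << sigma) - 1
    return [(packed >> (sigma * (m - (i + 1)))) & mask for i in range(m)]

slots = [5, 0, 1023, 7]            # hypothetical slot contents, sigma = 10 bits
word = pack(slots, sigma=10)
assert unpack(word, m=4, sigma=10) == slots
```

Unpacking is only unambiguous when every slot value stays below $2^{\sigma}$ and the packed integer stays below N, which is what conditions (A1) to (A4) are there to guarantee.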

Appendix B.3. The Proof of Lemma 5

Proof. 
Due to the homomorphic properties of Paillier’s cryptosystem and the data-packing technique, we know that
v 0 = ( E ( G s t ) × E ( Φ 0 ) ) r 0 = ( E ( i 1 ( s t ) | x i 1 ( s t ) | y i 1 ( s t ) | | i λ ( s t ) | x i λ ( s t ) | y i λ ( s t ) ) × E r 1 | r 2 | | r 3 λ ) r 0 = E ( r 0 ( i 1 ( s t ) + r 1 ) 2 σ ( 3 λ 1 ) + r 0 ( x i 1 ( s t ) + r 2 ) 2 σ ( 3 λ 2 ) + r 0 ( y i 1 ( s t ) + r 3 ) 2 σ ( 3 λ 3 ) + + r 0 ( y i λ ( s t ) + r 3 λ ) 2 0 ) , v 1 = ( E ( Q ) × E ( Φ 1 ) ) r 0 = ( E x q | y q | | x q | y q | | x q | y q × E r 2 | r 3 | | r 3 k 1 | r 3 k | | r 3 λ 1 | r 3 λ ) r 0 = E ( r 0 ( x q + r 2 ) 2 σ ( 2 λ 1 ) + r 0 ( y q + r 3 ) 2 σ ( 2 λ 2 ) + + r 0 ( y q + r 3 λ ) 2 0 ) , v 2 = ( E ( i d ) × E ( Φ 2 ) ) r 0 = ( E i d | i d | i d | | i d × E ( r 1 | r 4 | r 7 | | r 3 λ 2 ) ) r 0 = E ( r 0 ( i d + r 1 ) 2 σ ( λ 1 ) + r 0 ( i d + r 4 ) 2 σ ( λ 2 ) + r 0 ( i d + r 7 ) 2 σ ( λ 3 ) + + r 0 ( i d + r 3 λ 2 ) 2 0 ) .
Thus,
v 0 = D ( v 0 ) = r 0 ( i 1 ( s t ) + r 1 ) 2 σ ( 3 λ 1 ) + r 0 ( x i 1 ( s t ) + r 2 ) 2 σ ( 3 λ 2 ) + r 0 ( y i 1 ( s t ) + r 3 ) 2 σ ( 3 λ 3 ) + + r 0 ( y i λ ( s t ) + r 3 λ ) 2 0 mod N , v 1 = D ( v 1 ) = r 0 ( x q + r 2 ) 2 σ ( 2 λ 1 ) + r 0 ( y q + r 3 ) 2 σ ( 2 λ 2 ) + + r 0 ( y q + r 3 λ ) 2 0 mod N , v 2 = D ( v 2 ) = r 0 ( i d + r 1 ) 2 σ ( λ 1 ) + r 0 ( i d + r 4 ) 2 σ ( λ 2 ) + r 0 ( i d + r 7 ) 2 σ ( λ 3 ) + + r 0 ( i d + r 3 λ 2 ) 2 0 mod N .
Since r j ( 0 j 3 λ ) are at most τ bits, i j ( s t ) , x i j ( s t ) , y i j ( s t ) ( 1 j λ ) , x q , y q and i d are at most σ bits, and the security parameters τ and σ satisfy 2 τ + σ < 2 σ and 2 σ 3 λ < N , we have
v 0 = D ( v 0 ) = r 0 ( i 1 ( s t ) + r 1 ) 2 σ ( 3 λ 1 ) + r 0 ( x i 1 ( s t ) + r 2 ) 2 σ ( 3 λ 2 ) + r 0 ( y i 1 ( s t ) + r 3 ) 2 σ ( 3 λ 3 ) + + r 0 ( y i λ ( s t ) + r 3 λ ) 2 0 , v 1 = D ( v 1 ) = r 0 ( x q + r 2 ) 2 σ ( 2 λ 1 ) + r 0 ( y q + r 3 ) 2 σ ( 2 λ 2 ) + + r 0 ( y q + r 3 λ ) 2 0 , v 2 = D ( v 2 ) = r 0 ( i d + r 1 ) 2 σ ( λ 1 ) + r 0 ( i d + r 4 ) 2 σ ( λ 2 ) + r 0 ( i d + r 7 ) 2 σ ( λ 3 ) + + r 0 ( i d + r 3 λ 2 ) 2 0 , ID = { r 0 ( i 1 ( s t ) + r 1 ) , r 0 ( i 2 ( s t ) + r 4 ) , r 0 ( i 3 ( s t ) + r 7 ) , , r 0 ( i λ ( s t ) + r 3 λ 2 ) } P = ( r 0 ( x i 1 ( s t ) + r 2 ) , r 0 ( y i 1 ( s t ) + r 3 ) ) , ( r 0 ( x i 2 ( s t ) + r 5 ) , r 0 ( y i 2 ( s t ) + r 6 ) ) , , ( r 0 ( x i λ ( s t ) + r 3 λ 1 ) , r 0 ( y i λ ( s t ) + r 3 λ ) ) Q = { r 0 ( x q + r 2 ) , r 0 ( y q + r 3 ) , r 0 ( x q + r 5 ) , r 0 ( y q + r 6 ) , , r 0 ( x q + r 3 λ 1 ) , r 0 ( y q + r 3 λ ) } id = { r 0 ( i d + r 1 ) , r 0 ( i d + r 4 ) , r 0 ( i d + r 7 ) , , r 0 ( i d + r 3 λ 2 ) }
Subsequently, after the for-loop in Step 20, we have
d min = min 0 j λ 1 , i j + 1 ( s t ) i d r 0 2 ( x i j + 1 ( s t ) x q ) 2 + ( y i j + 1 ( s t ) y q ) 2 = r 0 2 ( x i p o s + 1 ( s t ) x q ) 2 + ( y i p o s + 1 ( s t ) y q ) 2 ,
and, in Step 28,
E ( r i d ) = i = 0 λ 1 δ [ i ] r 3 i + 1 = δ [ 0 ] r 1 δ [ 1 ] r 4 δ [ 2 ] r 7 δ [ λ 1 ] r 3 λ 2 = E ( r 1 · 0 + + r 3 p o s + 1 · 1 + + r 3 λ 2 · 0 ) = E ( r 3 p o s + 1 ) , E ( r x ) = i = 0 λ 1 δ [ i ] r 3 i + 2 = δ [ 0 ] r 2 δ [ 1 ] r 5 δ [ 2 ] r 8 δ [ λ 1 ] r 3 λ 1 = E ( r 2 · 0 + + r 3 p o s + 2 · 1 + + r 3 λ 1 · 0 ) = E ( r 3 p o s + 2 ) , E ( r y ) = i = 0 λ 1 δ [ i ] r 3 i + 3 = δ [ 0 ] r 3 δ [ 1 ] r 6 δ [ 2 ] r 9 δ [ λ 1 ] r 3 λ = E ( r 3 · 0 + + r 3 p o s + 3 · 1 + + r 3 λ · 0 ) = E ( r 3 p o s + 3 ) .
Therefore, in the last step (Step 29), by the homomorphic property,
E ( i d m i n ) = E ( ID [ p o s ] ) r 0 1 × E ( r i d ) N 1 = E ( r 0 1 ID [ p o s ] r i d ) = E ( r 0 1 ( r 0 ( i p o s + 1 ( s t ) + r 3 p o s + 1 ) ) r 3 p o s + 1 ) = E i p o s + 1 ( s t ) , E ( x i d m i n ) = E ( P [ p o s ] . x ) r 0 1 × E ( r x ) N 1 = E ( r 0 1 P [ p o s ] . x r x ) = E ( r 0 1 ( r 0 ( x i p o s + 1 ( s t ) + r 3 p o s + 2 ) ) r 3 p o s + 2 ) = E x i p o s + 1 ( s t ) , E ( y i d m i n ) = E ( P [ p o s ] . y ) r 0 1 × E ( r y ) N 1 = E ( r 0 1 P [ p o s ] . y r y ) = E ( r 0 1 ( r 0 ( y i p o s + 1 ( s t ) + r 3 p o s + 3 ) ) r 3 p o s + 3 ) = E y i p o s + 1 ( s t ) .

Appendix B.4. The Proof of Lemma 6

Proof. 
According to the homomorphic property, in Step 3, we know that for 0 j n w 1 ,
E ( η M j ) = ( E ( ( j + 1 ) w 1 ) × E ( r 0 j ) ) r 1 j = E ( r 1 j ( ( j + 1 ) w 1 + r 0 j ) ) , E ( η j ) = ( E ( i d min ) × E ( r 0 j ) ) r 1 j = E ( r 1 j ( i d min + r 0 j ) ) , E ( η m j ) = ( E ( j w ) × E ( r 0 j ) ) r 1 j = E ( r 1 j ( j w + r 0 j ) ) ,
and, in Step 7, for 0 j n w 1 and 0 k w 1 , we have
Ψ ( j w + k ) 0 = ( E ( j w + k ) × E ( i d min ) N 1 ) r ( j w + k ) 0 = E ( r ( j w + k ) 0 ( j w + k i d min ) ) , Ψ ( j w + k ) 1 = E ( V R V ( P j w + k ) ) × E ( r ( j w + k ) 1 ) = E ( V R V ( P j w + k ) + r ( j w + k ) 1 ) , Ψ ( j w + k ) 2 = E ( S i g ( P j w + k ) ) × E ( r ( j w + k ) 2 ) = E ( S i g ( P j w + k ) + r ( j w + k ) 2 ) .
Thus, in Step 13,
η M = ρ 1 ( η M ) = ( E ( η M ρ 1 ( 0 ) ) , , E ( η M ρ 1 ( n w 1 ) ) ) , η m = ρ 1 ( η m ) = ( E ( η m ρ 1 ( 0 ) ) , , E ( η m ρ 1 ( n w 1 ) ) ) , η = ρ 1 ( η ) = ( E ( η ρ 1 ( 0 ) ) , , E ( η ρ 1 ( n w 1 ) ) ) , B = ρ 2 ( ρ 1 ( B ) ) = Ψ 0 , Ψ 1 , Ψ 2 , Ψ 0 , Ψ 1 , Ψ 2 = Ψ ρ 1 ( 0 ) 0 , , Ψ ρ 1 n w 1 , 0 , Ψ ρ 1 ( 0 ) 1 , , Ψ ρ 1 n w 1 , 1 , Ψ ρ 1 ( 0 ) 2 , , Ψ ρ 1 n w 1 , 2 = Ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 0 0 j ( n w 1 ) , 0 k w 1 , Ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 1 0 j ( n w 1 ) , 0 k w 1 , Ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 2 0 j ( n w 1 ) , 0 k w 1 ,
which results in the following in Step 19:
( η M ρ 1 ( 0 ) , , η M ρ 1 ( n w 1 ) ) = D ( η M ) , ( η m ρ 1 ( 0 ) , , η m ρ 1 ( n w 1 ) ) = D ( η m ) , ( η ρ 1 ( 0 ) , , η ρ 1 ( n w 1 ) ) = D ( η ) .
Consequently, in Step 22, the condition ( η M ρ 1 ( j ) η ρ 1 ( j ) ) 0 ( η m ρ 1 ( j ) η ρ 1 ( j ) ) < 0 is equivalent to
( r 1 ρ 1 ( j ) ( ( ρ 1 ( j ) + 1 ) w 1 + r 0 ρ 1 ( j ) ) ) mod N ( r 1 ρ 1 ( j ) ( i d min + r 0 ρ 1 ( j ) ) ) mod N 0 ,
( r 1 ρ 1 ( j ) ( ( ρ 1 ( j ) ) w + r 0 ρ 1 ( j ) ) ) mod N ( r 1 ρ 1 ( j ) ( i d min + r 0 ρ 1 ( j ) ) ) mod N < 0 .
Since r 0 j and r 1 j > 0 are at most τ bits, i d min and ( n w ) w 1 < n are at most σ bits, and the security parameter τ satisfies 2 τ + σ < N and 2 2 τ < N , Equations (A5) and (A6) are
r 1 ρ 1 ( j ) ( ( ρ 1 ( j ) + 1 ) w 1 + r 0 ρ 1 ( j ) ) r 1 ρ 1 ( j ) ( i d min + r 0 ρ 1 ( j ) ) 0 ( ρ 1 ( j ) + 1 ) w 1 i d min ,
r 1 ρ 1 ( j ) ( ( ρ 1 ( j ) ) w + r 0 ρ 1 ( j ) ) r 1 ρ 1 ( j ) ( i d min + r 0 ρ 1 ( j ) ) < 0 ρ 1 ( j ) w < i d min ,
which means that i d min lies in the bucket B ρ 1 ( j ) . Then, Step 23 decrypts the entry in each row of this bucket:
( ψ ( ρ 1 ( j ) w + ρ 2 ( 0 ) ) 0 , , ψ ( ρ 1 ( j ) w + ρ 2 ( w 1 ) ) 0 ) = D ( Ψ ρ 1 ( j ) 0 ) .
Thus, the condition in Step 25 is ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 0 = r ( ρ 1 ( j ) w + ρ 2 ( k ) ) 0 ( ρ 1 ( j ) w + ρ 2 ( k ) i d min ) = 0 , which means that i d min = ρ 1 ( j ) w + ρ 2 ( k ) . At this point, we record the ciphertexts of the V R V ( P i d min ) and the signature in Step 26:
Θ = Ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 1 , Ψ ( ρ 1 ( j ) w + ρ 2 ( k ) ) 2 = Ψ ( i d min ) 1 , Ψ ( i d min ) 2 = E ( V R V ( P i d min ) + r i d min 1 ) , E ( S i g ( P i d min ) + r i d min 1 ) ) .
Consequently, in Step 30,
E ( ψ ( 1 ) ) = j = 0 n w 1 k = 0 w 1 ( M j k ) r ( j w + k ) 1 = j = 0 n w 1 k = 0 w 1 ( M ρ 1 1 ( j ) ρ 2 1 ( k ) ) r ( j w + k ) 1 = j = 0 n w 1 k = 0 w 1 ( M j k ) r ( ρ 1 ( j ) w + ρ 2 ( k ) ) 1 = E j = 0 n w 1 k = 0 w 1 δ j k r ( ρ 1 ( j ) w + ρ 2 ( k ) ) 1 = E ( r i d min 1 ) ,
E ( ψ ( 2 ) ) = j = 0 n w 1 k = 0 w 1 ( M j k ) r ( j w + k ) 2 = j = 0 n w 1 k = 0 w 1 ( M ρ 1 1 ( j ) ρ 2 1 ( k ) ) r ( j w + k ) 2 = j = 0 n w 1 k = 0 w 1 ( M j k ) r ( ρ 1 ( j ) w + ρ 2 ( k ) ) 2 = E j = 0 n w 1 k = 0 w 1 δ j k r ( ρ 1 ( j ) w + ρ 2 ( k ) ) 2 = E ( r i d min 2 )
with δ j k = 1 ρ 1 ( j ) w + ρ 2 ( k ) = i d min 0 o t h e r w i s e , and, in the last step (Step 31),
Θ 1 × E ( ψ ( 1 ) ) N 1 = E ( V R V ( P i d min ) + r i d min 1 ) × E ( r i d min 1 ) N 1 = E ( V R V ( P i d min ) ) , Θ 2 × E ( ψ ( 2 ) ) N 1 = E ( S i g ( P i d min ) + r i d min 2 ) × E ( r i d min 2 ) N 1 = E ( S i g ( P i d min ) ) .

Appendix B.5. The Proof of Lemma 7

Proof. 
Given the correctness established by Lemmas 3–5 for Algorithms 2–4, it follows that in Step 4, the two-tuple ( E ( x i d m i n ( 1 ) ) , E ( y i d m i n ( 1 ) ) ) represents the encrypted nearest-neighbor point to Q. Due to the homomorphic property of Paillier’s cryptosystem, after Steps 5–7, ( x i d min ( 1 ) , y i d min ( 1 ) ) = ( x i d min ( 1 ) + r 2 ( 1 ) , y i d min ( 1 ) + r 3 ( 1 ) ) , and P i d min ( 1 ) = ( x i d min ( 1 ) , y i d min ( 1 ) ) is the nearest neighbor to Q. Also, by Lemma 6 and the homomorphic property, after Steps 8–11, we have S i g ( P i d min ( 1 ) ) = S i g ( P i d min ( 1 ) ) + r 5 ( 1 ) and
V R V ( P i d min ( 1 ) ) = ( i d 1 ( 1 ) + r 4 , 1 ( 1 ) ) 2 σ ( 3 L 1 ) + ( x i d 1 ( 1 ) + r 4 , 2 ( 1 ) ) 2 σ ( 3 L 2 ) + ( y i d 1 ( 1 ) + r 4 , 3 ( 1 ) ) 2 σ ( 3 L 3 ) + + ( y i d L ( 1 ) + r 4 , 3 L ( 1 ) ) 2 0 .
Thus, after the unpacking operation in Step 12,
i d 1 ( 1 ) , P i d 1 ( 1 ) , , i d L ( 1 ) , P i d L ( 1 ) = i d 1 ( 1 ) + r 4 , 1 ( 1 ) , P i d 1 ( 1 ) + ( r 4 , 2 ( 1 ) , r 4 , 3 ( 1 ) ) , , i d L ( 1 ) + r 4 , 3 L 2 ( 1 ) , P i d L ( 1 ) + ( r 4 , 3 L 1 ( 1 ) , r 4 , 3 L ( 1 ) ) .
Through a similar analysis, we can also prove the cases for j = 2 , , k . □

References

  1. Wang, J.; Chen, X. Efficient and Secure Storage for Outsourced Data: A Survey; Springer: Berlin/Heidelberg, Germany, 2016; Volume 1, pp. 178–188.
  2. Lei, X.; Liu, A.X.; Li, R.; Tu, G.H. SecEQP: A Secure and Efficient Scheme for SkNN Query Problem Over Encrypted Geodata on Cloud. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; pp. 662–673.
  3. Liu, Q.; Hao, Z.; Peng, Y.; Jiang, H.; Wu, J.; Peng, T.; Wang, G.; Zhang, S. SecVKQ: Secure and verifiable kNN queries in sensor–cloud systems. J. Syst. Archit. 2021, 120, 102300.
  4. Zhang, Y.; Wang, B.; Zhao, Z. Secure k-NN Query With Multiple Keys Based on Random Projection Forests. IEEE Internet Things J. 2024, 11, 15205–15218.
  5. Qi, J.; Jia, X.; Luo, M.; Feng, Q. A Privacy-Aware K-Nearest Neighbor Query Scheme for Location-Based Services. IEEE Internet Things J. 2024, 11, 10831–10842.
  6. Cheng, K.; Wang, L.; Shen, Y.; Wang, H.; Wang, Y.; Jiang, X.; Zhong, H. Secure k-NN Query on Encrypted Cloud Data with Multiple Keys. IEEE Trans. Big Data 2021, 7, 689–702.
  7. Sundarapandi, G.P.; Bokhary, S.; Samanthula, B.K.; Dong, B. A Probabilistic Approach for Secure and Verifiable Computation of kNN Queries in Cloud. In Proceedings of the 2023 IEEE Cloud Summit, Baltimore, MD, USA, 6–7 July 2023; pp. 15–20.
  8. Cui, N.; Qian, K.; Cai, T.; Li, J.; Yang, X.; Cui, J.; Zhong, H. Towards Multi-User, Secure, and Verifiable kNN Query in Cloud Database. IEEE Trans. Knowl. Data Eng. 2023, 35, 9333–9349.
  9. Cui, N.; Yang, X.; Wang, B.; Li, J.; Wang, G. SVkNN: Efficient Secure and Verifiable k-Nearest Neighbor Query on the Cloud Platform. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 253–264.
  10. Oliveira, S.R.; Zaiane, O.R. Privacy preserving clustering by data transformation. J. Inf. Data Manag. 2010, 1, 37.
  11. Wong, W.K.; Cheung, D.W.l.; Kao, B.; Mamoulis, N. Secure kNN computation on encrypted databases. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 29 June–2 July 2009; SIGMOD ’09. pp. 139–152.
  12. Hu, H.; Xu, J.; Ren, C.; Choi, B. Processing private queries over untrusted data cloud through privacy homomorphism. In Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, Hannover, Germany, 11–16 April 2011; pp. 601–612.
  13. Yao, B.; Li, F.; Xiao, X. Secure nearest neighbor revisited. In Proceedings of the 2013 IEEE 29th International Conference on Data Engineering (ICDE), Brisbane, QLD, Australia, 8–12 April 2013; pp. 733–744.
  14. Choi, S.; Ghinita, G.; Lim, H.S.; Bertino, E. Secure kNN Query Processing in Untrusted Cloud Environments. IEEE Trans. Knowl. Data Eng. 2014, 26, 2818–2831.
  15. Wang, B.; Hou, Y.; Li, M. QuickN: Practical and Secure Nearest Neighbor Search on Encrypted Large-Scale Data. IEEE Trans. Cloud Comput. 2022, 10, 2066–2078.
  16. Popa, R.A.; Li, F.H.; Zeldovich, N. An Ideal-Security Protocol for Order-Preserving Encoding. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 19–22 May 2013; pp. 463–477.
  17. Zhu, Y.; Xu, R.; Takagi, T. Secure k-NN computation on encrypted cloud data without sharing key with query users. In Proceedings of the 2013 International Workshop on Security in Cloud Computing, Hangzhou, China, 8 May 2013; Cloud Computing ’13. pp. 55–60.
  18. Zhu, Y.; Huang, Z.; Takagi, T. Secure and controllable k-NN query over encrypted cloud data with key confidentiality. J. Parallel Distrib. Comput. 2016, 89, 1–12.
  19. Lei, X.; Tu, G.H.; Liu, A.X.; Xie, T. Fast and Secure kNN Query Processing in Cloud Computing. In Proceedings of the 2020 IEEE Conference on Communications and Network Security (CNS), Avignon, France, 29 June–1 July 2020; pp. 1–9.
  20. Li, R.; Liu, A.X.; Xu, H.; Liu, Y.; Yuan, H. Adaptive Secure Nearest Neighbor Query Processing Over Encrypted Data. IEEE Trans. Dependable Secur. Comput. 2022, 19, 91–106.
  21. Zheng, Y.; Lu, R.; Zhang, S.; Shao, J.; Zhu, H. Achieving Practical and Privacy-Preserving kNN Query over Encrypted Data. In IEEE Transactions on Dependable and Secure Computing; IEEE: Piscataway, NJ, USA, 2024; pp. 1–13.
  22. Elmehdwi, Y.; Samanthula, B.K.; Jiang, W. Secure k-nearest neighbor query over encrypted data in outsourced environments. In Proceedings of the 2014 IEEE 30th International Conference on Data Engineering, Chicago, IL, USA, 31 March–4 April 2014; pp. 664–675.
  23. Guan, Y.; Lu, R.; Zheng, Y.; Shao, J.; Wei, G. Toward Oblivious Location-Based k-Nearest Neighbor Query in Smart Cities. IEEE Internet Things J. 2021, 8, 14219–14231.
  24. Yiu, M.L.; Lo, E.; Yung, D. Authentication of moving kNN queries. In Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, Hannover, Germany, 11–16 April 2011; pp. 565–576.
  25. Rong, H.; Wang, H.; Liu, J.; Wu, W.; Xian, M. Efficient Integrity Verification of Secure Outsourced kNN Computation in Cloud Environments. In Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016; pp. 236–243.
  26. Jiang, S.; Zhu, X.; Guo, L.; Liu, J. Publicly Verifiable Boolean Query Over Outsourced Encrypted Data. IEEE Trans. Cloud Comput. 2019, 7, 799–813.
  27. Wu, S.; Li, Q.; Li, G.; Yuan, D.; Yuan, X.; Wang, C. ServeDB: Secure, Verifiable, and Efficient Range Queries on Outsourced Database. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; pp. 626–637.
  28. Liu, X.; Deng, R.H.; Choo, K.K.R.; Weng, J. An Efficient Privacy-Preserving Outsourced Calculation Toolkit With Multiple Keys. IEEE Trans. Inf. Forensics Secur. 2016, 11, 2401–2414.
  29. Yi, X.; Paulet, R.; Bertino, E.; Varadharajan, V. Practical Approximate k Nearest Neighbor Queries with Location and Query Privacy. IEEE Trans. Knowl. Data Eng. 2016, 28, 1546–1559.
  30. Benabbas, S.; Gennaro, R.; Vahlis, Y. Verifiable Delegation of Computation over Large Datasets. In Proceedings of the Advances in Cryptology—CRYPTO 2011, Santa Barbara, CA, USA, 14–18 August 2011; Rogaway, P., Ed.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 111–131.
  31. Gennaro, R.; Gentry, C.; Parno, B. Non-interactive Verifiable Computing: Outsourcing Computation to Untrusted Workers. In Proceedings of the Advances in Cryptology—CRYPTO 2010, Santa Barbara, CA, USA, 15–19 August 2010; Rabin, T., Ed.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 465–482.
  32. Parno, B.; Raykova, M.; Vaikuntanathan, V. How to delegate and verify in public: Verifiable computation from attribute-based encryption. In Proceedings of the 9th International Conference on Theory of Cryptography, Sicily, Italy, 19–21 March 2012; Springer: Berlin/Heidelberg, Germany, 2012. TCC’12. pp. 422–439.
  33. Wang, Q.; Zhou, F.; Zhou, B.; Xu, J.; Chen, C.; Wang, Q. Privacy-Preserving Publicly Verifiable Databases. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1639–1654.
  34. Liu, J.; Zhang, L.F. Privacy-Preserving and Publicly Verifiable Matrix Multiplication. IEEE Trans. Serv. Comput. 2023, 16, 2059–2071.
  35. Paillier, P. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Advances in Cryptology—EUROCRYPT ’99; Stern, J., Ed.; Springer: Berlin/Heidelberg, Germany, 1999; pp. 223–238. [Google Scholar]
  36. National Institute of Standards and Technology. FIPS 186-3, Digital Signature Standard (DSS). Available online: https://csrc.nist.gov/files/pubs/fips/186-3/final/docs/fips_186-3.pdf (accessed on 10 April 2025).
  37. Okabe, A.; Boots, B.; Sugihara, K. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1992.
  38. Kolahdouzan, M.; Shahabi, C. Voronoi-based K nearest neighbor search for spatial network databases. In Proceedings of the Thirtieth International Conference on Very Large Data Bases (VLDB ’04), Volume 30, Toronto, ON, Canada, 30 August–3 September 2004; VLDB Endowment; pp. 840–851.
  39. Liu, A.; Zheng, K.; Li, L.; Liu, G.; Zhao, L.; Zhou, X. Efficient secure similarity computation on encrypted trajectory data. In Proceedings of the 2015 IEEE 31st International Conference on Data Engineering, Seoul, Republic of Korea, 13–17 April 2015; pp. 66–77.
  40. Liu, J.; Yang, J.; Xiong, L.; Pei, J. Secure Skyline Queries on Cloud Platform. In Proceedings of the 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, CA, USA, 19–22 April 2017; pp. 633–644.
  41. Yao, A.C.C. How to generate and exchange secrets. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science (FOCS 1986), Toronto, ON, Canada, 27–29 October 1986; pp. 162–167.
Figure 1. The system model.
Figure 2. An example of $ED = (E(I_1), E(I_2))$ with grid granularity $m = 2$, $E(I_1) = \{EG_{st} \mid s, t = 0, 1\}$, and $E(I_2) = \{B_j \mid j = 0, 1, 2, 3\}$. The red points are padding records.
Figure 3. Time cost comparison for varying dataset sizes n.
Figure 4. Time cost comparison for varying grid granularities m.
Figure 5. Time cost comparison for varying query parameters k.
Figure 6. Time cost comparison for varying key sizes K.
Figure 7. Communication cost comparison for communication between $CS_1$ and $CS_2$ in the Search stage.
Figure 8. Communication latency for communication between $CS_1$ and $CS_2$ in the Search stage.
Table 1. Comparison of system models and security properties of existing kNN schemes.
Scheme | Privacy | Verifiability | kNN | System Model
Privacy: Dataset, Query, Result, Access Patterns; Verifiability: Private, Public; kNN: Approximate, Exact
Wong et al. [11] ××××××× 1 server
Hu et al. [12] ××××××× 1 server
Yao et al. [13] ×××× 1 server
Choi et al. [14] ×××× 1 server
Zhu et al. [17,18] ×××× 1 server
Yi [29] ×××× 1 server
Lei et al. [2] ×××× 1 server
Lei et al. [19] ×××× 1 server
Li et al. [20] ×××× 1 server
Zheng et al. [21] ×××× 1 server
Elmehdwi et al. [22] ××× 2 servers
Guan et al. [23] ××× 2 servers
Qi et al. [5] ×××× 2 servers
Yiu et al. [24] ×××××× 1 server
Rong et al. [25] ×××× 2 servers
Sundarapandi et al. [7] ×× 2 servers
Liu et al. [3] ××× 3 servers (2 clouds + 1 edge)
Zhang et al. [4] ★★ ×× 2 servers + 1 KGC
Cui et al. [9] ×× 2 servers
Cui et al. [8] ×× 2 servers + 1 CA
Ours ×× 2 servers
Regarding verifiability, ‘✓’ indicates that the verification approach for query results is probabilistic, while ‘✓★★’ signifies that the scheme only supports verifying whether the query results returned by the cloud correspond to the authentic data uploaded by the DO.
Table 2. Notations.
| Notation | Description |
|---|---|
| $\rho$ | a permutation function |
| $D$ | the plaintext spatial dataset |
| $ED$ | the ciphertext spatial dataset |
| $LCM(a, b)$ | the least common multiple of two integers $a$ and $b$ |
| $Q$ | $Q = (x_q, y_q)$ is the query data point |
| $id_{\min}^{(j)}$ | the index of the $j$th nearest neighbor to $Q$ |
| $P_{id_{\min}^{(j)}}$ | $P_{id_{\min}^{(j)}} = (x_{id_{\min}^{(j)}}, y_{id_{\min}^{(j)}})$ denotes the $j$th nearest neighbor to $Q$ |
| $VRV(P_{id_{\min}^{(j)}})$ | the set of Voronoi-relevant vectors of $P_{id_{\min}^{(j)}}$ |
| $H(\cdot)$ | a cryptographic hash function |
| $Sig(\cdot)$ | a DSA signature |
| $N$ | a large integer that is the product of two prime numbers $p$ and $q$ |
| $\mathbb{Z}_{N^2}^{*}$ | the multiplicative group of residue classes modulo $N^2$ |
| $\mathbb{Z}_N$ | the residue class ring modulo $N$ |
| $x \Vert y$ | the concatenation of two numbers $x$ and $y$ |
| $\lfloor x \rfloor$ | the greatest integer no larger than $x$ |
| $[n]$ | the set $\{0, \ldots, n-1\}$ |
| $[m]$ | the set $\{0, \ldots, m-1\}$ |
| $\mathrm{negl}(\cdot)$ | a negligible function of some input parameter |
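To make the notation concrete, the following is a minimal sketch of textbook Paillier encryption using the symbols $N$, $LCM(a, b)$, and arithmetic modulo $N^2$ from Table 2. It is an illustration under stated assumptions, not the scheme's implementation: the generator choice $g = N + 1$ is a common simplification assumed here, the primes are toy-sized, and the function names (`keygen`, `encrypt`, `decrypt`, `add`, `scalar_mul`) are hypothetical.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy key generation from two primes p and q (far too small for real use)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)   # LCM(p - 1, q - 1)
    mu = pow(lam, -1, n)           # inverse of lam mod n; valid for the assumed g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m in Z_N as g^m * r^N mod N^2 with the assumed generator g = N + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, sk, c: int) -> int:
    """Recover m = L(c^lam mod N^2) * mu mod N, where L(u) = (u - 1) / N."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def scalar_mul(n: int, c: int, k: int) -> int:
    """Homomorphic scalar multiplication: raising a ciphertext to k multiplies the plaintext by k."""
    return pow(c, k, n * n)


# Small demonstration with toy primes (illustrative values only).
n, sk = keygen(293, 433)
total = add(n, encrypt(n, 15), encrypt(n, 27))
print(decrypt(n, sk, total))
```

The additive and scalar homomorphisms shown by `add` and `scalar_mul` are the standard Paillier properties that let a server operate on ciphertexts without the secret key.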
Table 3. Computational cost of each algorithm.
| Algorithm | $CS_1$ | $CS_2$ |
|---|---|---|
| Algorithm 1 | $O((k\lambda + knw)\tau + (m^2 + kn)\log N)$ Muls + $O(m^2 + kn)$ Encs | $O(knw)$ Decs + 2 Divs + $O(k\lambda)$ Muls + $O(k)$ Encs |
| Algorithm 2 | $O(\tau + \log N)$ Muls | 4 Decs + 2 Divs + 2 Encs |
| Algorithm 3 | $O(m^2 \log N)$ Muls + $O(m^2)$ Encs | 2 Encs + 2 Decs |
| Algorithm 4 | $O(\lambda\sigma + \log N)$ Muls + 3 Encs | 3 Decs + 5 Encs + $2\lambda$ Muls |
| Algorithm 5 | $O(nw\tau + n\log N)$ Muls + $O(n)$ Encs | $O(nw)$ Decs + 2 Encs |
Table 4. Communication cost of each algorithm (unit: bits).
| Algorithm | $CS_1 \to CS_2$ | $CS_2 \to CS_1$ | $CS_1 \to QU$ | $CS_2 \to QU$ |
|---|---|---|---|---|
| Algorithm 1 | $O((m^2 + kn)\log N + \log m)$ | $O((m^2 + k\lambda + kn)\log N)$ | $O(k\tau)$ | $O(k \log N)$ |
| Algorithm 2 | $O(\log N)$ | $O(\log N)$ | | |
| Algorithm 3 | $O(m^2 \log N + \log m)$ | $O(m^2 \log N)$ | | |
| Algorithm 4 | $O(\log N)$ | $O(\lambda \log N)$ | | |
| Algorithm 5 | $O(nw \log N + n \log N)$ | $O(n \log N)$ | | |
Table 5. Comparison of computational costs between our protocol and Cui et al.’s protocol.
| Entity | Stage | Cui et al.’s Protocol [8] | Our Protocol |
|---|---|---|---|
| DO | Setup | | $O(\log p)$ Muls |
| DO | DSEnc | $(2m^2 + 4n + 2)$ Encs + $O(n)$ Hashs | $(m^2 + 3n + 2)$ Encs + $n$ Sigs |
| $CS_1$ | Search | $O((m^2 + tm + k(\lambda + n))\tau + (tm + kn)\log N)$ Muls + $O(m^2 + kn)$ Encs | $O((k\lambda + knw)\tau + (m^2 + kn)\log N)$ Muls + $O(m^2 + kn)$ Encs |
| $CS_2$ | Search | 2 Divs + $O(k)$ Encs + $O(knw)$ Decs | 2 Divs + $O(k)$ Encs + $O(knw)$ Decs |
| QU | QUEnc | 6 Encs | 2 Encs |
| QU | Verify and ResDec | $O(2k)$ Decs + $O(k^2)$ Hashs | $O(k)$ Hashs + $O(k)$ Vers |
Table 6. Comparison of communication costs between our protocol and Cui et al.’s protocol (unit: bits).
| Entities | Stage | Cui et al.’s Protocol [8] | Our Protocol |
|---|---|---|---|
| $CA \to DO$ | | $O(\log N)$ | |
| $CA \to CS_1$ | | $O(\log N)$ | |
| $CA \to CS_2$ | | $O(\log N)$ | |
| $CA \to QU$ | | $O(\log N)$ | |
| $DO \to CS_1$ | DSEnc | $O((m^2 + n)\log N)$ | $O((m^2 + n)\log N)$ |
| $DO \to CS_2$ | DSEnc | | $O(\log N)$ |
| $QU \to CS_1$ | QUEnc | $O(\log N)$ | $O(\log N)$ |
| $CS_1 \to CS_2$ | Search | $O((m^2 + kn)\log N + \log T)$ | $O((m^2 + kn)\log N + \log m)$ |
| $CS_2 \to CS_1$ | Search | $O((m^2 + k\lambda + kn)\log N)$ | $O((m^2 + k\lambda + kn)\log N)$ |
| $CS_1 \to QU$ | Search | $O(k \log N)$ | $O(k\tau)$ |
| $CS_2 \to QU$ | Search | | $O(k \log N)$ |
Table 7. Time cost comparison on synthesized datasets with different sizes of 1000, 5000, 10,000, and 20,000 (unit: seconds).
| Stage | Cui et al. [8], n = 1000 | Cui et al. [8], n = 5000 | Cui et al. [8], n = 10,000 | Cui et al. [8], n = 20,000 | Ours, n = 1000 | Ours, n = 5000 | Ours, n = 10,000 | Ours, n = 20,000 |
|---|---|---|---|---|---|---|---|---|
| Setup | 1.448 | 1.181 | 1.211 | 1.156 | 0.312 | 0.544 | 0.279 | 0.497 |
| DSEnc | 97.933 | 723.832 | 1458.838 | 3658.052 | 47.716 | 337.935 | 727.206 | 2046.939 |
| QUEnc | 0.058 | 0.057 | 0.056 | 0.059 | 0.022 | 0.022 | 0.023 | 0.023 |
| Search | 58.655 | 303.406 | 598.363 | 1388.696 | 8.287 | 16.914 | 24.610 | 47.330 |
| Verify and ResDec | 0.160 | 0.534 | 0.602 | 0.722 | 0.005 | 0.005 | 0.006 | 0.008 |
| Total | 158.254 | 1029.010 | 2059.070 | 5048.685 | 56.342 | 355.420 | 752.124 | 2094.797 |
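As a quick reading aid for Table 7, the short sketch below computes the Search-stage speedup of our protocol over Cui et al.'s protocol [8] at each dataset size. The timings are copied from the Search row of the table; the variable and function names are illustrative only.

```python
# Search-stage times in seconds, copied from the Search row of Table 7.
sizes = [1000, 5000, 10000, 20000]
cui_search = [58.655, 303.406, 598.363, 1388.696]
our_search = [8.287, 16.914, 24.610, 47.330]


def speedups(baseline, ours):
    """Return the element-wise ratio baseline / ours."""
    return [b / o for b, o in zip(baseline, ours)]


ratios = speedups(cui_search, our_search)
for n, r in zip(sizes, ratios):
    print(f"n = {n:>6}: Search speedup = {r:.1f}x")
```

The ratio grows with the dataset size, from roughly 7x at n = 1000 to roughly 29x at n = 20,000.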
Table 8. Time cost comparison with different grid granularities on a synthesized dataset of size 2000 (unit: seconds).
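| Stage | Cui et al. [8], m = 4 | Cui et al. [8], m = 8 | Cui et al. [8], m = 12 | Cui et al. [8], m = 16 | Cui et al. [8], m = 32 | Cui et al. [8], m = 64 | Ours, m = 4 | Ours, m = 8 | Ours, m = 12 | Ours, m = 16 | Ours, m = 32 | Ours, m = 64 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Setup | 1.009 | 1.188 | 1.462 | 1.692 | 1.914 | 1.687 | 0.291 | 0.285 | 0.240 | 0.207 | 0.463 | 0.871 |
| DSEnc | 181.751 | 182.137 | 184.899 | 172.719 | 218.232 | 225.192 | 86.298 | 85.202 | 90.337 | 92.901 | 103.423 | 115.173 |
| QUEnc | 0.057 | 0.058 | 0.056 | 0.056 | 0.058 | 0.055 | 0.023 | 0.022 | 0.023 | 0.023 | 0.023 | 0.024 |
| Search | 114.947 | 102.155 | 110.785 | 120.766 | 133.233 | 137.680 | 11.147 | 10.255 | 10.811 | 11.175 | 11.708 | 12.942 |
| Verify and ResDec | 0.243 | 0.221 | 0.230 | 0.205 | 0.277 | 0.251 | 0.004 | 0.004 | 0.005 | 0.004 | 0.005 | 0.005 |
| Total | 298.007 | 285.759 | 297.432 | 309.036 | 353.714 | 364.865 | 97.763 | 95.768 | 101.456 | 104.420 | 115.622 | 129.015 |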
Table 9. Time cost comparison with different query parameters k on a synthesized dataset of size 2000 (unit: seconds).
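| Stage | Cui et al. [8], k = 1 | Cui et al. [8], k = 3 | Cui et al. [8], k = 5 | Cui et al. [8], k = 7 | Cui et al. [8], k = 9 | Ours, k = 1 | Ours, k = 3 | Ours, k = 5 | Ours, k = 7 | Ours, k = 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Setup | 1.317 | 1.402 | 1.692 | 1.566 | 1.477 | 0.286 | 0.256 | 0.317 | 0.279 | 0.263 |
| DSEnc | 180.479 | 185.852 | 186.317 | 178.708 | 183.617 | 89.623 | 88.044 | 92.901 | 91.075 | 92.367 |
| QUEnc | 0.057 | 0.058 | 0.056 | 0.057 | 0.057 | 0.023 | 0.022 | 0.023 | 0.022 | 0.023 |
| Search | 37.263 | 70.774 | 120.766 | 201.474 | 261.221 | 2.827 | 6.855 | 11.175 | 16.411 | 21.933 |
| Verify and ResDec | 0.031 | 0.094 | 0.205 | 0.258 | 0.322 | 0.002 | 0.003 | 0.004 | 0.008 | 0.013 |
| Total | 219.147 | 258.180 | 309.036 | 382.063 | 446.694 | 92.761 | 95.180 | 104.420 | 107.795 | 114.599 |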
Table 10. Time cost comparison with different key sizes on a synthesized dataset of size 2000 (unit: seconds).
| Stage | Cui et al. [8], K = 512 | Cui et al. [8], K = 1024 | Cui et al. [8], K = 2048 | Ours, K = 512 | Ours, K = 1024 | Ours, K = 2048 |
|---|---|---|---|---|---|---|
| Setup | 0.349 | 1.192 | 6.582 | 0.192 | 0.317 | 0.535 |
| DSEnc | 36.517 | 172.719 | 867.321 | 20.978 | 92.901 | 494.262 |
| QUEnc | 0.009 | 0.056 | 0.267 | 0.004 | 0.023 | 0.154 |
| Search | 24.160 | 120.766 | 650.312 | 2.092 | 11.175 | 53.270 |
| Verify and ResDec | 0.055 | 0.205 | 1.067 | 0.005 | 0.004 | 0.006 |
| Total | 61.090 | 309.036 | 1525.549 | 23.271 | 104.420 | 548.227 |
Li, J.; Song, Y.; Tian, C.; Tian, W. PVkNN: A Publicly Verifiable and Privacy-Preserving Exact kNN Query Scheme for Cloud-Based Location Services. Modelling 2025, 6, 44. https://doi.org/10.3390/modelling6020044