Article

Secure Streaming Data Encryption and Query Scheme with Electric Vehicle Key Management

1
School of Software, Northeastern University, Shenyang 110819, China
2
State Grid Smart Internet of Vehicles Technology Co., Ltd., Beijing 100032, China
3
Electric Power Research Institute, State Grid Hubei Electric Power Company, Wuhan 430013, China
*
Author to whom correspondence should be addressed.
Information 2026, 17(1), 18; https://doi.org/10.3390/info17010018
Submission received: 8 November 2025 / Revised: 5 December 2025 / Accepted: 16 December 2025 / Published: 25 December 2025
(This article belongs to the Special Issue Privacy-Preserving Data Analytics and Secure Computation)

Abstract

The rapid proliferation of Electric Vehicle (EV) infrastructures has led to the massive generation of high-frequency streaming data uploaded to cloud platforms for real-time analysis. While such data support intelligent energy management and behavioral analytics, they also encapsulate sensitive user information, the disclosure or misuse of which can lead to significant privacy and security threats. This work addresses these challenges by developing a secure and scalable scheme for protecting and verifying streaming data during storage and collaborative analysis. The proposed scheme ensures end-to-end confidentiality, forward security, and integrity verification while supporting efficient encrypted aggregation and fine-grained, time-based authorization. It introduces a lightweight mechanism that hierarchically organizes cryptographic keys and ciphertexts over time, enabling privacy-preserving queries without decrypting individual data points. Building on this foundation, an electric vehicle key management and query system is further designed to integrate the proposed encryption and verification scheme into practical V2X environments. The system supports privacy-preserving data sharing, verifiable statistical analytics, and flexible access control across heterogeneous cloud and edge infrastructures. Analytical and experimental evidence shows that the designed system attains rigorous security guarantees alongside high efficiency and scalability, making it well suited to large-scale electric vehicle data protection and analysis tasks.


1. Introduction

Driven by the digitalization of transportation and the growing penetration of Electric Vehicles (EVs), energy networks are becoming increasingly dependent on real-time vehicular data. The nationwide rollout of charging facilities, expected to exceed 10.6 million units by mid-2024, provides the foundation for intelligent operations such as V2G coordination, demand-response scheduling, and renewable energy integration [1]. Unlike conventional periodic data collection, modern EV systems continuously generate and transmit high-frequency charging data at second-level or even millisecond-level granularity, forming a typical real-time streaming data flow. These data streams record multi-dimensional information, such as charging location, duration, power usage, and driving behavior, across various temporal and spatial domains. Consequently, they contain highly sensitive personal and operational information, including user mobility traces, energy consumption patterns, and behavioral profiles. To support large-scale analytics, model training, and personalized service optimization, such streaming data are often outsourced to cloud or edge servers for long-term storage and computation.

1.1. Motivation

Despite their analytical value, the outsourcing and cross-platform sharing of high-frequency streaming data introduce severe privacy and security risks [2]. Unauthorized access or data correlation may enable adversaries to infer user identities, travel routes, and behavioral habits, leading to substantial privacy leakage. Therefore, it is essential to perform encryption before outsourcing and to enforce access control with fine granularity, traceability, and revocability, ensuring secure and auditable collaboration across multiple stakeholders.
However, encrypting continuous and high-volume streaming data introduces several fundamental challenges. Each data point generated in real time, in principle, requires a distinct cryptographic key, so the number of keys grows with the length of the stream and imposes significant management overhead. When data consumers are authorized to access information within specific time intervals, conventional key assignment methods often necessitate distributing a large number of keys simultaneously, thereby reducing scalability and increasing computational cost. In addition, supporting real-time analytics or aggregation over encrypted streams demands not only data confidentiality but also verifiable integrity and query efficiency, which traditional encryption mechanisms are unable to achieve concurrently. The concept of deriving per-block encryption keys from a unified master key using PRFs [3,4] has been widely examined as a means to alleviate the challenge of large-scale key storage. Although this technique mitigates part of the management complexity, it remains insufficient for practical scenarios that require both fine-grained access control and efficient, continuous encryption of large-scale streaming data.
To overcome these challenges, we propose a comprehensive secure streaming encryption and query scheme tailored for EV data environments. It allows the data owner to control access granularity flexibly, ensuring that different consumers can obtain only the necessary information within their authorized time windows. The proposed scheme thus provides a scalable, lightweight, and privacy-preserving solution for secure data outsourcing and real-time analytics in cross-platform EV charging and V2G scenarios.

1.2. Contributions

Motivated by the need for scalable, efficient, and privacy-preserving protection of continuous EV data streams, this work contributes four core advancements. It is an extended version of our conference paper presented at DSPP [5], which has been substantially enhanced by incorporating a formal security proof, complete key management, and a comprehensive performance evaluation.
  • We construct a scalable hierarchical key management framework by integrating a Key Derivation Tree (KDT) with a Regression Key Chain (RKR), thereby ensuring forward security across temporal data segments. All encryption and authentication keys are derived deterministically from a compact root key, solving the key explosion problem and reducing storage and transmission overhead.
  • To secure continuous data streams, we introduce a symmetric additive homomorphic encryption (SAHE) scheme that permits efficient aggregation operations on ciphertexts, providing IND-CPA confidentiality based on the pseudorandom function (PRF) assumption.
  • We propose a HomMAC-based integrity scheme that supports algebraic verification for per-chunk and aggregated ciphertexts, ensuring correctness throughout encrypted streams with very low verification cost.
  • We implement a Ciphertext Index Tree (CIT) to support encrypted statistical queries and scalable aggregation. The proposed system integrates KDT, RKR, SAHE, and HomMAC, achieving low latency and strong scalability in both server and edge environments compared with EC-ElGamal and Paillier baselines.
The remainder of this paper is organized as follows. Section 2 reviews related work on encrypted stream processing and key management. Section 3 introduces the necessary preliminaries, including the symmetric additive homomorphic encryption (SAHE), pseudorandom functions, and AEAD. Section 4 presents the overall system framework and formal definition of the secure streaming data encryption and query scheme (SSEDQ). Section 5 details the construction and security analysis of SSEDQ. Section 6 describes the electric vehicle key management and query system, including the Ciphertext Index Tree (CIT) and privacy-preserving access policies. Section 7 reports the experimental results and performance analysis on both server and edge platforms. Finally, Section 8 concludes the paper and discusses directions for future work.

2. Related Work

2.1. Encrypted Stream Query Processing

Encryption has become an essential mechanism for safeguarding the confidentiality of streaming data. In recent years, researchers have extensively investigated how to support efficient query processing over encrypted streams without compromising data utility. Liu et al. [6] presented a secure MPC-based protocol that enables DTW queries on encrypted time series. However, the inherent computational complexity of DTW increases with stream length, limiting the scalability of their approach. To improve performance, Liu et al. [7] addressed this performance-security tension by designing PP-Stream, a distributed stream processing system for high-performance privacy-preserving neural network inference.
To further enhance privacy and efficiency, Fang et al. [8] designed a lightweight privacy-preserving data sharing scheme using secret sharing and aggregation, effectively mitigating plaintext leakage during transmission. Wang et al. [9] integrated AES-GCM encryption with trusted execution environments (TEEs) to construct secure Hoeffding tree models for encrypted training and inference. Similarly, Zheng et al. [10] proposed a method supporting similarity search over encrypted temporal streams based on Time-Warp Edit Distance (TWED) and KD-tree indexing under additive encryption. Wang [11] designed GraphGuard, the earliest system that realizes privacy-preserving outsourced detection of temporal patterns in streaming graphs, thus advancing secure analytics toward more dynamic data contexts.
Beyond deterministic encryption, several studies have explored querying over uncertain or incomplete encrypted streams. Chen et al. [12] introduced a probabilistic stream index for managing uncertain data, while Miao et al. [13] investigated privacy-preserving multi-pattern subset subgraph matching. Bai et al. [14] developed skyline query techniques for incomplete streams using range-based indexing and pruning optimizations. In addition, Guan et al. [15] achieved secure maximum aggregation over encrypted streams via Paillier homomorphic encryption combined with block-structured computation. Ouyang [16] advanced privacy-preserving temporal analytics through an efficient and oblivious framework. More recently, Safaee et al. [17] presented StreamFilter, a distributed range query system incorporating role-based access control, although it does not directly conceal the underlying data values.
Despite this rich body of work on encrypted stream processing, most existing schemes were not designed for high-frequency, long-term EV telemetry. MPC-based and homomorphic encryption protocols [6,13,15] often incur substantial computational and communication overhead, which limits their practicality for second-level or millisecond-level data streams. TEE-based approaches [9] reduce cryptographic cost but shift trust to additional hardware and usually do not offer cryptographic forward security. Furthermore, many existing methods focus on similarity search or pattern detection over fixed windows [10,11,16] and do not provide a unified treatment of continuous streaming encryption, ciphertext-domain aggregation, and verifiable integrity for large-scale statistical analytics. In particular, prior systems typically assume static keys or coarse-grained access policies and lack hierarchical, time-based key derivation that can support flexible authorization over arbitrary time intervals in EV scenarios.

2.2. Key Management and Access Control

Beyond encrypted query processing, effective key management and access control are equally essential to ensure secure data outsourcing. Vimercati et al. [18] introduced the concept of Over-encryption, which enforces attribute-based policies by mapping access privileges to hierarchical key trees using symmetric cryptography. Qi et al. [19] proposed CryptDAC, a revocable encryption scheme that maintains file–key associations for dynamic policy enforcement; however, it suffers from high revocation overhead and complex key hierarchy maintenance. Tong et al. [20] developed a secure k-nearest neighbor (k-NN) query scheme supporting multi-user delegation and verifiable result retrieval.
To achieve finer-grained access control, several studies have adopted attribute-based encryption (ABE) mechanisms [21,22,23,24]. While these schemes enable expressive policy enforcement, they typically require pre-distributed attributes and are not optimized for dynamic, time-dependent query scenarios. Hua et al. [25] proposed a hybrid proxy re-encryption model that transforms ABE ciphertexts into identity-based encryption (IBE) format, thereby protecting lower-layer decryption details from the proxy. Building upon this, Xiong et al. [26] and Ge et al. [27] presented revocable proxy-assisted encryption schemes that allow key revocation without regenerating user credentials. Hierarchical key generation approaches based on data structures [28,29,30] have also been investigated to manage large-scale key hierarchies efficiently. However, such methods often introduce substantial overhead in metadata labeling and key revocation, making them less practical for large, continuously evolving datasets.
Existing key management and access-control frameworks are mostly tailored to file-oriented or attribute-oriented data outsourcing, where data items are relatively static and access policies change infrequently. ABE- and proxy re-encryption-based schemes [21,22,23,24,25,26,27] provide expressive policies but suffer from high computation and communication overhead, and revocation typically requires updating many ciphertexts or keys. Tree-based key hierarchies [28,29,30] improve scalability but do not natively address forward security or aggregated queries over continuous data streams. Very few works consider streaming settings in which keys must evolve while supporting both raw and aggregate access for different consumers with verifiable correctness.
In contrast, our design combines a GGM-style Key Derivation Tree and hash-based key regression with symmetric additive homomorphic encryption and HomMAC, providing fine-grained, time-based authorization, forward-secure key evolution, and aggregation tailored to EV streaming data.

3. Preliminaries

3.1. Efficient Symmetric Encryption with Additive Homomorphism

The SAHE scheme is a lightweight symmetric encryption method that supports additive homomorphism. SAHE is well-suited for environments with limited computational and energy resources, such as wireless sensor networks. We operate all additive homomorphic encryption directly in the prime field $\mathbb{Z}_p$. For plaintexts $m_1, m_2 \in \mathbb{Z}_p$ and ciphertexts $c_1 = \mathrm{Enc}(m_1)$ and $c_2 = \mathrm{Enc}(m_2)$, additive homomorphism holds:
$$c_1 + c_2 = (m_1 + k_1 + m_2 + k_2) \bmod p.$$
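As a minimal sketch of this additive property, assuming the masking form Enc(m) = (m + k) mod p implied by the equation above, the following Python snippet checks that the sum of two ciphertexts decrypts under the sum of the two keys. The modulus `P` and the helper names `enc` and `dec` are illustrative choices, not part of the scheme's specification.

```python
import secrets

# Illustrative prime modulus (the Mersenne prime 2^61 - 1); an assumption,
# the scheme only requires some prime p.
P = 2**61 - 1


def enc(m: int, k: int) -> int:
    """Mask plaintext m with key k in Z_p."""
    return (m + k) % P


def dec(c: int, k: int) -> int:
    """Remove the mask k from ciphertext c in Z_p."""
    return (c - k) % P


m1, m2 = 1200, 345
k1, k2 = secrets.randbelow(P), secrets.randbelow(P)
c1, c2 = enc(m1, k1), enc(m2, k2)

# Adding ciphertexts adds both the plaintexts and the keys.
c_sum = (c1 + c2) % P
assert dec(c_sum, (k1 + k2) % P) == (m1 + m2) % P
```

The party that aggregates ciphertexts never sees `k1` or `k2`; only a holder of the combined key can unmask the sum.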

3.2. Pseudorandom Functions and Key Derivation

We model key material using a secure pseudorandom function (PRF) $F : \{0,1\}^{\lambda} \times \{0,1\}^{*} \to \mathbb{Z}_p$, indistinguishable from a truly random function by any PPT adversary. Following the GGM paradigm, a seed at the tree root expands into per-leaf keys through a binary tree of PRG/PRF evaluations. For domain separation and context binding, we use HKDF to derive independent subkeys from a pseudorandom input $K$:
$$\mathrm{HKDF\text{-}Expand}(K, \mathsf{info}) \to K_{\mathsf{info}},$$
where $\mathsf{info}$ is a public label (e.g., $\mathsf{info}_{\mathrm{AEAD}}$, $\mathsf{info}_{\mathrm{enc}}$, $\mathsf{info}_{\mathrm{mac}}$). Different labels yield computationally independent outputs even under key reuse in the extract phase. In our construction, (i) per-chunk keys $k_i$ are derived from the KDT leaf associated with the time window $\mathit{tw}_i$; (ii) $\mathit{pdk}_i = \mathrm{HKDF}(k_i, \mathsf{info}_{\mathrm{AEAD}})$ is used for AEAD; (iii) per-point encryption and MAC subkeys $(\mathit{sk}_{i,j}, \mathit{mack}_{i,j})$ are derived via HKDF keyed by a one-way hash chain state, ensuring time isolation and forward security.
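To illustrate the domain separation described above, the sketch below implements HKDF-Expand as specified in RFC 5869 with HMAC-SHA256 and derives three subkeys from one pseudorandom input. The label byte strings and the all-zero example key are placeholders chosen for illustration; the concrete labels used by the scheme are not specified here.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with HMAC-SHA256."""
    if not 0 < length <= 255 * HASH_LEN:
        raise ValueError("invalid output length")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) || info || n)
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


k = bytes(32)  # placeholder pseudorandom input K (assumption)
pdk = hkdf_expand(k, b"info-AEAD")   # hypothetical label
sk = hkdf_expand(k, b"info-enc")     # hypothetical label
mack = hkdf_expand(k, b"info-mac")   # hypothetical label

# Distinct labels give distinct subkeys from the same input.
assert len({pdk, sk, mack}) == 3
```

Because the label enters every HMAC invocation, a key derived for one purpose cannot be confused with a key derived for another, even though all three share the same input.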

3.3. AEAD with Associated Data and Nonce Discipline

We employ an authenticated encryption with associated data (AEAD) primitive, instantiated as AES-GCM, to protect each chunk $\mathit{chunk}_i$. AEAD takes as input a key $\mathit{pdk}_i$, a plaintext $M_i$, a nonce $\mathsf{nonce}_i$, and associated data $\mathsf{aad}_i$, and outputs $(C_i, \mathsf{tag}_i)$:
$$(C_i, \mathsf{tag}_i) \leftarrow \mathrm{AEAD.Enc}_{\mathit{pdk}_i}(M_i;\, \mathsf{nonce}_i, \mathsf{aad}_i), \qquad M_i \leftarrow \mathrm{AEAD.Dec}_{\mathit{pdk}_i}(C_i, \mathsf{tag}_i;\, \mathsf{nonce}_i, \mathsf{aad}_i).$$
Security requires (i) nonce uniqueness for each $(\mathit{pdk}_i, \mathsf{nonce}_i)$ pair and (ii) binding of metadata through $\mathsf{aad}_i$. AEAD provides indistinguishability under chosen-plaintext attacks and ciphertext integrity; the $\mathsf{aad}_i$ binding prevents cross-context replay or splicing across windows or devices. This layer complements SAHE-based linear aggregation by protecting the raw chunk payload and headers end-to-end.

4. System Framework

Figure 1 depicts the overall framework of our secure and privacy-preserving data collaboration model for vehicle-to-grid (V2G) applications. The proposed secure streaming data encryption and query scheme consists of four primary entities: the Data Owner (DO), the Data Server (DS), the Data Consumer (DC), and the Authorization Manager (AM). Each entity is assigned distinct cryptographic or control-plane responsibilities, collectively ensuring secure data generation, encrypted storage, privacy-preserving query processing, and fine-grained access control across the system.
  • Data Owner (DO): The Data Owner is the entity responsible for generating and controlling the raw electric vehicle (EV) telemetry and charging stream data. The DO continuously collects high-frequency measurements and segments them into fixed-size time windows, transforming the data stream into temporally ordered chunks for encryption. It initializes the master secret and derives all subordinate encryption and authentication keys through a hierarchical Key Derivation Tree (KDT) combined with a one-way hash regression chain, thereby achieving forward secrecy and temporal key isolation. For each data chunk, the DO performs authenticated encryption and produces corresponding homomorphic authentication tags before uploading the resulting ciphertext packages to the Data Server for outsourced storage. The DO retains exclusive control of the root key and serves as the sole authority for key delegation. Depending on access policies, it selectively grants fine-grained decryption capabilities by releasing the minimal subset of KDT nodes for raw data access or provides only boundary regression keys for aggregated analytics, ensuring privacy-preserving and policy-compliant data sharing.
  • Data Server (DS): The Data Server functions as an untrusted or semi-trusted storage and computation provider. It is responsible for storing uploaded data and maintaining a dynamic Ciphertext Index Tree (CIT) that hierarchically organizes encrypted statistical vectors in chronological order. When new encrypted chunks arrive, the DS updates the CIT and performs ciphertext-domain aggregation at higher levels to maintain compactness and support efficient query processing. Upon receiving a query request over a specific time interval from a Data Consumer, the DS locates the minimal set of CIT nodes covering the requested range and computes the aggregated ciphertext result, which is then returned to the consumer. Throughout this process, the DS neither possesses any decryption keys nor learns any plaintext information; it operates under the honest-but-curious assumption and serves purely as a custodial and computational node enabling scalable, privacy-preserving data outsourcing.
  • Data Consumer (DC): The Data Consumer is an authorized entity, such as a grid operator, V2G aggregator, or analytics provider, that queries encrypted EV data for analysis and decision-making. Depending on the access rights granted by the Data Owner (DO), the DC may receive either the Key Derivation Tree (KDT) node subset $N_{[t_s, t_e]}$ for fine-grained raw data decryption or the boundary regression keys $(\mathit{sk}_{start}, \mathit{sk}_{end})$ for aggregated analytics. With these delegated keys, the DC can decrypt the corresponding ciphertexts and verify their integrity using the homomorphic MAC mechanism. After successful decryption and verification, the DC reconstructs statistical indicators such as mean, variance, and standard deviation, enabling privacy-preserving analytics and decision support without revealing any individual EV records.
  • Authorization Manager (AM): The Authorization Manager neither generates nor processes any secret keys or ciphertexts. Instead, the AM operates at the control layer to enforce data-sharing policies and coordinate key-release procedures between the DO and the DC. When a DC requests access to a specific time range $[t_s, t_e]$, the AM verifies the request against predefined access policies and triggers the corresponding key-sharing algorithm on behalf of the DO, either $\mathsf{RawShare}([t_s, t_e])$ for fine-grained access or $\mathsf{AggShare}([t_s, t_e])$ for aggregated access. The resulting outputs $N_{[t_s, t_e]}$ or $(\mathit{sk}_{start}, \mathit{sk}_{end})$ are generated by the DO and securely transmitted to the DC via the AM. The AM also records each authorization event for accountability, but cannot compute or derive any cryptographic materials itself.
Definition 1.
A secure streaming data encryption and query scheme SSEDQ consists of a tuple of seven algorithms $(\mathsf{Setup}, \mathsf{KeyGen}, \mathsf{Encrypt}, \mathsf{Decrypt}, \mathsf{Verify}, \mathsf{RawShare}, \mathsf{AggShare})$, defined as follows:
$\mathsf{Setup}(\Delta t, M) \to (\mathit{tw}_i, \mathit{chunk}_i)$. The DO executes Setup. It takes as input the fixed time window $\Delta t$ and message space bound $M$, continuously collects the streaming data, and partitions it into temporally ordered chunks $\{\mathit{chunk}_i\}$ according to $\Delta t$.
$\mathsf{KeyGen}(k_{\mathrm{root}}, i, \lambda) \to (k_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j)$. The DO executes KeyGen. Given the master secret key $k_{\mathrm{root}}$, the chunk index $i$, and the security parameter $\lambda$, the algorithm derives a unique per-chunk key $k_i$, an associated AEAD encryption key $\mathit{pdk}_i$, and two corresponding sequences of subkeys $\{\mathit{sk}_{i,j}\}_j$ and $\{\mathit{mack}_{i,j}\}_j$, which are used for encryption and message authentication, respectively.
$\mathsf{Encrypt}(\mathit{chunk}_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j) \to \mathit{cc}_i$. The DO executes Encrypt. It takes as input a data chunk $\mathit{chunk}_i$, the AEAD encryption key $\mathit{pdk}_i$, and the sets of encryption and authentication subkeys $\{\mathit{sk}_{i,j}\}_j$ and $\{\mathit{mack}_{i,j}\}_j$, and outputs the ciphertext package $\mathit{cc}_i$ that includes both the encrypted data and its homomorphic authentication tags.
$\mathsf{Decrypt}(\mathit{svc}_i, \{\mathit{sk}_{i,j}\}_j) \to (\{v_{i,j}\}_j, v_{agg})$. The DC executes Decrypt. It takes as input the ciphertext vector $\mathit{svc}_i$ and the corresponding decryption subkeys $\{\mathit{sk}_{i,j}\}_j$, and outputs either the recovered plaintext values $\{v_{i,j}\}$ or the aggregated plaintext $v_{agg}$ over a specified interval.
$\mathsf{Verify}(c, \sigma, \mathit{mack}_{start}, \mathit{mack}_{end}) \to \{0, 1\}$. The DC executes Verify. It takes as input a ciphertext–tag pair $(c, \sigma)$ and the corresponding authentication keys $(\mathit{mack}_{start}, \mathit{mack}_{end})$, and outputs 1 if the homomorphic MAC verification equation holds or 0 otherwise.
$\mathsf{RawShare}([t_s, t_e]) \to N_{[t_s, t_e]}$. RawShare is executed by the DO or coordinated by the AM. It takes as input a query interval $[t_s, t_e]$ and outputs the minimal subset of KDT nodes $N_{[t_s, t_e]}$ whose subtrees cover all chunk keys within this interval, enabling authorized reconstruction of per-chunk decryption keys.
$\mathsf{AggShare}([t_s, t_e]) \to (\mathit{sk}_{start}, \mathit{sk}_{end})$. AggShare is executed by the DO or coordinated by the AM. It takes as input a query interval $[t_s, t_e]$ and outputs the boundary regression keys $(\mathit{sk}_{start}, \mathit{sk}_{end})$ derived from the one-way hash chain, which allow aggregate decryption without exposing any individual plaintext records.

Correctness and Security Definitions

Definition 2
(Correctness). For any data point $v_{i,j}$ with corresponding ciphertext $c_{i,j}$ and MAC keys $\mathit{mack}_{i,j}, \mathit{mack}_{i,j+1}$ derived from the key generation process, the secure streaming data encryption and query scheme SSEDQ satisfies correctness if
$$\Pr\left[\begin{array}{l}
(\mathit{tw}_i, \mathit{chunk}_i) \leftarrow \mathsf{Setup}(\Delta t, M),\\
(k_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j) \leftarrow \mathsf{KeyGen}(k_{\mathrm{root}}, i, \lambda),\\
\mathit{cc}_i = (\mathit{svc}_i, \sigma_i, \mathrm{AEAD}_i) \leftarrow \mathsf{Encrypt}(\mathit{chunk}_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j),\\
(\{v'_{i,j}\}_j, v_{agg}) \leftarrow \mathsf{Decrypt}(\mathit{svc}_i, \{\mathit{sk}_{i,j}\}_j),\\
b \leftarrow \mathsf{Verify}(c_{i,j}, \sigma_{i,j}, \mathit{mack}_{i,j}, \mathit{mack}_{i,j+1})
\end{array}\right] = 1,$$
where $v'_{i,j} = v_{i,j}$, $v_{agg} = \sum_j v_{i,j} \bmod p$, and $b = 1$ indicates successful verification. The probability is taken over the internal randomness of $\mathsf{Setup}$, $\mathsf{KeyGen}$ and $\mathsf{Encrypt}$ (e.g., AEAD nonces).
Definition 3.
Consider an execution of SSEDQ that produces a sequence of secret keys $(\mathit{sk}_1, \mathit{sk}_2, \ldots)$ and ciphertexts $(c_1, c_2, \ldots)$ for time windows $\mathit{tw}_1, \mathit{tw}_2, \ldots$. We say that SSEDQ achieves forward security if for every PPT adversary $\mathsf{Adv}$ and every time index $t$, the advantage of $\mathsf{Adv}$ in recovering any plaintext from a prior time window after compromising the current key $\mathit{sk}_t$ is negligible in the security parameter $\lambda$:
$$\mathsf{Adv}^{\mathrm{FS}}_{\mathsf{Adv}}(\lambda) = \Pr\left[\mathsf{Adv}(\mathit{sk}_t, \mathbf{c}_t) \to v_i \ \text{for some}\ i < t\right] \le \mathsf{negl}(\lambda),$$
where $\mathbf{c}_t = (c_1, \ldots, c_t)$ denotes all ciphertexts up to time $t$. Even if the adversary learns the current key $\mathit{sk}_t$, all data encrypted under keys $\mathit{sk}_1, \ldots, \mathit{sk}_{t-1}$ must remain semantically secure.
Definition 4.
Given a security parameter $\lambda$, a prime modulus $p$, and pseudorandom function $F$, the additive encryption scheme achieves IND-CPA security if every PPT adversary $\mathcal{A}$ has only negligible distinguishing advantage between encryptions of chosen messages:
$$\mathsf{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathcal{A}}(\lambda) = \left|\Pr[\mathcal{A}\ \text{wins}] - \tfrac{1}{2}\right| \le \mathsf{negl}(\lambda).$$
Definition 5.
In the EUF–CMA game for a homomorphic MAC, an adversary $\mathcal{A}$ may query an oracle that returns honest ciphertext–tag pairs $(c_q, \sigma_q)$ for any requested index or interval $q$. Let $S$ be the set of all such oracle outputs together with all pairs that $\mathcal{A}$ can generate from them using only the public homomorphic operation $(c, \sigma) \oplus (c', \sigma') := (c + c', \sigma + \sigma')$. At the end of the game, $\mathcal{A}$ outputs a tuple $(q^*, c^*, \sigma^*)$. We say that $\mathcal{A}$ forges if $\mathsf{Verify}(c^*, \sigma^*, \mathrm{aux}(q^*)) = 1$ and $(c^*, \sigma^*) \notin S$. The advantage of $\mathcal{A}$ is
$$\mathsf{Adv}^{\mathrm{HomMAC\text{-}EUF\text{-}CMA}}_{\mathcal{A}}(\lambda) = \Pr[\mathcal{A}\ \text{forges}].$$
The HomMAC scheme is EUF–CMA secure if $\mathsf{Adv}^{\mathrm{HomMAC\text{-}EUF\text{-}CMA}}_{\mathcal{A}}(\lambda)$ is negligible in $\lambda$.

5. Secure Streaming Data Encryption and Query Scheme

This section formalizes a streaming encryption and query scheme that enables confidentiality, forward security, and verifiable analytics over high-frequency EV telemetry. We first specify the scheme algorithms, then prove correctness, and finally establish security guarantees under standard PRF-based assumptions.

5.1. A Detailed Construction

$\mathsf{Setup}(\Delta t, M) \to (\mathit{tw}_i, \mathit{chunk}_i)$. Streaming data are generated continuously in a time-sequential manner. For a given period, each data record is associated with a timestamp, and thus can be denoted as a data point $\mathit{dp}_i = (\mathit{ts}_i, v_i)$, where $i$ is the index of the data point in the stream. Here, $\mathit{ts}_i$ denotes the timestamp and $v_i$ represents the value of the data point. Therefore, the stream can be formally defined as an ordered set of data points:
$$\mathit{SD} = \{\mathit{dp}_1, \mathit{dp}_2, \mathit{dp}_3, \ldots, \mathit{dp}_i, \mathit{dp}_{i+1}, \ldots\}.$$
The Data Owner (DO) adopts a fixed time window $\Delta t = t_{i+1} - t_i$. The stream is thus transformed into a sequence of temporally ordered chunks:
$$\mathit{Chunks} = \{\mathit{chunk}_0, \mathit{chunk}_1, \ldots, \mathit{chunk}_k, \ldots\}.$$
Each chunk corresponds to a time window $\mathit{tw}_i = [t_i, t_{i+1})$ and contains all data points within that window:
$$\mathit{chunk}_i = \{\mathit{dp}_{i,j} = (\mathit{ts}_{i,j}, v_{i,j}) \mid \mathit{ts}_{i,j} \in \mathit{tw}_i\}.$$
$\mathsf{KeyGen}(k_{\mathrm{root}}, i, \lambda) \to (k_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j)$. Let $G$ be a pseudorandom generator that expands a seed of length $\lambda$ into two child keys. The pseudorandom generator $G(k)$ produces a $2\lambda$-bit output, which is divided into two equal halves, expressed as $G(k) = G_0(k) \,\|\, G_1(k)$. Here, $G_0(k)$ and $G_1(k)$ represent the left and right child keys derived from the seed $k$, respectively. For any binary sequence $x = x_1 x_2 \cdots x_h \in \{0,1\}^h$, we recursively define $F(k, x) = G_{x_h}(G_{x_{h-1}}(\cdots G_{x_1}(k) \cdots))$. Let $k_{\mathrm{root}}$ denote the root secret at the top of the key derivation tree. For each data chunk indexed by $i$, the Data Owner (DO) encodes $i$ as an $h$-bit binary string $x^{(i)} = x^{(i)}_1 x^{(i)}_2 \cdots x^{(i)}_h$ and derives the corresponding leaf key as:
$$k_i = F(k_{\mathrm{root}}, x^{(i)}) = G_{x^{(i)}_h}(G_{x^{(i)}_{h-1}}(\cdots G_{x^{(i)}_1}(k_{\mathrm{root}}) \cdots)).$$
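The GGM-style derivation above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the length-doubling PRG `G` is stood in for by two domain-separated SHA-256 calls, the root key is a placeholder, and the helper names `walk` and `leaf_key` are hypothetical. The sketch also shows why releasing an internal node key delegates exactly the leaves beneath it.

```python
import hashlib


def G(k: bytes) -> tuple[bytes, bytes]:
    """Stand-in length-doubling PRG (assumption): returns (G_0(k), G_1(k))."""
    return (hashlib.sha256(b"\x00" + k).digest(),
            hashlib.sha256(b"\x01" + k).digest())


def walk(k: bytes, bits: str) -> bytes:
    """Apply G along a bit path, consuming the first bit first (x_1, then x_2, ...)."""
    for b in bits:
        k = G(k)[int(b)]
    return k


def leaf_key(k_root: bytes, i: int, h: int) -> bytes:
    """Derive k_i = F(k_root, x(i)) for chunk index i in a tree of height h."""
    if h < 1 or not 0 <= i < 2 ** h:
        raise ValueError("chunk index out of range for tree height")
    return walk(k_root, format(i, f"0{h}b"))


k_root = bytes(32)  # placeholder root secret (assumption)
h = 4               # tree height, i.e. 16 chunks

# The internal node with prefix "01" covers leaves 4..7 (0100 to 0111).
node_01 = walk(k_root, "01")
assert walk(node_01, "10") == leaf_key(k_root, 6, h)  # 6 = 0110
```

A consumer holding only `node_01` can derive the four leaf keys in its subtree, but cannot move upward or sideways in the tree without inverting the PRG.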
Figure 2 illustrates that the root key $k_{\mathrm{root}}$ functions as the initial seed in the hierarchy, and each leaf node in the tree represents a distinct data chunk generated through successive key derivations. The height of the tree determines the temporal granularity, and each leaf key $k_i$ can be efficiently mapped to its corresponding time window $\mathit{tw}_i = [t_i, t_{i+1})$. For each data chunk, we derive a unique Authenticated Encryption with Associated Data (AEAD) key $\mathit{pdk}_i$ from the corresponding hierarchical key $k_i$ using a key derivation function:
$$\mathit{pdk}_i = \mathrm{HKDF}(k_i, \mathsf{info}_{\mathrm{AEAD}}),$$
where $\mathsf{info}_{\mathrm{AEAD}}$ is a public context string that labels this key as the AEAD encryption key for chunk $i$. To achieve time-continuous authorization and forward security, a regression chain is constructed using a one-way hash function:
$$h_0 = H(\mathit{seed}), \qquad h_i = H(h_{i-1}), \qquad \mathit{sk}_i = \mathrm{HKDF}(h_i, \mathsf{info}_{\mathrm{enc}}).$$
Here, HKDF (HMAC-based Key Derivation Function) is employed to expand each pseudorandom value $h_i$ into multiple cryptographically independent subkeys, ensuring fine-grained control and key separation across time periods. For data-point or feature-level keying within each chunk, we further derive
$$\mathit{sk}_{i,j} = \mathrm{HKDF}(h_i,\; j \,\|\, \mathsf{info}_{\mathrm{enc}}), \qquad \mathit{mack}_{i,j} = \mathrm{HKDF}(h_i,\; j \,\|\, \mathsf{info}_{\mathrm{mac}}).$$
The parameters $\mathsf{info}_{\mathrm{enc}}$ and $\mathsf{info}_{\mathrm{mac}}$ are public, fixed context strings used for domain separation between encryption and authentication keys. They do not contain any secret material but guarantee that the keys derived for different purposes remain cryptographically independent.
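The regression chain can be illustrated with a short sketch, assuming SHA-256 for $H$ and a single HMAC-SHA256 block (which coincides with the first output block of HKDF-Expand) as the subkey derivation. It shows that a holder of a chain state can roll forward to any later state, whereas recovering an earlier state would require inverting $H$. The seed, the label strings, the modulus `P`, and the reduction of the derived bytes into $\mathbb{Z}_p$ are assumptions made for illustration.

```python
import hashlib
import hmac

P = 2**61 - 1  # illustrative prime modulus (assumption)


def H(x: bytes) -> bytes:
    """One-way hash used for the chain (SHA-256 assumed)."""
    return hashlib.sha256(x).digest()


def chain_state(seed: bytes, i: int) -> bytes:
    """Compute h_i, where h_0 = H(seed) and h_i = H(h_{i-1})."""
    h = H(seed)
    for _ in range(i):
        h = H(h)
    return h


def roll_forward(h: bytes, steps: int) -> bytes:
    """Advance a known chain state by `steps` positions."""
    for _ in range(steps):
        h = H(h)
    return h


def subkey(h: bytes, info: bytes) -> int:
    """Derive a Z_p subkey from chain state h under a public label."""
    okm = hmac.new(h, info + b"\x01", hashlib.sha256).digest()
    return int.from_bytes(okm, "big") % P


seed = b"example-seed"  # placeholder
h3 = chain_state(seed, 3)

# Knowing h_3 is enough to reach h_5 ...
assert roll_forward(h3, 2) == chain_state(seed, 5)
# ... and labels keep encryption and MAC subkeys apart.
assert subkey(h3, b"info-enc") != subkey(h3, b"info-mac")
```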
$\mathsf{Encrypt}(\mathit{chunk}_i, \mathit{pdk}_i, \{\mathit{sk}_{i,j}\}_j, \{\mathit{mack}_{i,j}\}_j) \to \mathit{cc}_i$. Before encryption, the data owner (DO) assigns each chunk a unique AEAD key $\mathit{pdk}_i$. We instantiate AEAD using the standard AES-GCM mode, which provides both confidentiality and integrity protection in a single operation. For each chunk $\mathit{chunk}_i$, the raw payload is sealed as follows:
$$\mathrm{AEAD}_i = \mathrm{Enc}_{\mathrm{AES\text{-}GCM}}(\mathit{chunk}_i, \mathit{pdk}_i;\, \mathsf{nonce}_i, \mathsf{aad}_i).$$
Here, $\mathsf{nonce}_i$ denotes a unique per-chunk nonce, and $\mathsf{aad}_i$ is the associated data bound to the ciphertext (e.g., chunk index or timestamp). For each data value $v_{i,j}$, define the stream-style ciphertext and homomorphic tag:
$$c_{i,j} = (v_{i,j} + \mathit{sk}_{i,j} - \mathit{sk}_{i,j+1}) \bmod p, \qquad \sigma_{i,j} = \frac{\mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} - c_{i,j}}{X} \bmod p.$$
Collect them into vectors:
$$\mathit{svc}_i = \{c_{i,0}, c_{i,1}, \ldots\}, \qquad \sigma_i = \{\sigma_{i,0}, \sigma_{i,1}, \ldots\}.$$
Finally, the complete ciphertext package is
$$\mathit{cc}_i = (\mathit{svc}_i, \sigma_i, \mathrm{AEAD}_i),$$
which is uploaded to the data server (DS) for secure storage.
$\mathrm{Decrypt}(\mathit{svc}_i, \{\mathit{sk}_{i,j}\}_j) \rightarrow \{v_{i,j}\}_j, v_{\mathrm{agg}}$. Given authorized keys, an individual value is recovered by:
$$v_{i,j} = (c_{i,j} - \mathit{sk}_{i,j} + \mathit{sk}_{i,j+1}) \bmod p.$$
For aggregates over an interval $[s, e)$, the telescoping property requires only boundary keys:
$$v_{\mathrm{agg}} = \Big(\sum_{j=s}^{e-1} c_{i,j} - \mathit{sk}_{i,s} + \mathit{sk}_{i,e}\Big) \bmod p.$$
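The telescoping behaviour can be checked on a toy example. The sketch below is illustrative only: HMAC-SHA256 is used as a stand-in for the HKDF-based key derivation, the modulus `P` is an arbitrary prime, and the function names are hypothetical.

```python
import hashlib
import hmac

P = 2**61 - 1  # illustrative prime modulus (assumption)

def sk(chunk_key, j):
    """Stand-in for sk_{i,j}; HMAC-SHA256 replaces the HKDF call of the scheme."""
    digest = hmac.new(chunk_key, b"enc|" + j.to_bytes(4, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def encrypt(values, chunk_key):
    """c_j = v_j + sk_j - sk_{j+1} mod p for every position j."""
    return [(v + sk(chunk_key, j) - sk(chunk_key, j + 1)) % P
            for j, v in enumerate(values)]

def decrypt_point(cipher, chunk_key, j):
    """v_j = c_j - sk_j + sk_{j+1} mod p; needs both neighbouring keys."""
    return (cipher[j] - sk(chunk_key, j) + sk(chunk_key, j + 1)) % P

def decrypt_range(cipher, sk_s, sk_e, s, e):
    """Sum over [s, e) using only the two boundary keys sk_s and sk_e."""
    return (sum(cipher[s:e]) - sk_s + sk_e) % P

key = b"demo-chunk-key"
values = [17, 4, 99, 23, 8]
cipher = encrypt(values, key)
total = decrypt_range(cipher, sk(key, 1), sk(key, 4), 1, 4)  # 4 + 99 + 23
```

All interior masks cancel in the ciphertext sum, so the holder of `sk(key, 1)` and `sk(key, 4)` learns the total over positions 1 to 3 without being able to isolate any single value.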
$\mathrm{Verify}(c, \sigma, \mathit{mack}_{\mathrm{start}}, \mathit{mack}_{\mathrm{end}}) \rightarrow \{0, 1\}$. Here $(c, \sigma)$ represents either a single pair $(c_{i,j}, \sigma_{i,j})$ or an aggregated pair $(c_{\mathrm{agg}}, \sigma_{\mathrm{agg}})$, and the corresponding authentication keys $(\mathit{mack}_{\mathrm{start}}, \mathit{mack}_{\mathrm{end}})$ denote either consecutive or boundary values in the verification chain. The correctness condition is defined as
$$\mathit{mack}_{\mathrm{start}} - \mathit{mack}_{\mathrm{end}} = (\sigma \cdot X + c) \bmod p.$$
If the equation holds, it outputs 1; otherwise it outputs 0.
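A small Python sketch of tag generation and verification follows. It assumes that the authentication keys are already available as field elements (here sampled at random instead of derived with HKDF) and that $X$ is a nonzero element of $\mathbb{Z}_p$ known to the verifier; the modulus and function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)

def make_tag(c, mack_j, mack_next, x):
    """sigma = (mack_j - mack_{j+1} - c) * x^{-1} mod p."""
    return (mack_j - mack_next - c) * pow(x, -1, P) % P

def verify(c, sigma, mack_start, mack_end, x):
    """Accept iff mack_start - mack_end == sigma * x + c (mod p)."""
    return (mack_start - mack_end) % P == (sigma * x + c) % P

x = 1 + secrets.randbelow(P - 1)                  # nonzero, hence invertible
macks = [secrets.randbelow(P) for _ in range(6)]  # stand-ins for mack_{i,0..5}
ciphers = [5, 10, 15, 20, 25]                     # stand-ins for c_{i,0..4}
tags = [make_tag(c, macks[j], macks[j + 1], x) for j, c in enumerate(ciphers)]

# Aggregated pair over the contiguous interval [1, 4).
c_agg = sum(ciphers[1:4]) % P
sigma_agg = sum(tags[1:4]) % P
ok_single = verify(ciphers[2], tags[2], macks[2], macks[3], x)
ok_range = verify(c_agg, sigma_agg, macks[1], macks[4], x)
```

The same `verify` function serves both cases: consecutive keys check one pair, and the two boundary keys check the sum over a contiguous interval.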
$\mathrm{RawShare}([t_s, t_e]) \rightarrow N_{[t_s, t_e]}$. Given a requested time interval $[t_s, t_e]$, the data owner selects the minimal subset of the Key Derivation Tree (KDT) that enables reconstruction of all chunk keys $\{k_i\}$ whose time windows $\mathit{tw}_i$ lie within this range. Formally,
$$N_{[t_s, t_e]} = \mathrm{MinCover}_{\mathcal{T}_{\mathrm{KDT}}}\big(\{k_i \mid \mathit{tw}_i \subseteq [t_s, t_e]\}\big),$$
where $\mathrm{MinCover}_{\mathcal{T}_{\mathrm{KDT}}}(\cdot)$ returns the minimal set of internal KDT nodes whose subtrees cover the selected leaves. The consumer can derive all required per-chunk keys $\{k_i\}$ from $N_{[t_s, t_e]}$ to decrypt the corresponding ciphertexts within $[t_s, t_e]$.
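The minimal cover can be computed with a standard top-down range decomposition. The sketch below assumes a full binary KDT with $2^h$ leaves indexed from 0 and identifies each node by a hypothetical `(depth, index)` pair; it illustrates the idea and is not the authors' implementation.

```python
def min_cover(height, lo, hi):
    """Return the minimal list of (depth, index) nodes whose subtrees
    cover exactly the leaves lo..hi (inclusive) of a full binary tree
    with 2**height leaves."""
    cover = []

    def visit(depth, index):
        span = 1 << (height - depth)            # leaves under this node
        first, last = index * span, (index + 1) * span - 1
        if last < lo or first > hi:             # disjoint from the query
            return
        if lo <= first and last <= hi:          # fully inside: release this node
            cover.append((depth, index))
            return
        visit(depth + 1, 2 * index)             # otherwise descend into children
        visit(depth + 1, 2 * index + 1)

    visit(0, 0)
    return cover

nodes = min_cover(3, 2, 6)   # leaves 2..6 of an 8-leaf tree
```

For this query the cover consists of two depth-2 nodes (leaves 2 to 3 and 4 to 5) plus the single leaf 6, so three keys are released instead of five.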
$\mathrm{AggShare}([t_s, t_e]) \rightarrow (\mathit{sk}_{\mathrm{start}}, \mathit{sk}_{\mathrm{end}})$. Given a requested time interval $[t_s, t_e]$, the data owner releases only the boundary regression keys corresponding to the start and end of the interval:
$$(\mathit{sk}_{\mathrm{start}}, \mathit{sk}_{\mathrm{end}}) = (\mathit{sk}_{i_s}, \mathit{sk}_{i_e}),$$
where $\mathit{sk}_{i_s}$ and $\mathit{sk}_{i_e}$ are derived from the one-way hash chain $h_i = H(h_{i-1})$. These two keys enable the authorized consumer to decrypt aggregated ciphertexts over $[t_s, t_e]$, without exposing individual records.

5.2. Correctness Analysis

We argue that any ciphertext–tag pair produced honestly by the DO will be accepted by the verification function executed by a DC. Fix any chunk index $i$ and position $j$. The DO derives per-point keys deterministically from $k_i$ as follows:
$$\mathit{sk}_{i,j} = \mathrm{HKDF}(k_i, j \,\|\, \mathsf{info}_{\mathrm{enc}}), \qquad \mathit{sk}_{i,j+1} = \mathrm{HKDF}(k_i, (j+1) \,\|\, \mathsf{info}_{\mathrm{enc}}),$$
$$\mathit{mack}_{i,j} = \mathrm{HKDF}(k_i, j \,\|\, \mathsf{info}_{\mathrm{mac}}), \qquad \mathit{mack}_{i,j+1} = \mathrm{HKDF}(k_i, (j+1) \,\|\, \mathsf{info}_{\mathrm{mac}}).$$
Given a plaintext value $v_{i,j}$, the stream-style additive encryption computes
$$c_{i,j} \equiv v_{i,j} + \mathit{sk}_{i,j} - \mathit{sk}_{i,j+1} \pmod{p},$$
and the HomMAC tag is defined as
$$\sigma_{i,j} \equiv (\mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} - c_{i,j}) \cdot X^{-1} \pmod{p},$$
which is equivalent to $\sigma_{i,j} \equiv (\mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} - c_{i,j})\, X^{-1} \pmod{p}$ because $X \in \mathbb{Z}_p^{*}$ is invertible. The DC’s verification procedure checks
$$\mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} \overset{?}{\equiv} \sigma_{i,j} \cdot X + c_{i,j} \pmod{p}.$$
Substituting the honest $\sigma_{i,j}$ into the verifier’s equation yields
$$\sigma_{i,j} \cdot X + c_{i,j} \equiv (\mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} - c_{i,j}) + c_{i,j} \equiv \mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} \pmod{p},$$
which shows that the check holds identically whenever $(c_{i,j}, \sigma_{i,j})$ are generated as specified. Because HKDF and all field operations are deterministic given their inputs, the verification outcome is deterministic as well. Hence, an honestly generated pair is accepted with probability 1 over all internal randomness of key derivation and encryption. The same algebra establishes correctness for aggregated pairs produced in the encrypted domain. When a server returns
$$c_{\mathrm{agg}} = \sum_{\ell=s}^{e-1} c_{i,\ell} \bmod p, \qquad \sigma_{\mathrm{agg}} = \sum_{\ell=s}^{e-1} \sigma_{i,\ell} \bmod p,$$
over a contiguous index interval $[s, e)$, the corresponding boundary authentication keys $(\mathit{mack}_{i,s}, \mathit{mack}_{i,e})$ satisfy the telescoping identity
$$\sum_{\ell=s}^{e-1} (\mathit{mack}_{i,\ell} - \mathit{mack}_{i,\ell+1}) = \mathit{mack}_{i,s} - \mathit{mack}_{i,e} \pmod{p}.$$
Substituting into the verification equation yields
$$\sigma_{\mathrm{agg}} \cdot X + c_{\mathrm{agg}} \equiv \mathit{mack}_{i,s} - \mathit{mack}_{i,e} \pmod{p},$$
which shows that aggregated verification is correct for all range queries consisting of consecutive stream indices. The scheme does not attempt to verify arbitrary non-consecutive index sets, as the telescoping property holds only for contiguous intervals.
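For completeness, the substitution for the aggregated case can be written out term by term. The derivation below uses only the per-point verification identity and the telescoping sum over a contiguous interval; the summation index $\ell$ is a notational assumption.

```latex
\begin{aligned}
\sigma_{\mathrm{agg}} \cdot X + c_{\mathrm{agg}}
  &\equiv \sum_{\ell=s}^{e-1} \left( \sigma_{i,\ell} \cdot X + c_{i,\ell} \right) \\
  &\equiv \sum_{\ell=s}^{e-1} \left( \mathit{mack}_{i,\ell} - \mathit{mack}_{i,\ell+1} \right) \\
  &\equiv \mathit{mack}_{i,s} - \mathit{mack}_{i,e} \pmod{p}.
\end{aligned}
```

The first line uses linearity of the sums, the second applies the single-pair identity to every index in $[s, e)$, and the third cancels all interior keys.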

5.3. Formal Security Analysis

The analysis focuses on two primary aspects: (i) semantic security of the symmetric additive encryption under the pseudorandom function (PRF) assumption, and (ii) integrity, correctness and unforgeability of the HomMAC-based verification mechanism. The former ensures data confidentiality under chosen-plaintext attacks (IND-CPA), while the latter guarantees that no adversary can modify, forge, or inject ciphertext–tag pairs without detection.

Confidentiality

Let $F$ be a secure PRF. The per-point additive encryption in our scheme is defined as:
$$\mathrm{Enc}(v_{i,j}) = \big(v_{i,j} + F(k_{\mathrm{root}}, (i,j)) - F(k_{\mathrm{root}}, (i,j+1))\big) \bmod p.$$
For any security parameter $\lambda$, the encryption algorithm uses distinct PRF inputs for every pair $(i,j)$, and each key pair $(\mathit{sk}_{i,j}, \mathit{sk}_{i,j+1})$ is used only once. We emphasize that the only requirement we impose is that each encryption operation uses a unique pair of PRF inputs $(i,j)$, so that the masks $F(k_{\mathrm{root}}, (i,j))$ and $F(k_{\mathrm{root}}, (i,j+1))$ are never reused. This is the standard non-reuse condition for PRF-based stream encryption and constrains only the generation of one-time pads; it does not restrict how many times a ciphertext or the underlying data point may subsequently be accessed, queried, or aggregated in a streaming analytics setting.
The following theorem establishes semantic security of the encryption layer under this standard non-reuse condition.
Theorem 1.
Given a prime modulus $p$ and a secure PRF $F$, the proposed symmetric additive encryption scheme achieves indistinguishability under chosen-plaintext attack (IND-CPA) in the stateful streaming setting where the encryption algorithm maintains an internal state and guarantees that each encryption query uses fresh PRF inputs $(i,j)$, i.e., the masks $F(k_{\mathrm{root}}, (i,j))$ and $F(k_{\mathrm{root}}, (i,j+1))$ are never reused.
Proof. 
We establish the theorem through a hybrid game-based proof, where the adversary’s advantage is analyzed across two indistinguishable experiments.
  • Game 0 (Real IND-CPA Experiment).
In the first experiment, denoted as the real-world setting, the challenger $\mathcal{C}$ initializes the system by sampling a master secret key $k_{\mathrm{root}} \leftarrow \{0,1\}^{\lambda}$. All encryption queries from the adversary $\mathcal{A}$ are answered using the actual pseudorandom function (PRF) $F(k_{\mathrm{root}}, \cdot)$. During the challenge phase, $\mathcal{A}$ submits two plaintext messages $(m_0, m_1)$ associated with a unique index $(i,j)$. The challenger then chooses a random bit $b \in \{0,1\}$ and responds with the ciphertext:
$$c = m_b + F(k_{\mathrm{root}}, (i,j)) - F(k_{\mathrm{root}}, (i,j+1)) \bmod p.$$
Finally, $\mathcal{A}$ outputs a guess $b'$ for $b$, and the corresponding advantage in this game is defined as
$$\mathrm{Adv}_0 = \left| \Pr[b' = b] - \tfrac{1}{2} \right|.$$
  • Game 1 (Random Function Substitution).
The second experiment follows the same structure as Game 0, except that the challenger replaces the genuine PRF $F(k_{\mathrm{root}}, \cdot)$ with a truly random function $R : \{0,1\}^{*} \rightarrow \mathbb{Z}_p$. The function $R$ is lazily sampled and remains consistent for identical inputs. For the challenge ciphertext, the computation proceeds as
$$c = m_b + R(i,j) - R(i,j+1) \bmod p.$$
Let $\mathrm{Adv}_1$ denote the adversary’s advantage in distinguishing this game from random guessing.
  • Analysis.
By the PRF security assumption, no polynomial-time adversary can distinguish between a real PRF and a truly random function with non-negligible advantage. Therefore,
$$|\mathrm{Adv}_0 - \mathrm{Adv}_1| \le \mathrm{negl}(\lambda).$$
When $R$ is truly random, the values $R(i,j)$ and $R(i,j+1)$ are independent and uniformly distributed over $\mathbb{Z}_p$. Thus, their difference $\delta = R(i,j) - R(i,j+1) \bmod p$ is also uniformly distributed. Hence, the challenge ciphertext
$$c = m_b + \delta \bmod p$$
is uniformly random and independent of $b$. Consequently, in Game 1 the adversary gains no information about $b$, so
$$\Pr[b' = b] = \tfrac{1}{2} \quad \text{and therefore} \quad \mathrm{Adv}_1 = 0.$$
Combining the two games, we obtain
$$\mathrm{Adv}_{\mathcal{A}}^{\mathrm{IND\text{-}CPA}} = \left| \Pr[b' = b] - \tfrac{1}{2} \right| \le |\mathrm{Adv}_0 - \mathrm{Adv}_1| + \mathrm{Adv}_1 \le \mathrm{negl}(\lambda).$$
Hence, the additive encryption layer is semantically secure under the PRF assumption. □
  • Hybrid Argument for Full-Stream IND-CPA Security.
The IND-CPA proof above analyzes the indistinguishability of a single encrypted position. To extend this guarantee to the entire streaming ciphertext sequence, we employ a standard hybrid argument. Suppose the stream contains $N$ encrypted positions. Define Hybrid $H_0$ as the real encryption experiment where all ciphertexts use PRF-derived masks. For $1 \le k \le N$, define Hybrid $H_k$ as the experiment in which the first $k$ masks used in encryption are replaced with independent uniform elements of $\mathbb{Z}_p$, while the remaining $N - k$ masks are generated by the PRF. For any $k$, the difference between $H_{k-1}$ and $H_k$ is exactly one PRF evaluation replaced by uniform randomness. Thus,
$$\left| \Pr[H_{k-1}] - \Pr[H_k] \right| \le \mathrm{Adv}_F^{\mathrm{PRF}}(\lambda).$$
By the hybrid lemma,
$$\left| \Pr[H_0] - \Pr[H_N] \right| \le N \cdot \mathrm{Adv}_F^{\mathrm{PRF}}(\lambda) = \mathrm{negl}(\lambda),$$
since $N$ is a polynomial in the security parameter. Hybrid $H_N$ corresponds to encrypting the entire stream using independent uniform masks and therefore leaks no information about the plaintexts. Consequently, the scheme achieves full-stream IND-CPA security.
  • Security of KDT-Derived Keys.
A potential concern is whether keys derived through the KDT preserve the pseudorandomness required by the hybrid argument. Our KDT is built using the GGM construction, where each leaf key is of the form $k_i = F(k_{\mathrm{root}}, x(i))$, for a public binary label $x(i)$. By the pseudorandomness of $F$, each derived key $k_i$ is indistinguishable from a uniformly random $\lambda$-bit string, even given all other node keys in the tree. HKDF is then applied to produce the per-point encryption and authentication keys, and standard domain-separation arguments guarantee that these subkeys behave as independent PRF seeds. Therefore, from the adversary’s perspective, every mask used in encryption is derived from an independent pseudorandom key, and the multi-point hybrid argument applies unchanged to the streaming setting.
Remark 1.
The security of the entire key derivation hierarchy follows directly from the GGM construction of the Key Derivation Tree (KDT). Since the KDT uses a pseudorandom generator (PRG) to expand seeds into child keys, each node key $k_i = F(k_{\mathrm{root}}, x(i))$ behaves as a PRF output on input $x(i)$, ensuring pseudorandomness across all derived subkeys. Furthermore, HKDF ensures domain separation between encryption and authentication keys by binding context strings $\mathsf{info}_{\mathrm{enc}}$ and $\mathsf{info}_{\mathrm{mac}}$, so compromising one type of key does not endanger the other.
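To make the GGM-style derivation concrete, the sketch below walks a binary label from the root to a leaf, applying a length-doubling PRG at each step. It is a minimal illustration: the PRG is instantiated with two HMAC-SHA256 calls under distinct labels, which is an assumption made for this example, and the function names are hypothetical.

```python
import hashlib
import hmac

def prg(seed):
    """Length-doubling PRG stand-in: returns (left child key, right child key)."""
    left = hmac.new(seed, b"\x00", hashlib.sha256).digest()
    right = hmac.new(seed, b"\x01", hashlib.sha256).digest()
    return left, right

def derive(node_key, path_bits):
    """Follow a string of '0'/'1' bits downward from node_key."""
    for bit in path_bits:
        node_key = prg(node_key)[int(bit)]
    return node_key

def leaf_key(k_root, i, height):
    """k_i = F(k_root, x(i)), where x(i) is the height-bit binary label of leaf i."""
    return derive(k_root, format(i, "0{}b".format(height)))

k_root = bytes(32)                    # demo root key; a real root must be random
inner = derive(k_root, "10")          # internal node covering leaves 4 and 5
k_5 = leaf_key(k_root, 5, 3)          # label "101"
```

Releasing the internal node `inner` lets a consumer recompute the keys of leaves 4 and 5 with `derive(inner, "0")` and `derive(inner, "1")`, while keys outside that subtree remain underivable without inverting the PRG.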
5.3.1. Forward Security of the Regression Key Chain
Theorem 2
(Forward Security). Let the regression chain be defined as $h_0 = H(\mathit{seed})$, $h_i = H(h_{i-1})$, $\mathit{sk}_i = \mathrm{HKDF}(h_i, \mathsf{info}_{\mathrm{enc}})$, where $H$ is a one-way hash function and HKDF behaves as a secure PRF. For any time $t$, learning $\mathit{sk}_t$ does not enable any PPT adversary to compute any prior key $\mathit{sk}_i$ for $i < t$, except with negligible probability in $\lambda$.
Proof. 
Suppose, for the sake of contradiction, that there exists a PPT adversary $\mathcal{A}$ that, given $\mathit{sk}_t$, can compute some $\mathit{sk}_i$ for $i < t$ with non-negligible probability. By definition,
$$\mathit{sk}_t = \mathrm{HKDF}(h_t, \mathsf{info}_{\mathrm{enc}}), \qquad h_t = H(h_{t-1}) = H(H(\cdots H(h_0) \cdots)).$$
By the PRF security of HKDF, recovering any useful information about $h_t$ from $\mathit{sk}_t$ is as hard as breaking the PRF. Thus, if $\mathcal{A}$ can compute $\mathit{sk}_i$ for some $i < t$, we may assume that $\mathcal{A}$ effectively recovers $h_i$ for some $i < t$ given $h_t$. However, the sequence $(h_0, h_1, \ldots, h_t)$ is defined by repeated applications of the one-way function $H$, i.e., $h_j = H(h_{j-1})$ for all $1 \le j \le t$. Computing any $h_i$ with $i < t$ from $h_t$ therefore requires inverting at least one application of $H$, which contradicts the one-wayness of $H$. Formally, we can build a reduction that uses $\mathcal{A}$ as a subroutine to invert $H$ with non-negligible probability, violating the assumed security of $H$. Consequently, no such adversary $\mathcal{A}$ exists, and the regression key chain $\{\mathit{sk}_i\}_i$ is forward secure. □
5.3.2. Integrity
The integrity of the encrypted stream is guaranteed by the HomMAC verification mechanism, which enables algebraic consistency checks between ciphertexts and their associated tags. For each data value $v_{i,j}$, the DO computes the homomorphic tag in Equation (6) and the DC verifies each pair $(c_{i,j}, \sigma_{i,j})$ by checking Equation (7). If the ciphertext and tag are generated honestly, this equation holds identically; otherwise, a modification of $c_{i,j}$ or $\sigma_{i,j}$ made without knowledge of the authentication keys breaks the equality except with probability $1/p$. By the unforgeability of the underlying MAC construction, an adversary who does not know $(\mathit{mack}_{i,j}, \mathit{mack}_{i,j+1})$ cannot produce a valid $(c, \sigma)$ satisfying the verification equation except with negligible probability. Therefore, the HomMAC component provides algebraic integrity verification with perfect completeness and computational soundness:
$$\Pr[\mathrm{Verify}(c, \sigma, \mathit{mack}_{\mathrm{start}}, \mathit{mack}_{\mathrm{end}}) = 1] = 1$$
for all honestly generated ciphertexts.
Theorem 3
(EUF–CMA Security of HomMAC). Assume that HKDF, when keyed by a secret master value, behaves as a secure PRF over $\mathbb{Z}_p$, where $p \approx 2^{\lambda}$ is a prime. Then the HomMAC scheme defined in Section 5 is existentially unforgeable under chosen-message attacks according to Definition 5, even when the adversary is allowed to freely apply the public homomorphic operation $(c, \sigma) \oplus (c', \sigma') = (c + c', \sigma + \sigma')$.
Proof. 
We follow a standard two-game approach.
  • Game 0 (Real EUF–CMA experiment).
The challenger derives all $\mathit{mack}_{i,j} = \mathrm{HKDF}(h_i, j \,\|\, \mathsf{info}_{\mathrm{mac}})$ and answers each query with the honest ciphertext–tag pair, while the adversary $\mathcal{A}$ may also form homomorphic combinations of these outputs. Let $\mathrm{Adv}_0$ be the probability that $\mathcal{A}$ produces a valid forgery not contained in the homomorphic closure $\mathcal{S}$ (Definition 5).
  • Game 1 (HKDF outputs replaced by a random function).
In this game we replace HKDF by a truly random function $R$; thus all $\mathit{mack}_{i,j}$ become independent uniform elements of $\mathbb{Z}_p$. The challenger still answers all queries according to the HomMAC specification. By PRF security of HKDF, the adversary cannot distinguish Game 0 and Game 1 except with negligible probability:
$$|\mathrm{Adv}_0 - \mathrm{Adv}_1| \le \mathrm{negl}(\lambda),$$
where $\mathrm{Adv}_1$ is $\mathcal{A}$’s success probability in Game 1.
  • Analysis of Game 1.
Since the values $\{\mathit{mack}_{i,j}\}$ are independent and uniformly random, each honestly produced pair $(c_{i,j}, \sigma_{i,j})$ reveals exactly the linear relation
$$\sigma_{i,j} X + c_{i,j} \equiv \mathit{mack}_{i,j} - \mathit{mack}_{i,j+1} \pmod{p},$$
and any homomorphically aggregated pair over a contiguous interval $[s, e)$ reveals
$$\sigma_{\mathrm{agg}} X + c_{\mathrm{agg}} \equiv \mathit{mack}_{s} - \mathit{mack}_{e} \pmod{p},$$
as given by the telescoping identity. Hence, every oracle answer corresponds to a known linear combination of the independent random variables $\{\mathit{mack}_{i,j}\}$. Because the homomorphic operation is linear, the set $\mathcal{S}$ of all derivable ciphertext–tag pairs corresponds exactly to the linear span of the coefficient vectors of previously observed pairs. A forgery $(c^{*}, \sigma^{*})$ that is not in $\mathcal{S}$ therefore corresponds to a new linear combination of the $\mathit{mack}$-values whose coefficient vector is not in this span. Let this new combination be
$$T = \sum_{\ell} \alpha_{\ell}\, \mathit{mack}_{\ell} \pmod{p}.$$
Because the $\mathit{mack}_{\ell}$ are independent uniform elements of $\mathbb{Z}_p$ and $(\alpha_{\ell})$ is linearly independent of all previously used coefficient vectors, the value $T$ is itself uniform over $\mathbb{Z}_p$ and independent of all information available to $\mathcal{A}$. To succeed, $\mathcal{A}$ must output $(c^{*}, \sigma^{*})$ satisfying
$$\sigma^{*} X + c^{*} = T \pmod{p},$$
which requires guessing a fresh uniform field element. Thus
$$\mathrm{Adv}_1 \le 1/p \approx 2^{-\lambda}.$$
We conclude
$$\mathrm{Adv}_{\mathcal{A}}(\lambda) = \mathrm{Adv}_0 \le |\mathrm{Adv}_0 - \mathrm{Adv}_1| + \mathrm{Adv}_1 \le \mathrm{negl}(\lambda) + 2^{-\lambda}.$$
Hence, the HomMAC scheme is EUF–CMA secure even when the adversary may freely use the homomorphic aggregation property. □

6. Electric Vehicle Key Management and Query System

Building upon the secure streaming data encryption scheme introduced in Section 5, this section presents the complete Electric Vehicle (EV) key management and query system. It incorporates the Ciphertext Index Tree (CIT) for efficient statistical aggregation and defines privacy-preserving data sharing policies enabled by KDT and RKR.

6.1. Encrypted Data Organization and Indexing

The Ciphertext Index Tree (CIT) serves as a hierarchical data structure that organizes encrypted statistical vectors $(\mathit{svc}_i, \sigma_i)$ in a time-ordered manner, enabling efficient aggregation directly within the encrypted domain. Each node in the CIT encapsulates an aggregated ciphertext vector $(\mathit{svc}, \sigma)$ together with its corresponding index range $[\mathit{from}, \mathit{to})$, and maintains pointers to multiple child nodes. This design allows the system to hierarchically summarize encrypted data without decryption, thus supporting scalable analytics on large and continuously updated data streams.
When a new encrypted chunk arrives, the data server dynamically inserts its associated statistical vector into the CIT. If the current leaf node still has capacity, the vector is appended directly. Otherwise, when the leaf is full, a ciphertext aggregation operation is triggered: a set of k ciphertexts are homomorphically summed, their aggregate result is stored at the parent node, and a new leaf node is subsequently opened to accommodate incoming data. Through this process, the CIT remains balanced and compact, while maintaining accurate aggregate representations of encrypted data at multiple levels.
During query processing, when a consumer requests data over a specific time interval $[t_a, t_b)$, the data server traverses the CIT to locate the minimal set of nodes that fully cover this interval. It then performs ciphertext-domain aggregation by summing the encrypted statistical vectors from the selected nodes:
$$\mathit{svc}_{\mathrm{agg}} = \sum \mathit{svc}_i, \qquad \sigma_{\mathrm{agg}} = \sum \sigma_i.$$
The aggregated ciphertext pair $(\mathit{svc}_{\mathrm{agg}}, \sigma_{\mathrm{agg}})$ is returned to the consumer, who can decrypt and verify the result using authorized keys. After decryption, the consumer reconstructs the plaintext statistical vector and computes essential statistical indicators such as the average, variance, standard deviation, and coefficient of variation:
$$\mathit{avg} = \frac{x_1}{x_0}, \qquad \mathit{var} = \frac{x_2}{x_0} - \left(\frac{x_1}{x_0}\right)^{2}, \qquad \mathit{sd} = \sqrt{\mathit{var}}, \qquad \mathit{CV} = \frac{\mathit{sd}}{\mathit{avg}}.$$
These statistical metrics enable efficient and privacy-preserving electric vehicle (EV) behavioral analytics, offering deep insights into driving patterns and system performance without exposing raw user data.
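As a worked example of these indicators, the sketch below recovers them from a decrypted statistical vector. It assumes that $x_0$, $x_1$, and $x_2$ denote the count, the sum, and the sum of squares of the underlying values, which matches the "count, sum, and second-order moments" description of the statistical vector but is not spelled out at this point in the text; the function name is hypothetical.

```python
import math

def indicators(x0, x1, x2):
    """Compute (avg, var, sd, cv) from count x0, sum x1 and sum of squares x2.

    Assumes x0 > 0 and a nonzero mean; the variance is the population variance.
    """
    avg = x1 / x0
    var = x2 / x0 - avg ** 2
    sd = math.sqrt(max(var, 0.0))   # clamp tiny negative rounding errors
    cv = sd / avg
    return avg, var, sd, cv

readings = [2, 4, 4, 4, 5, 5, 7, 9]
x0 = len(readings)
x1 = sum(readings)
x2 = sum(r * r for r in readings)
avg, var, sd, cv = indicators(x0, x1, x2)
```

Because all three components are plain sums, they can be aggregated additively in the encrypted domain and the nonlinear indicators are computed only after decryption.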
Theorem 4
(CIT Query Correctness). Let $\mathcal{T}$ be a Ciphertext Index Tree (CIT) that maintains the coverage and value invariants after each update. For any query interval $[i_s, i_e)$, the CIT query algorithm returns an aggregated ciphertext vector equal to $\sum_{i=i_s}^{i_e-1} \mathit{svc}_i$, independent of the order in which leaves are inserted or internal aggregations are performed.
Proof. 
The CIT maintains two invariants. Coverage invariant: each node $v$ stores a contiguous index range $\mathrm{rng}(v) = [a_v, b_v)$ covering exactly the leaf indices under $v$. Value invariant: each node $v$ stores an aggregated ciphertext vector $S_v = \sum_{i \in \mathrm{rng}(v)} \mathit{svc}_i$.
  • Step 1: Correctness of interval queries.
Given a query $[i_s, i_e)$, the CIT query algorithm selects a set of nodes $V$ such that: (i) their ranges are pairwise disjoint, and (ii) their union equals $[i_s, i_e)$. Thus, the leaves covered by $V$ form a partition of $\{i \mid i_s \le i < i_e\}$. Using the value invariant, we have
$$S_{\mathrm{agg}} = \sum_{v \in V} S_v = \sum_{v \in V} \sum_{i \text{ under } v} \mathit{svc}_i = \sum_{i=i_s}^{i_e-1} \mathit{svc}_i.$$
Since ciphertext addition is associative and commutative, the result depends solely on the set of leaves in the interval, not on the order of intermediate summations.
  • Step 2: Invariants hold for any insertion and aggregation order.
We argue that the CIT invariants are preserved under any sequence of leaf insertions and node aggregations. Whenever a new leaf $n$ is inserted, it is assigned $\mathrm{rng}(n) = [n, n+1)$ and $S_n = \mathit{svc}_n$, which trivially satisfies both invariants. The parent of $n$ then updates its range to the union of its children’s ranges and its stored value to the sum of its children’s values; the same union-and-sum update is propagated up the tree. Because each child already satisfies the invariants, every ancestor updated in this way continues to store exactly the union of its descendant leaf ranges and the corresponding aggregated ciphertext value. Similarly, when a batch of sibling nodes $w_1, \ldots, w_m$ is aggregated into a new parent node $v$, we define
$$\mathrm{rng}(v) = \bigcup_{j=1}^{m} \mathrm{rng}(w_j), \qquad S_v = \sum_{j=1}^{m} S_{w_j}.$$
Since each $S_{w_j}$ equals the sum of ciphertexts in $\mathrm{rng}(w_j)$, the value $S_v$ is exactly the sum of ciphertexts in $\mathrm{rng}(v)$. The new range $\mathrm{rng}(v)$ is contiguous because the aggregated children cover adjacent leaf intervals. Replacing $w_j$ by $v$ and updating ancestors via the same union-and-sum rule preserves the invariants throughout the tree. Because this update rule depends only on the sets of child ranges and ciphertext sums, and ciphertext addition is associative and commutative, the final tree structure and all node values are invariant under reordering of insertions and aggregations. □
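The two invariants can be exercised with a compact sketch of a $k$-ary index tree. The code below is an illustration under simplifying assumptions: scalars modulo `p` stand in for ciphertext vectors, a parent is materialized only once its $k$ children are complete, and the class and method names are hypothetical, so it does not reproduce the authors' data structure.

```python
class CipherIndexTree:
    """Toy k-ary CIT: levels[d][m] holds the sum of leaves [m*k**d, (m+1)*k**d)."""

    def __init__(self, k, p):
        self.k, self.p = k, p
        self.levels = [[]]

    def insert(self, c):
        """Append a leaf; whenever k siblings are complete, store their sum one level up."""
        self.levels[0].append(c % self.p)
        d = 0
        while len(self.levels[d]) % self.k == 0:
            group = self.levels[d][-self.k:]
            if d + 1 == len(self.levels):
                self.levels.append([])
            self.levels[d + 1].append(sum(group) % self.p)
            d += 1

    def query(self, s, e):
        """Return (sum over leaves [s, e), number of nodes touched)."""
        total, pos, used = 0, s, 0
        while pos < e:
            d = 0
            # Climb while a larger aligned node still fits inside [pos, e).
            while (d + 1 < len(self.levels)
                   and pos % self.k ** (d + 1) == 0
                   and pos + self.k ** (d + 1) <= e):
                d += 1
            total = (total + self.levels[d][pos // self.k ** d]) % self.p
            pos += self.k ** d
            used += 1
        return total, used

P = 2**61 - 1
tree = CipherIndexTree(k=4, p=P)
leaves = [i * i + 1 for i in range(100)]
for c in leaves:
    tree.insert(c)
agg, touched = tree.query(3, 70)
```

For the interval $[3, 70)$ the query touches 10 stored nodes instead of 67 leaves, while returning the same sum as a direct scan.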

6.2. Privacy-Preserving Access and Sharing Policies

To achieve flexible yet privacy-preserving access control, the system supports two complementary data-sharing modes: fine-grained (raw) sharing and aggregated sharing. These modes correspond to different privacy levels and analytical requirements, allowing the Data Owner (DO) to control the granularity of information disclosed to consumers.
In the fine-grained sharing mode, the DO grants access to detailed raw data over a selected time interval $[t_s, t_e)$. This is accomplished by releasing the minimal set of Key Derivation Tree (KDT) nodes that cover the requested range. Using these nodes, the authorized consumer can reconstruct the per-chunk keys $\{k_i\}$ and decrypt individual ciphertext packages, thereby obtaining the complete raw chunks $\mathit{chunk}_i = \mathrm{Dec}(\mathit{cc}_i, k_i)$. This mode is well-suited for diagnostics, auditing, or fine-resolution analytics where detailed temporal information is required. In contrast, the aggregated sharing mode offers stronger privacy guarantees by disclosing only the boundary regression keys $(\mathit{sk}_{\mathrm{start}}, \mathit{sk}_{\mathrm{end}})$ that correspond to the start and end of the requested time range. With these two keys, the consumer can decrypt only the aggregated ciphertext result, using the same sign convention as the Decrypt algorithm:
$$v_{\mathrm{agg}} = \Big(\sum_{j} c_{i,j} - \mathit{sk}_{\mathrm{start}} + \mathit{sk}_{\mathrm{end}}\Big) \bmod p.$$
Since all intermediate keys remain hidden, this mechanism prevents the reconstruction of individual data records, thereby preserving confidentiality while still supporting statistical and aggregate queries. The two-tier policy design provides adaptable access control for various application scenarios, from detailed monitoring to aggregated analytics, without compromising cryptographic soundness or forward security.

7. Experimental Results and Analysis

The experiments were carried out on two distinct hardware platforms to assess both high-performance and edge-oriented deployment environments. For clarity, the plaintext baseline in our experiments refers solely to the computation of statistical vectors (e.g., count, sum, and second-order moments) on raw unencrypted data, without performing any encryption, authentication, or key-handling procedures. It therefore represents the minimal processing cost before any cryptographic operations are applied. The first platform represents a general-purpose computing environment, implemented on a desktop workstation equipped with an Intel Core i7-10700 processor. This setup is intended to emulate the performance level achievable in typical data-center or server-class conditions. The second platform emulates a resource-constrained edge computing device, specifically a Raspberry Pi 4 Model B. This configuration reflects a realistic deployment scenario for low-power IoT or in-vehicle edge environments, where energy and hardware resources are limited. All algorithms, including key generation, encryption, verification, and query processing, were implemented in Java 17 using consistent cryptographic libraries and identical parameter settings across both platforms to ensure fair performance comparison.
We evaluate the efficiency of our schemes by benchmarking their computational overhead against two classical homomorphic encryption baselines: EC-ElGamal [31] and Paillier [32]. The SAHE configuration uses 128-bit keys with a 32-layer hierarchical key derivation tree and pseudorandom key expansion. For consistency, EC-ElGamal is implemented with 256-bit keys, and Paillier with a 2048-bit modulus. The experimental results obtained on a desktop platform (Device1) and a RaspberryPi (Device2) are summarized in Table 1. To ensure fairness, all schemes are benchmarked at comparable classical security levels: SAHE uses 128-bit symmetric keys; EC-ElGamal uses a 256-bit elliptic curve (≈128-bit security); and Paillier uses a 2048-bit modulus (≈112-bit security, the standard parameter most closely aligned with 128-bit symmetric strength).
As shown in Table 1, SAHE achieves the lowest encryption and decryption latency on both platforms, outperforming the asymmetric schemes by over an order of magnitude. Device 2 runs about ten times slower than Device 1 due to hardware limitations, yet relative performance trends remain consistent. The HomMAC verification adds minimal overhead, confirming that our symmetric, KDT-based design maintains high efficiency and suitability for both server and edge deployments.
The experimental results in Figure 3 compare the chunk encryption time of SAHE-HomMAC, EC-ElGamal, and Paillier under two encryption configurations. Figure 3a,c encrypt statistical vectors only, and Figure 3b,d perform homomorphic encryption with additional AES-GCM encryption of raw data. Device 1 represents a high-performance desktop platform, while Device 2 corresponds to a resource-constrained edge device. On Device 1, SAHE-HomMAC achieves the lowest encryption latency, requiring only 5.91 ms and 13.49 ms for the two scenarios, respectively. This performance is approximately 20× faster than EC-ElGamal and over 3000× faster than Paillier. The inclusion of AES-GCM introduces only a minor increase in runtime, demonstrating that integrating data-point encryption and HomMAC verification incurs negligible computational overhead. On Device 2, although the absolute latency rises due to limited processing power, the relative performance trend remains consistent. SAHE-HomMAC still outperforms EC-ElGamal and Paillier by a large margin, achieving 52.79 ms (vector-only) and 122.19 ms encryption times, compared to over 1 s and 190 s for the asymmetric schemes. The results confirm that the proposed symmetric KDT-based SAHE-HomMAC scheme delivers high efficiency and scalability. Even when combined with AES-GCM encryption for raw data, it maintains low latency, minimal overhead, and excellent suitability for both cloud and edge computing environments.
As shown in Figure 4a,b, the experiment measures the overall processing time for encrypted data streams, including statistical vector computation, encryption, and ciphertext index tree construction. The reported SSEDQ processing time includes both raw chunk encryption (AES-GCM for plaintext payloads together with SAHE for statistical vectors) and all homomorphic aggregation operations performed during CIT maintenance. The aggregation cost grows linearly with the number of chunks because each new chunk requires a single ciphertext-vector addition at the corresponding CIT level. Our proposed SSEDQ scheme is compared with the plaintext baseline and two asymmetric homomorphic encryption schemes, EC-ElGamal and Paillier. In this experiment, each data block represents one minute of streaming data and contains 60 data points, with 100,000 data blocks generated to simulate approximately two months of continuous data. The ciphertext index tree is constructed with a branching factor of 1024. The evaluation is conducted on both Device 1 (desktop) and Device 2 (Raspberry Pi), representing high-performance and lightweight environments, respectively. In Figure 4a, Device 1 achieves an average processing time of 2.68 s under plaintext conditions, while SSEDQ requires 6.04 s to complete statistical computation, encryption, and index tree construction. This corresponds to an additional latency of approximately 0.04 s per minute of streaming data, demonstrating that the scheme can efficiently handle high-throughput data streams. EC-ElGamal and Paillier, however, exhibit significantly higher delays—approximately 20× and 3000× slower than SSEDQ, respectively. Similarly, Figure 4b shows results on Device 2, where the average plaintext processing time is 14.02 s, and SSEDQ requires 54.82 s. Despite limited hardware capability, the performance remains within one order of magnitude, confirming that the scheme introduces only moderate computational overhead. 
The symmetric design of SSEDQ allows efficient encryption and aggregation even under constrained environments.
Table 2 evaluates the scalability of the key-sharing procedures with respect to the KDT depth $h$ and the corresponding number of time windows $T = 2^{h}$. As the tree becomes deeper, the average size of the minimal cover $\overline{|N_{[t_s, t_e]}|}$ returned by RawShare increases only mildly, reflecting the expected $O(\log T)$ behavior. The RawShare latency grows slowly with $h$ and remains within a few tens of microseconds, demonstrating that fine-grained key authorization is highly scalable even for large temporal domains. In contrast, AggShare exhibits nearly constant latency because it always releases only two boundary keys, independent of the KDT depth. These results confirm that both raw and aggregated key-sharing operations scale efficiently as the temporal range expands.
As illustrated in Figure 5, the query latency in the plaintext setting grows almost linearly with the size of the query range. This is because no preprocessing is performed on the plaintext data, and the server must aggregate all values within the queried interval. Although the latency is low for small ranges, it increases significantly for large ranges due to the linear accumulation of operations. When the plaintext index tree is used, the latency for small ranges is slightly higher than in the pure plaintext setting, since the system still needs to traverse tree nodes before performing aggregation. However, for large ranges, the plaintext index tree avoids excessive per-element aggregation and therefore reduces the overall query cost. The ciphertext index tree exhibits the same behavior: the query time depends on which internal nodes are selected to answer the range query. Although some fluctuations appear as the range size changes, the overall latency remains consistently low. This demonstrates that the index-tree structure significantly improves query efficiency, particularly when the data volume is large. In our measurements, the average query time using the plaintext index tree is 39.29 μs, while the SSEDQ scheme achieves an average latency of 125.61 μs. The additional overhead (about a 3.2× increase) is attributable to the decryption and HomMAC verification procedure on the client side, yet the overall latency remains within the same order of magnitude as plaintext. In contrast, the EC–ElGamal-based scheme incurs an average query time exceeding 1 ms, and the Paillier-based scheme reaches 387.89 ms, showing a clear performance disadvantage compared to SSEDQ.
To evaluate both the construction efficiency of the Ciphertext Index Tree (CIT) and the query performance over encrypted statistical vectors, we randomly generated one million statistical vectors and inserted them into the CIT while measuring the total construction time. Furthermore, to assess query efficiency, we randomly selected one statistical vector and a range of 100,000 vectors, and recorded the corresponding query latency. The CIT is a k-ary tree; to examine the influence of different branching factors on performance, we constructed and tested 15 CIT instances with varying values of k. Since CIT construction and query processing are carried out entirely on the data server and do not involve resource-constrained devices, this experiment was conducted only on Device 1. The CIT construction time under different branching factors is reported in Figure 6.

8. Conclusions

The proposed scheme combines hierarchical key derivation and time-continuous key regression to achieve efficient and forward-secure key management. Symmetric additive homomorphic encryption combined with homomorphic message authentication provides confidentiality, verifiable integrity, and ciphertext-domain aggregation without revealing individual records. Furthermore, the Ciphertext Index Tree (CIT) supports scalable and privacy-preserving statistical queries, extending the applicability of the framework to large-scale EV data analytics. Experimental evaluations show that the proposed approach encrypts and decrypts in tens to hundreds of microseconds on the two test devices, whereas the EC–ElGamal and Paillier baselines require from roughly half a millisecond to more than one second (Table 1); it also reduces resource overhead and maintains consistent performance across both cloud and edge platforms. Future work will focus on integrating dynamic multi-user access control, decentralized key management, and cross-domain interoperability to further enhance the practicality of secure EV data sharing and collaboration.

Author Contributions

Conceptualization, Z.L. and J.X.; methodology, Z.L.; software, Z.L. and C.S.; validation, Z.L., J.X. and F.W.; formal analysis, Z.L.; investigation, Z.L. and C.S.; resources, F.W., X.W. and X.F.; data curation, Z.L. and C.S.; writing—original draft preparation, Z.L.; writing—review and editing, J.X., F.W., X.W. and X.F.; visualization, Z.L. and C.S.; supervision, J.X. and X.F.; project administration, J.X.; funding acquisition, J.X. and X.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Science and Technology Project of State Grid Corporation of China, “Research and Application of Scalable and Secure Data Circulation and System Devices for Vehicle-to-Grid Interaction” (Grant No. 5400-202471366A-3-1-KJ). The APC was funded by the Science and Technology Project of State Grid Corporation of China (Grant No. 5400-202471366A-3-1-KJ).

Data Availability Statement

The data presented in this study are openly available in CNCB-NGDC at https://download.cncb.ac.cn/gwh (accessed on 15 December 2025).

Conflicts of Interest

Authors Mr. Fan Wu and Mr. Xiangliang Fang were employed by the company State Grid Smart Internet of Vehicles Technology Co., Ltd. Author Mr. Xiaomin Wu was employed by State Grid Hubei Electric Power Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funders had no role in the study design, data collection, analysis, interpretation, manuscript writing, or the decision to publish the results.

References

  1. Fang, T.; von Jouanne, A.; Agamloh, E.; Yokochi, A. Opportunities and Challenges of Fuel Cell Electric Vehicle-to-Grid (V2G) Integration. Energies 2024, 17, 5646.
  2. Mahida, A. Secure Data Outsourcing Techniques for Cloud Storage. Int. J. Sci. Res. (IJSR) 2024, 13, 181–184.
  3. Faller, S.; Ottenhues, A.; Ottenhues, J. Composable oblivious pseudo-random functions via garbled circuits. In Proceedings of the International Conference on Cryptology and Information Security in Latin America, Quito, Ecuador, 3–6 October 2023; Springer: Cham, Switzerland, 2023; pp. 249–270.
  4. Blass, E.O.; Kerschbaum, F.; Mayberry, T. Iterative oblivious pseudo-random functions and applications. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, New York, NY, USA, 30 May–3 June 2022; pp. 28–41.
  5. Li, Z.; Xu, J.; Zhang, N.; Lu, T.; Li, P.; Wang, N.; Wang, Q. A Fully Homomorphic Encryption-Based KNN Classification Scheme for Electric Vehicles Data. In Proceedings of the Data Security and Privacy Protection, Xi’an, China, 16–18 October 2025; Chen, X., Hu, H., Wang, D., Eds.; Springer: Singapore, 2025; pp. 318–327.
  6. Liu, X.; Yi, X. Privacy-preserving collaborative medical time series analysis based on dynamic time warping. In Proceedings of the Computer Security–ESORICS 2019: 24th European Symposium on Research in Computer Security, Luxembourg, 23–27 September 2019; Part II 24. Springer: Cham, Switzerland, 2019; pp. 439–460.
  7. Liu, Q.; Huang, Q.; Chen, X.; Wang, S.; Wang, W.; Han, S.; Lee, P.P.C. PP-Stream: Toward High-Performance Privacy-Preserving Neural Network Inference via Distributed Stream Processing. In Proceedings of the 2024 IEEE 40th International Conference on Data Engineering (ICDE), Utrecht, The Netherlands, 13–16 May 2024; pp. 1492–1505.
  8. Fang, C.; Guo, Y.; Hu, Y.; Ma, B.; Feng, L.; Yin, A. Privacy-preserving and communication-efficient federated learning in internet of things. Comput. Secur. 2021, 103, 102199.
  9. Wang, Q.; Cui, S.; Zhou, L.; Wu, O.; Zhu, Y.; Russello, G. Enclavetree: Privacy-preserving data stream training and inference using tee. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, New York, NY, USA, 30 May–3 June 2022; pp. 741–755.
  10. Zheng, Y.; Lu, R.; Guan, Y.; Shao, J.; Zhu, H. Efficient and privacy-preserving similarity range query over encrypted time series data. IEEE Trans. Dependable Secur. Comput. 2021, 19, 2501–2516.
  11. Wang, S.; Zheng, Y.; Jia, X. GraphGuard: Private Time-Constrained Pattern Detection over Streaming Graphs in the Cloud. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 3495–3512.
  12. Chen, D.; Chen, L. Sliding-window probabilistic threshold aggregate Queries on uncertain data streams. Inf. Sci. 2020, 520, 353–372.
  13. Miao, R.; Zhang, Y.; Qu, G.; Yang, K.; Yang, T.; Cui, B. Hyper-uss: Answering subset query over multi-attribute data stream. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 1698–1709.
  14. Bai, M.; Han, Y.; Yin, P.; Wang, X.; Li, G.; Ning, B.; Ma, Q. S_IDS: An efficient skyline query algorithm over incomplete data streams. Data Knowl. Eng. 2024, 149, 102258.
  15. Guan, Y.; Lu, R.; Zheng, Y.; Shao, J.; Wei, G. Achieving efficient and privacy-preserving max aggregation query for time-series data. In Proceedings of the ICC 2020-2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
  16. Ouyang, H.; Zheng, Y.; Wang, S.; Hua, Z. OblivTime: Oblivious and Efficient Interval Skyline Query Processing Over Encrypted Time-Series Data. IEEE Trans. Serv. Comput. 2025, 18, 1602–1617.
  17. Safaee, S.; Mirabi, M.; Safaei, A.A. StreamFilter: A framework for distributed processing of range queries over streaming data with fine-grained access control. Clust. Comput. 2024, 27, 9221–9241.
  18. Di Vimercati, S.D.C.; Foresti, S.; Jajodia, S.; Paraboschi, S.; Samarati, P. Over-encryption: Management of access control evolution on outsourced data. In Proceedings of the 33rd International Conference on Very Large Data Bases, Vienna, Austria, 23–27 September 2007; pp. 123–134.
  19. Qi, S.; Zheng, Y. Crypt-DAC: Cryptographically enforced dynamic access control in the cloud. IEEE Trans. Dependable Secur. Comput. 2019, 18, 765–779.
  20. Tong, Q.; Miao, Y.; Chen, L.; Weng, J.; Liu, X.; Choo, K.K.R.; Deng, R.H. VFIRM: Verifiable fine-grained encrypted image retrieval in multi-owner multi-user settings. IEEE Trans. Serv. Comput. 2021, 15, 3606–3619.
  21. Thushara, G.; Bhanu, S.M.S. A survey on secured data sharing using ciphertext policy attribute based encryption in cloud. In Proceedings of the 2021 8th International Conference on Smart Computing and Communications (ICSCC), Kochi, India, 1–3 July 2021; pp. 170–177.
  22. Xie, S.; Zhang, L.; Wu, Q.; Rezaeibagha, F. Flexibly expressive and revocable multi-authority KP-ABE scheme from RLWE for Internet of Medical Things. J. Syst. Archit. 2024, 152, 103179.
  23. Rasori, M.; Perazzo, P.; Dini, G.; Yu, S. Indirect revocable KP-ABE with revocation undoing resistance. IEEE Trans. Serv. Comput. 2021, 15, 2854–2868.
  24. Kumar, D.; Kumar, M. Outsourcing decryption of KP-ABE using elliptic curve cryptography. Int. J. Inf. Comput. Secur. 2023, 22, 210–229.
  25. Deng, H.; Qin, Z.; Wu, Q.; Guan, Z.; Zhou, Y. Flexible attribute-based proxy re-encryption for efficient data sharing. Inf. Sci. 2020, 511, 94–113.
  26. Xiong, H.; Wang, L.; Zhou, Z.; Zhao, Z.; Huang, X.; Kumari, S. Burn after reading: Adaptively secure puncturable identity-based proxy re-encryption scheme for securing group message. IEEE Internet Things J. 2021, 9, 11248–11260.
  27. Ge, C.; Susilo, W.; Baek, J.; Liu, Z.; Xia, J.; Fang, L. Revocable attribute-based encryption with data integrity in clouds. IEEE Trans. Dependable Secur. Comput. 2021, 19, 2864–2872.
  28. Mo, Z.; Qiao, Y.; Chen, S. Two-party fine-grained assured deletion of outsourced data in cloud systems. In Proceedings of the 2014 IEEE 34th International Conference on Distributed Computing Systems, Madrid, Spain, 30 June–3 July 2014; pp. 308–317.
  29. Wang, W.; Li, Z.; Owens, R.; Bhargava, B. Secure and efficient access to outsourced data. In Proceedings of the 2009 ACM Workshop on Cloud Computing Security, New York, NY, USA, 13 November 2009; pp. 55–66.
  30. Alderman, J.; Farley, N.; Crampton, J. Tree-based cryptographic access control. In Proceedings of the European Symposium on Research in Computer Security, Oslo, Norway, 11–15 September 2017; Springer: Cham, Switzerland, 2017; pp. 47–64.
  31. Liu, B.; Blancaflor, E.B. Data Security and Privacy Protection Scheme Based on EC-ElGamal in Federal Learning. SN Comput. Sci. 2025, 6, 170.
  32. Shi, L. Design of Secure Fair Bidding Scheme Based on Threshold Elliptic Curve ElGamal Cryptography. In Proceedings of the International Conference on Emerging Internet, Data & Web Technologies, Naples, Italy, 21–23 February 2024; Springer: Cham, Switzerland, 2024; pp. 163–172.
Figure 1. System Model.
Figure 2. Key Derivation Tree.
Figure 3. Encryption Time Overhead Across Schemes. (a) Encryption of statistical vectors only on Device 1. (b) Homomorphic encryption of statistical vectors with AES-GCM encryption of raw data on Device 1. (c) Encryption of statistical vectors only on Device 2. (d) Homomorphic encryption of statistical vectors with AES-GCM encryption of raw data on Device 2.
Figure 4. Stream Encryption Processing Time Comparison Across Schemes and Devices. (a) Stream encryption processing time on Device 1. (b) Stream encryption processing time on Device 2.
Figure 5. Query Latency versus the Number of Aggregated Chunks.
Figure 6. CIT Construction Time under Different Branching Factors k.
Table 1. Performance comparison of average encryption and decryption latency on two devices.

| Platform | Phase | Ours (μs) | SAHE + HomMAC (μs) | EC–ElGamal (ms) | Paillier (ms) |
|---|---|---|---|---|---|
| Device 1 | Encryption | 33.35 | 33.61 | 0.501 | 131 |
| Device 1 | Decryption | 33.44 | 33.87 | 0.333 | 129 |
| Device 2 | Encryption | 334.42 | 346.06 | 4.92 | 1260 |
| Device 2 | Decryption | 334.84 | 349.07 | 3.54 | 1230 |

Table 2. Scalability of key sharing with respect to the KDT depth on Device 1.

| Depth $h$ | Windows $T = 2^h$ | $\overline{|N_{[t_s, t_e]}|}$ | RawShare Latency (μs) | AggShare Latency (μs) |
|---|---|---|---|---|
| 10 | 1024 | 7.3 | 8.41 | 4.02 |
| 12 | 4096 | 8.6 | 10.27 | 4.11 |
| 14 | 16,384 | 10.2 | 14.63 | 4.19 |
| 16 | 65,536 | 11.9 | 18.73 | 4.24 |

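The relationship between the tree depth and the number of shared nodes in Table 2 can be illustrated with a generic sketch of range-based key sharing over a binary derivation tree. Everything here is an assumption made for illustration: the HMAC-SHA256 child derivation, the helper names (`child_key`, `descend`, `cover`, `window_key_from_share`), and the rule that a time range is shared as its minimal set of covering nodes. The sketch does not reproduce the KDT or the key regression mechanism of the proposed scheme.

```python
import hashlib
import hmac


def child_key(parent: bytes, bit: int) -> bytes:
    """Derive the left (bit=0) or right (bit=1) child key one-way."""
    return hmac.new(parent, bytes([bit]), hashlib.sha256).digest()


def descend(key: bytes, path_bits: int, length: int) -> bytes:
    """Follow `length` branch bits of `path_bits`, most significant first."""
    for shift in range(length - 1, -1, -1):
        key = child_key(key, (path_bits >> shift) & 1)
    return key


def cover(h: int, ts: int, te: int) -> list[tuple[int, int]]:
    """Minimal set of (depth, index) nodes whose leaves are exactly the
    time windows ts..te (inclusive) in a tree with 2**h leaves."""
    nodes = []
    lo, hi, depth = ts, te + 1, h
    while lo < hi:
        if lo & 1:  # right child: its parent would also cover lo - 1
            nodes.append((depth, lo))
            lo += 1
        if hi & 1:  # node left of the exclusive bound is a left child
            hi -= 1
            nodes.append((depth, hi))
        lo >>= 1
        hi >>= 1
        depth -= 1
    return nodes


def window_key_from_share(share_key: bytes, depth: int, index: int,
                          h: int, window: int) -> bytes:
    """Recipient side: derive one window key from a shared node key."""
    below = h - depth
    if window >> below != index:
        raise ValueError("window is not under the shared node")
    return descend(share_key, window & ((1 << below) - 1), below)


if __name__ == "__main__":
    root = bytes(32)  # placeholder root key, for demonstration only
    for h in (10, 12, 14, 16):
        nodes = cover(h, 5, (1 << h) - 300)
        shares = [(d, i, descend(root, i, d)) for d, i in nodes]
        print(f"h={h} shared node keys={len(shares)}")
```

Under these assumptions the data owner sends at most about 2h node keys for any range, and the recipient derives each window key locally, so the number of shared keys grows with the depth h rather than with the number of windows T.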
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, Z.; Xu, J.; Wu, F.; Sun, C.; Wu, X.; Fang, X. Secure Streaming Data Encryption and Query Scheme with Electric Vehicle Key Management. Information 2026, 17, 18. https://doi.org/10.3390/info17010018
