Privacy-Preserving Smart Metering with Authentication in a Smart Grid

The traditional security objectives of smart grids have been availability, integrity, and confidentiality. However, as the grids incorporate smart metering and load management, user and corporate privacy is increasingly becoming an issue in smart grid networks. Although transmitting current power consumption levels to the supplier or utility from each smart meter at short intervals has an advantage for the electricity supplier’s planning and management purposes, it threatens user privacy by disclosing fine-grained consumption data and usage behavior to utility providers. In this study, we propose a distributed incremental data aggregation scheme where all smart meters on an aggregation path are involved in routing the data from the source meter to the collection unit. User privacy is preserved by symmetric homomorphic encryption, which allows smart meters to participate in the aggregation without seeing any intermediate or final result. Aggregated data is further integrated with an aggregate signature to achieve data integrity and smart meter authentication in such a way that dishonest or fake smart meters cannot falsify data en route. Only the collection unit can obtain the aggregated data and verify its integrity while the individual plain data are not exposed to the collection unit. Therefore, user privacy and security are improved for the smart metering in a smart grid network. Appl. Sci. 2015, 5 1504


Introduction
Smart grids are envisioned by stakeholders as a next-generation approach to delivering electricity to millions of households. Smart grids incorporate computation and communication technology into traditional power grids, allowing them to become smart and connected. Processing and storage units are embedded in traditional electricity meters and communicate with electrical appliances at home as well as with the generation and management facilities of electric utilities, providing smart grids with great connectivity. With intelligent, networked smart meters, smart grids enable the instant monitoring of power delivery and consumption information, the subscription of power usage, remote operation, advanced demand and outage management, and usage management.
The rollout of smart meters has already begun, and this new technology has been rapidly adopted in countries throughout the world. For example, the United States and the European Union currently promote the deployment of smart grids. However, despite the numerous advantages offered by smart grids, security and privacy concerns have arisen [1][2][3]. Smart meters are expected to automatically provide accurate readings at requested time intervals to the utility company or electricity distribution network. Frequent and detailed meter readings can be used to optimize the grid but can also reveal behavioral patterns and leak personal information. Fine-grained electricity consumption profiles differ between devices. Thus, given a sufficient resolution on the time axis, electric utilities could learn the daily energy usage patterns of a household and even go so far as to deduce whether the inhabitants are at home, when they get up and go to bed, and what kind of devices are being used at which time, among other things [2][3][4]. User privacy concerns have already jeopardized the mandatory deployment of smart meters in the Netherlands, leading to a deployment deadlock in April 2009.
Recently, several studies have proposed solutions to this problem in smart grids [3][4][5][6][7][8]. One promising technical solution to protect user privacy is anonymizing each packet of high-frequency metering data by aggregating them at multiple levels (e.g., neighborhood, subdivision, district, and city) via privacy-protecting cryptographic techniques such as homomorphic encryption [9,10]. In this approach, the data collection unit can obtain sums of the measurements of all the connected smart meters without learning the individual measurements. Additionally, the smart meters involved in the aggregation cannot learn any plain measurements from the other meters. Thus, customer privacy can be protected.
However, most of the well-known homomorphic encryption protocols are malleable [9][10][11]. Thus, given the ciphertext and public key, an adversary could generate another ciphertext which decrypts to another meaningful plaintext in the same domain as the original plaintext. As a result, a dishonest or fake smart meter could falsify data, leading to inaccurate aggregation results. Therefore, it is of great importance to provide efficient and secure data aggregation which guarantees user privacy, authentication, and aggregated data integrity in smart grids.
One straightforward solution to this problem is enabling each smart meter to sign its metering data with a private key so that the collection unit can verify the integrity of data from individual meters and authenticate the source meters with their public keys. However, this approach allows the collection unit to learn the plain meter readings of each smart meter during the verification phase, which violates the privacy of users. On the other hand, when the measurements of smart meters are aggregated using the previous homomorphic encryptions [9][10][11] in order to protect user privacy, a signature scheme cannot enable the collection unit to verify the authenticity of plain (aggregated or individual) data. Therefore, it is challenging for a verifier, that is, a collection unit, to verify the integrity of aggregated metering data from multiple smart meters in a smart grid network without accessing the individual plain data. This is the problem we attempt to solve in this study.

Related Work
The damage that smart meters can cause to user privacy has previously been studied from both a technical and a legal perspective [2,12]. These studies propose the enforcement of privacy based on organizational means, codes of conduct, and regulations, subject to current legislation. Of the technical solutions that have been offered, several recent schemes have suggested protecting user privacy by aggregating individual metering data at a range of levels [3,5,7,8] or by anonymizing high-frequency metering data through the use of pseudonymous identities [4]. However, the importance of data integrity has not drawn a great deal of attention in previous smart grid security literature, despite it being one of the most important security requirements (along with availability and confidentiality) in smart grid cyber security [13].
Bohli et al. [8] introduced a smart metering privacy model to measure the degree of privacy that a smart metering application can provide, and proposed a privacy-preserving aggregation scheme. The basic principle of the scheme is that each electricity meter takes its current reading and adds a random value drawn from a known distribution with a known variance $\sigma^2$ and expectation $\mu$. The sent value is then distributed around the sum of the actual measurement and the expectation of the distribution. However, since the measurements are randomized, there is a trade-off between the precision of individual and aggregated readings. With small aggregation groups, it is not easy to achieve the desired level of privacy, which is a major drawback with regard to the practical deployment of this scheme.
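The randomization idea can be illustrated with a minimal sketch. It is not the protocol of [8]: the normal distribution, the parameter values, and the function names `masked_reading` and `estimate_sum` are assumptions made only for illustration. Each meter adds noise with known expectation $\mu$, and the collector subtracts $n\mu$ from the sum of the n masked values.

```python
import random

def masked_reading(reading, mu, sigma, rng):
    """Return the reading plus noise drawn from Normal(mu, sigma^2) (assumed distribution)."""
    return reading + rng.gauss(mu, sigma)

def estimate_sum(masked, mu):
    """Estimate the true total by removing the expected total noise n * mu."""
    return sum(masked) - len(masked) * mu

rng = random.Random(42)                      # fixed seed so the sketch is repeatable
mu, sigma = 5.0, 1.0                         # hypothetical noise parameters
readings = [rng.uniform(0.2, 3.0) for _ in range(1000)]
masked = [masked_reading(r, mu, sigma, rng) for r in readings]

true_sum = sum(readings)
estimate = estimate_sum(masked, mu)
error = abs(estimate - true_sum)             # standard deviation of the error is sigma * sqrt(n)
print(f"true sum = {true_sum:.1f}, estimate = {estimate:.1f}, error = {error:.1f}")
```

The residual error grows with $\sigma\sqrt{n}$, so hiding individual readings better (larger $\sigma$) costs precision in the aggregate. This is the trade-off noted above.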
Garcia et al. [3] suggested a multiparty computation protocol that allows multiple smart meters in a neighborhood to compute a partial aggregation of their data without disclosing their individual measurements by taking advantage of Paillier's additive homomorphic encryption [9]. The partially aggregated data of every meter is then sent to the local substation, which is the collection unit in the neighborhood, and fully aggregated. This scheme requires that each meter compute $O(N)$ encryptions, and that additional $O(N^2)$ communications between each meter and the local substation are made before sending the fully aggregated result to the electricity utility, where N is the number of smart meters in the neighborhood. Thus, it lacks efficiency in terms of computation and communication overheads. It is also vulnerable to data forgery at the local substation due to the malleability of homomorphic encryption.
Li et al. [5,6] proposed a distributed in-network aggregation scheme for smart grid networks. Data aggregation is performed at all smart meters involved in routing the data from the source meter to the collection unit using Paillier's homomorphic encryption [9]. When $\{m\}_K$ represents the encryption of measurement m under the collection unit's public key K, each intermediate smart meter $s_j$ on the routing path performs $\{m_i\}_K \cdot \{m_j\}_K = \{m_i + m_j\}_K$ and forwards it to the next smart meter, where $m_j$ is the measurement to be sent and $\{m_i\}_K$ is the received ciphertext from the previous smart meter $s_i$. The collection unit can then decrypt the final ciphertext with its private key, and obtain the total sum of the measurements from all of the smart meters on the routing path without knowing individual plain measurements. This scheme is efficient in terms of computation and communication cost because the in-network aggregation is integrated into the data routing process. However, this scheme requires additional cryptographic mechanisms to allow the collection unit to verify the integrity of the aggregated data and authenticate the participating meters because intermediate malicious meters or outside adversaries can easily forge intermediate aggregation data en route.
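The following toy sketch illustrates both the additive property $\{m_i\}_K \cdot \{m_j\}_K = \{m_i + m_j\}_K$ and the forgery risk. It is a textbook Paillier construction with generator n + 1 and deliberately tiny primes, not the implementation of [5,6]. The function names and parameters are illustrative assumptions, and the key size offers no security.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                               # modular inverse (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m, rng):
    """{m}_K = (1 + n)^m * r^n mod n^2, using (1 + n)^m = 1 + m*n mod n^2."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

rng = random.Random(0)
n, priv = keygen(1009, 1013)            # tiny primes, for illustration only
c_i = encrypt(n, 120, rng)              # ciphertext received from s_i
c_j = encrypt(n, 305, rng)              # own measurement of s_j
c_sum = c_i * c_j % (n * n)             # in-network aggregation on ciphertexts
forged = c_sum * (1 + 7 * n) % (n * n)  # anyone can shift the hidden sum by 7

print(decrypt(n, priv, c_sum))          # 425
print(decrypt(n, priv, forged))         # 432
```

The last two lines show why an additional integrity mechanism is needed. The forged ciphertext decrypts to a plausible value, and the collection unit cannot tell it from an honest aggregate.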
Seo et al. [14] proposed a metering data aggregation scheme using Paillier's homomorphic encryption [9]. This scheme suggested an aggregated data verification mechanism using a hash tree, but it is also vulnerable to active attacks by intermediate malicious meters because it grants the task of verification to each meter rather than to the collection unit. Garofalakis et al. [15] and Castelluccia et al. [16] have also proposed verifiable in-network data aggregation schemes in sensor network literature; however, their schemes have some privacy drawbacks. In Garofalakis et al.'s scheme, intermediate aggregators can obtain individual raw data from their child nodes during in-network aggregation. In Castelluccia et al.'s scheme, the final data collection unit can generate all of the symmetric keys shared with each node. Thus, if the collection unit captures any of the encrypted data reported from the leaf nodes, it can obtain the raw data in plaintext by deriving the symmetric key and decrypting the ciphertext with that key. Therefore, these two schemes cannot be directly adapted to privacy-preserving smart metering since they violate user privacy.
When multiple metering data are aggregated using homomorphic encryption, it is impossible to verify the integrity of each measurement and aggregated result without violating user privacy because the verification process needs individual metering data which has not been aggregated. In order for the collection unit to verify the aggregated data, every smart meter $s_i$ should send its $\{m_i\}_K$ individually together with a signature for it, where K is the public key of the collection unit. Although this solution ensures the integrity of metering data, it allows the collection unit to obtain the plain metering data, which violates user privacy. Alternatively, a homomorphic signature algorithm [17] is capable of evaluating multivariate polynomials on signed data. Given the public key and a signed data set, it allows users to delegate computations (e.g., produce a signature on the mean, standard deviation, and other statistics of the signed data) to an untrusted third party while ensuring integrity. Even though the homomorphic signature algorithm seems to enable the collection unit to ensure the integrity of the aggregated data by verifying the homomorphic signature generated for it, the homomorphic signature algorithm cannot be adopted by a smart grid environment. This is because, in the homomorphic signature setting, the delegate can produce a signature for the statistics (such as the sum) of a set of data signed by the same signer, while smart metering applications require a solution for the collection unit to ensure the integrity of the statistics of a set of data signed by different signers or smart meters. Therefore, designing an efficient and secure metering scheme that guarantees sender authentication and the integrity of aggregated data while preserving user privacy is essential.
Recently, several other smart metering schemes were proposed [18][19][20][21][22][23][24]. However, these schemes are either constructed on the basis of public-key-based homomorphic encryption such as Paillier's encryption, which may incur much higher computational overhead on resource-limited meters compared to the symmetric approach in smart grid networks, or do not guarantee data integrity and sender authentication. Fengjun et al. [25] and Rottondi et al. [26] proposed data aggregation schemes which ensure data integrity, but they cannot preserve user privacy during smart metering.

Contribution
To fulfill the somewhat contradictory requirement of enabling the collection unit to verify the integrity of aggregated meter readings without revealing any individual measurements, we propose a novel aggregation scheme with authentication capability for smart metering. In the proposed scheme, metering data on the routing path is aggregated by exploiting a symmetric homomorphic encryption in such a way that the collection unit can obtain the aggregated data without individual plain measurements being revealed to it or intermediate smart meters en route. The collection unit can verify the integrity of the aggregated measurements by adopting an identity-based sequential aggregate signature mechanism. The identity-based sequential aggregate signature scheme enables signers (that is, smart meters) to attest to their different readings while permitting savings on bandwidth and storage in comparison to public-key solutions. Signatures are aggregated one-by-one as the aggregate-so-far moves along the path, as is natural in the routing-based data aggregation application we consider. In addition, the verification process does not reveal plain metering data to the collection unit since individual data is obfuscated from the collection unit's view.
Thus, the proposed scheme is secure against any active attacks such as false data injection or data modification attacks launched by outside adversaries or intermediate compromised smart meters on the routing path. In addition, the proposed scheme enables the collection unit to obtain aggregated data, verify its integrity, and authenticate the participating smart meters without any knowledge of individual plain measurements. Therefore, the proposed scheme is also secure against passive attacks such as eavesdropping by a curious collection unit or smart meters in the smart grid network, which preserves user privacy. As opposed to the previous schemes, the proposed scheme enables the collection unit to verify the integrity of the aggregated data as a whole (rather than independent verification for each individual measurement) by exploiting the aggregate signature. The efficiency analysis and experimental results show that the proposed scheme is more efficient than the previous schemes [3,5,6,21] while still guaranteeing both user privacy and aggregated data integrity.

Organization
The remainder of the paper is organized as follows. Section 2 describes the smart grid system architecture and security requirements. Section 3 reviews the cryptographic background and protocols. In Section 4, we propose our construction. We analyze the efficiency and security of the proposed scheme in Sections 5 and 6, respectively. In Section 7, we conclude the paper.

Smart Grid Architecture
In this section, we describe the smart grid architecture and define its security model.

System Description and Assumptions
Figure 1 shows the architecture of the smart grid network. The architecture consists of the following system entities: smart meters, a concentrator, and central management facilities. Smart meters deployed in houses frequently transmit their meter readings to an electric utility for power transmission and distribution network control purposes through the communication network. Smart meters in the neighborhood communicate with each other and a concentrator, which is the data collection unit, through a wireless mesh network. The concentrator further communicates with the central management facilities such as the electric utility and grid operator through wired communication, and reliably reports the aggregated result of the meter readings in the neighborhood to the facilities. We assume that the central management facilities do not need to know which smart meter generates specific data. They are only interested in the aggregated measurements (that is, the sum of the current electricity consumption of individual smart meters) in some specified neighborhood over a given time period (e.g., every few minutes or hours) and the smart meters these metering data are associated with [3][4][5][8]; they are not interested in the current meter readings of any individual customer. (Our main target environment is one where the metering data need to be reported frequently over relatively short time periods (e.g., every few minutes or hours) for grid network management or power distribution control rather than real-time pricing for billing purposes. For billing purposes, central management facilities may require the aggregate data from individual meters at comparatively longer intervals (e.g., daily or monthly). User privacy would not be violated in this case because outside entities would not have access to any useful information from the measurement on the time axis. Therefore, real-time pricing is not a concern in this study.)
As in Li et al.'s scheme [5,6], data from individual meters in the neighborhood are delivered to the concentrator following aggregation paths, as shown in Figure 2, and it is assumed that the smart meters do not collude with each other in our data aggregation model. We also assume that the concentrator is honest-but-curious. That is, it will honestly execute the assigned tasks and will not collude with other smart meters in the system; however, it would like to learn as much information about individual metering data as possible.

Threat Model and Security Requirements
In smart grid networks, we consider both passive attacks (e.g., eavesdropping) and active attacks (e.g., data manipulation or false data injection) by inside and outside adversaries. When defending against these attacks, the following security properties are required for secure smart metering.
1. Data confidentiality (user privacy): Individual power usage of a smart meter is considered the private information of the owner. Thus, it should not be revealed to the concentrator or other meters on the routing path during data aggregation and delivery, and passive attacks such as eavesdropping should be defended against.
2. Data integrity: Dishonest or compromised smart meters in the network could manipulate the intermediate metering data during aggregation, causing inaccurate aggregation results. Thus, manipulation of the aggregate by active inside attacks from compromised meters should be detected by the concentrator.
3. Sender authentication: To defend against active outside attacks, such as false data injection by outside adversaries, the concentrator should be able to ensure the authenticity of the identities of the smart meters on the routing path that have contributed to the data aggregation.

Notations
If X and Y are strings, then $X \| Y$ denotes the concatenation of X and Y. If S is a set, then $s \xleftarrow{\$} S$ denotes that s is selected uniformly at random from S. For a probabilistic algorithm A, $x \xleftarrow{\$} A$ assigns the output of A to the variable x. If A is deterministic, we drop the dollar sign above the arrow. All algorithms considered in this study are possibly randomized unless indicated otherwise.

Bilinear Pairings
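Let $G_1$, $G_2$, and $G_T$ be three cyclic groups of prime order p. Let u and v be generators of $G_1$ and $G_2$, respectively. A bilinear map e is a map $e : G_1 \times G_2 \to G_T$ with the following properties.
1. Bilinearity: $e(u^a, v^b) = e(u, v)^{ab}$ for all $a, b \in \mathbb{Z}_p$.
2. Non-degeneracy: $e(u, v) \neq 1$.
3. Computability: e can be computed efficiently.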
Like many pairing-based cryptographic protocols, our protocol uses a special form of bilinear map called a symmetric pairing where $G_1 = G_2$. For the remainder of the paper, all bilinear pairings are symmetric, and we denote $G_1 = G_2$ by G.
Weil pairing [27] or Tate pairing [28] on elliptic curves can be used as an efficiently computable non-degenerate bilinear map.
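To make the defining properties concrete, the sketch below checks bilinearity, symmetry, and non-degeneracy on a toy map. It is not a Weil or Tate pairing and is completely insecure, since discrete logarithms in the source group are trivial. The group G is taken to be the additive group of integers modulo P, so that "$u^a$" becomes $a \cdot u$. The constants `P`, `Q`, `H` and the function `pairing` are assumptions chosen only for illustration.

```python
P = 11   # prime order of the groups
Q = 23   # prime with P dividing Q - 1
H = 2    # element of order 11 in the multiplicative group modulo 23

def pairing(a, b):
    """Toy symmetric map on G = (Z_P, +): e(a, b) = H^(a*b) mod Q."""
    return pow(H, (a * b) % P, Q)

u = v = 1                      # generator of the additive group Z_P
base = pairing(u, v)

# Bilinearity: e(a*u, b*v) == e(u, v)^(a*b) for all scalars a, b.
bilinear = all(
    pairing(a * u % P, b * v % P) == pow(base, a * b, Q)
    for a in range(P) for b in range(P)
)
# Symmetry: e(a, b) == e(b, a).
symmetric = all(pairing(a, b) == pairing(b, a) for a in range(P) for b in range(P))
# Non-degeneracy: the pairing of the generators is not the identity of G_T.
non_degenerate = base != 1

print(bilinear, symmetric, non_degenerate)
```

A practical instantiation would replace this toy map with a pairing on elliptic curves, as stated above.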

Computational Diffie-Hellman Problem
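We first recall the well-known computational Diffie-Hellman (CDH) problem in the group G of prime order p. For a generator $g \in G$, we define the CDH-advantage of algorithm A as
$$\mathrm{Adv}^{\mathrm{cdh}}_{A} = \Pr\left[ A(g, g^a, g^b) = g^{ab} : a, b \xleftarrow{\$} \mathbb{Z}_p \right].$$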

Identity-based Sequential Aggregate Signature (IBSAS) Problem
We introduce the CDH-type computational problem IBSAS by recapitulating the definition in [29], on which the security of the identity-based sequential aggregation scheme is based.
For all $a_1, b_1, a_2, b_2 \in \mathbb{Z}_p$ and generator $g \in G$, the associated oracle is defined as in [29]. We then define the IBSAS-advantage of an algorithm A as in [29], where we require that the oracle was not queried about $m \in \mathbb{Z}_p$ by A. Boldyreva et al. [29] showed that the IBSAS problem is hard in the generic bilinear group model.

Homomorphic Encryption
Homomorphic encryption represents a group of semantically secure encryption functions that allow certain algebraic operations on the plaintext to be performed directly on the ciphertext. Homomorphic encryption is useful in scenarios where someone who does not have a decryption key needs to perform arithmetic operations on a set of ciphertexts. Thus, homomorphic encryption is usually used for privacy-preserving operations (e.g., voting) in which operations are performed but operands are not disclosed. The most common definition is as follows.
Let $\{\cdot\}_k$ denote a probabilistic encryption scheme under key k. Let $\mathcal{M}$ denote the message space and $\mathcal{C}$ the ciphertext space. An encryption scheme is said to be homomorphic if for given $\{m_1\}_{k_1} = c_1$ and $\{m_2\}_{k_2} = c_2$, there exists a key k such that
$$\{m_1 \odot_{\mathcal{M}} m_2\}_k = c_1 \odot_{\mathcal{C}} c_2$$
for some operators $\odot_{\mathcal{M}}$ in $\mathcal{M}$ and $\odot_{\mathcal{C}}$ in $\mathcal{C}$. If $(\mathcal{M}, \odot_{\mathcal{M}})$ and $(\mathcal{C}, \odot_{\mathcal{C}})$ are groups, we have a group homomorphism. We say a scheme is additively homomorphic if we consider addition operators, and multiplicatively homomorphic if we consider multiplication operators.

Aggregate Signatures
Aggregate signatures [29,32,33,34] allow multiple signers to sign different messages while keeping the total signature size constant. Each signer forwarding a message adds its signature to the label of the next signer on the advertised route, so that route authenticity can be verified upon receipt of the aggregate. Thus, aggregate signatures can be used to prevent unauthorized parties from extending the path and enable signers to sign their own messages.
However, solutions based on public key infrastructure (PKI) [32][33][34] incur overhead in the distribution of the public key and certificate of each user to all other users. Boldyreva et al. [29,35] proposed identity-based sequential aggregate signature schemes for routing-based applications. (Recently, Hwang et al. [36] have proven that the assumption on which the aggregate signature scheme proposed in [35] was based is false and that the scheme is universally forgeable, i.e., anyone can generate forged signatures on any message of their choice.) In identity-based cryptography, an arbitrary identity string acts as a user's public key, and verifying a signature only requires knowledge of a sender's identity. The overhead associated with obtaining and storing identity-based keys is typically much smaller than that of obtaining traditional public keys and certificates for the signers, since the identity-based setting eliminates the need for certificates. Thus, an identity-based solution can offer a superior alternative to previous PKI-based approaches.
An identity-based sequential aggregate signature suits the verification of metering data aggregates in smart grid networks because signatures are aggregated one-by-one as the aggregate-so-far moves along the path, which is natural in the routing-based applications we consider. It simplifies key management and reduces storage overhead for smart meters, while providing a way for the concentrator to verify the integrity of the aggregate and authenticate the participating meters on the routing path. Thus, we will adopt an identity-based sequential aggregate signature scheme [29] in our construction.

Privacy-Preserving Smart Metering with Authentication Capability
In this section, we propose a privacy-preserving smart metering scheme. The proposed scheme consists of aggregation path generation and path key establishment followed by data encryption (using symmetric homomorphic encryption [37]) and signature generation (using an identity-based sequential aggregate signature algorithm [29]). The proposed scheme allows the concentrator to aggregate metering data from smart meters on the routing path and verify the authenticity of the aggregated result. Thus, both user privacy and data integrity are achieved during the reporting of metering data.

Symmetric Homomorphic Encryption
To tackle the authentication issue introduced by previous aggregation schemes that use Paillier's asymmetric homomorphic encryption [9], in which signature generation and verification could not be integrated into an aggregation scheme without revealing individual (i.e., not aggregated) metering data to the concentrator, we exploit symmetric additively homomorphic encryption [37]. This encryption algorithm is a slightly modified version of a stream cipher, replacing the exclusive-OR operation with modular addition. It is provably secure and allows the efficient aggregation of encrypted data. It is composed of the following two algorithms: Enc and Dec.
1. Enc. Represent the message as an integer $m \in [0, M-1]$, where M is a large integer, and let k be a randomly generated keystream with $k \in [0, M-1]$. The ciphertext is $c = \mathrm{Enc}_k(m) = m + k \bmod M$.
2. Dec. The plaintext is recovered as $\mathrm{Dec}_k(c) = c - k \bmod M$.
Additive homomorphism can be achieved simply. Let $c_1 = \mathrm{Enc}_{k_1}(m_1)$ and $c_2 = \mathrm{Enc}_{k_2}(m_2)$. Then, for $k = k_1 + k_2$, $\mathrm{Dec}_k(c_1 + c_2) = m_1 + m_2$. In practice, if n different ciphertexts are added, M should be larger than $\sum_{i=1}^{n} m_i$ for correct decryption. It is important to note that each ciphertext that needs to be added can be encrypted under different keys, as opposed to Paillier's encryption.
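As a minimal sketch of this primitive, the code below assumes that Enc adds the key to the message modulo M and Dec subtracts it, as the description of a stream cipher with modular addition suggests. The modulus and the function names `enc` and `dec` are illustrative choices, not taken from [37].

```python
import secrets

M = 2**32  # modulus, assumed larger than any sum of plaintexts that will be aggregated

def enc(m, k):
    """Enc_k(m) = m + k mod M."""
    return (m + k) % M

def dec(c, k):
    """Dec_k(c) = c - k mod M."""
    return (c - k) % M

# Two meters encrypt under independent, freshly drawn keys.
k1, k2 = secrets.randbelow(M), secrets.randbelow(M)
m1, m2 = 1500, 2750
c1, c2 = enc(m1, k1), enc(m2, k2)

# Ciphertexts are added without any key; the sum decrypts under k1 + k2.
aggregate = (c1 + c2) % M
recovered = dec(aggregate, (k1 + k2) % M)
print(recovered)  # 4250
```

The party that adds `c1` and `c2` needs neither key, and the party that holds only `k1 + k2` can recover the sum but neither individual reading.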
Theorem 1. The above encryption scheme is perfectly secure. (A cryptosystem is perfectly secure if the a posteriori probability that the plaintext is x, given that the ciphertext y is observed, is identical to the a priori probability that the plaintext is x.)
Proof. The homomorphic encryption described above is a simple generalization of the stream cipher, where bits are replaced by integers modulo M. In terms of security, it has exactly the same properties as a stream cipher, that is, perfect secrecy if and only if the keystream is random, of the same length as the plaintext, and used only once. Thus, its security can be proven using a similar proof. The security relies on two features: the key changes from one message to another, and all of the operations are performed modulo the integer M.
Intuitively, if we have $\Pr[\mathrm{Enc}_k(m_1) = c] = \Pr[\mathrm{Enc}_k(m_2) = c]$ for every $m_1, m_2$ in the plaintext space, the encryption is perfectly secure because every plaintext is equally likely to be encrypted to a given ciphertext. The formal security proof is given in [37].
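The intuition can be made explicit with a short calculation, under the assumption that encryption is $\mathrm{Enc}_k(m) = m + k \bmod M$ with k drawn uniformly from $\{0, \ldots, M-1\}$ and used once. For any fixed plaintext m and ciphertext c,

$$\Pr_k[\mathrm{Enc}_k(m) = c] = \Pr_k[k = c - m \bmod M] = \frac{1}{M},$$

which does not depend on m. Hence $\Pr[\mathrm{Enc}_k(m_1) = c] = \Pr[\mathrm{Enc}_k(m_2) = c]$ for all $m_1, m_2$, and observing c does not change the probability of any plaintext. The argument fails as soon as a key is reused: from $c = m + k$ and $c' = m' + k$, an observer learns $c - c' = m - m' \bmod M$.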

Identity-Based Sequential Aggregate Signature
The aggregate signature scheme consists of the following four algorithms: Setup, KeyDer, Sign, and Vf.
1. Setup. The trusted key generation center (KGC) initially runs the Setup algorithm to generate the master public key mpk and master secret key msk.
2. Key Derivation. The KGC runs the KeyDer algorithm on the input of msk and user identity $ID \in \{0, 1\}^*$, and outputs the private key $sk_{ID}$ for the user's ID.
3. Sign. A signer with ID runs the Sign algorithm on the input of $sk_{ID}$, message $m \in \{0, 1\}^*$, aggregate-so-far $\sigma$, and list of identity-message pairs $L_{i-1} = ((ID_1, m_1), \ldots, (ID_{i-1}, m_{i-1}))$, and returns a new aggregate signature $\sigma'$, or $\perp$ if any of the inputs are invalid.
4. Verification. The verifier runs the Vf algorithm on the input of mpk, list of identity-message pairs $L_n = ((ID_1, m_1), \ldots, (ID_n, m_n))$, and aggregate signature $\sigma$, and returns 1 or 0.

← Setup
For all i = 1, ..., n do
The proposed scheme is constructed based on the primitive identity-based aggregate signature protocol in [29]. The following theorem shows that the signing and verification algorithms in the proposed scheme are secure in the random oracle model if the IBSAS problem is hard relative to its associated bilinear-group generator G.
Theorem 2. Let G be a bilinear-group generation algorithm and AS be the associated identity-based sequential aggregate signature scheme given by Sections 4.2.3 and 4.2.4. Suppose there exists a forger F against AS in the random oracle model that makes at most $q_{h_1}$ queries to its hash oracles, at most $q_k$ queries to its key derivation oracle, at most $q_s$ queries to its signing oracle, and outputs lists of length at most $n_{max}$. There is then an algorithm B for the IBSAS problem relative to G such that where $l_{min}(G)$ is the minimum bit-length of the order p of a bilinear group output by G.
Proof. The proof is given in [29].

Scheme Construction
In this section, we construct our metering data aggregation scheme with authentication capability.

System Setup
In the initial system setup phase, system security parameters and users' secret keys are set up. Aggregation paths are then set up as follows.
The KGC then runs the KeyDer algorithm on the input of msk and $ID \in \{0, 1\}^*$, and returns secret key $sk_{ID} = (H_1(ID)^{\alpha_1}, H_2(ID)^{\alpha_2})$ to the smart meter associated with identity ID. Henceforth, $ID_i$ denotes the identity of a smart meter $s_i$.

Aggregation Path Setup
The aggregation path is constructed so that all of the smart meters in a neighborhood are covered. Figure 2 shows an example of the aggregation path constructed in a neighborhood. If we consider the smart meter network as a connected graph where smart meters are vertices and wireless links between any two meters are edges, the aggregation tree is constructed as a spanning tree of the graph rooted at the concentrator. The aggregation task is performed on the basis of each aggregation path. All or a subset of the smart meters on the aggregation path participate in the aggregation task. The construction of an efficient aggregation path is beyond the scope of this study.
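As an illustration of the graph view only, the sketch below builds one possible spanning tree rooted at the concentrator with breadth-first search and reads off an aggregation path. The choice of breadth-first search, the topology, and the names `aggregation_tree` and `path_to_root` are assumptions; the paper leaves the construction of efficient paths out of scope.

```python
from collections import deque

def aggregation_tree(links, root):
    """Return {node: parent} for a breadth-first spanning tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(links.get(node, ())):  # sorted for a deterministic tree
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

def path_to_root(parent, node):
    """Follow parent pointers from a meter up to the concentrator."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

# Hypothetical neighborhood: "C" is the concentrator, the rest are smart meters.
links = {
    "C": {"s3", "s5"},
    "s3": {"C", "s2", "s5"},
    "s5": {"C", "s3", "s4"},
    "s2": {"s3", "s1"},
    "s1": {"s2"},
    "s4": {"s5"},
}
tree = aggregation_tree(links, "C")
print(path_to_root(tree, "s1"))  # ['s1', 's2', 's3', 'C']
```

Each leaf-to-root path in `tree` corresponds to one aggregation path in the sense used above.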
Without loss of generality, we will describe the data encryption/decryption and signature generation/verification algorithms below under the assumption that a set of smart meters are deployed on the aggregation path to the concentrator in the order $s_1, \ldots, s_n$.

Data Encryption
After system setup, smart meters and the concentrator establish a path key, which is unique for each aggregation path. Each smart meter then delivers encrypted metering data to the concentrator following the aggregation path. A new path key is established whenever the aggregation path is changed (e.g., when the smart grid network topology changes).

Path Key Establishment
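When a set of smart meters are deployed on the aggregation path to the concentrator in the order $s_1, s_2, \ldots, s_n$, the path key is constructed as follows:
1. The first smart meter $s_1$ selects $k_{init} \xleftarrow{\$} \mathbb{Z}_M$, where M is a large integer. Then, it securely sends $k_{init}$ to the concentrator via a direct secure connection, and securely sends $K_1 = k_{init} + k_1 \bmod M$ to $s_2$, where $k_1$ is its own secret key.
2. For $2 \le i \le n$, $s_i$ computes $K_i = K_{i-1} + k_i \bmod M = k_{init} + \sum_{j=1}^{i} k_j \bmod M$, where $k_i$ is its own secret key, and forwards it to $s_{i+1}$ securely. (For simplicity, we assume that $s_{n+1}$ denotes the concentrator.)
3. When the concentrator receives $K_n$ from $s_n$, it computes $PK = K_n - k_{init} \bmod M = \sum_{i=1}^{n} k_i \bmod M$ and stores PK as a path key for the aggregation path.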
For instance, in Figure 2, the concentrator would obtain the following path keys:
In the above path key establishment procedure, we assume that the secure channels between each entity can be established using existing secure protocols such as SSL/TLS. The secure channels between each entity are used only for the secret value transmission at the path key setup phase. After the path key establishment, the secure channels are not used during the metering data aggregation procedure for efficiency. We also assume that every smart meter on the aggregation path participates in the path key computation and delivery process exactly once for each path key establishment.
The security of the path key establishment protocol is guaranteed by the randomly generated initial key $k_{init}$ of the leaf smart meter ($s_1$ in the above setting) in the aggregation tree. Since $k_{init}$ is added to the real secret keys, the possibility of guessing $k_{init}$ and the real secret path keys depends on the length of the secret keys. Thus, it is impossible to determine either $k_{init}$ or the path keys from their composed values when $M$ is large. When an aggregation path consists of $n$ intermediate meters to the concentrator and the attacker can compromise $m$ meters, the attacker can guess neither the path key nor the secret keys of the non-compromised meters as long as $m < n$.
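The chained construction of the path key can be sketched in a few lines of Python. This is a minimal illustration, assuming that each meter adds its secret key to the running value modulo $M$ and that the concentrator removes $k_{init}$ at the end; the function name `establish_path_key`, the modulus value, and the use of the `secrets` module are assumptions made for the example, and the secure channels are not modeled.

```python
import secrets

M = 2**32  # modulus; an assumed value chosen only for illustration


def establish_path_key(meter_keys, modulus=M):
    """Simulate path key establishment along s_1, ..., s_n.

    meter_keys[i] is the secret key of the (i+1)-th meter on the path.
    Returns (k_init, forwarded_values, path_key).
    """
    # s_1 picks a random initial key that masks the running sum.
    k_init = secrets.randbelow(modulus)
    running = k_init
    forwarded = []
    for k_i in meter_keys:
        # Each meter adds its own key to the value received from its predecessor.
        running = (running + k_i) % modulus
        forwarded.append(running)  # K_i, passed on over a secure channel
    # The concentrator holds k_init (sent directly by s_1) and K_n (from s_n).
    path_key = (forwarded[-1] - k_init) % modulus
    return k_init, forwarded, path_key


keys = [secrets.randbelow(M) for _ in range(5)]  # hypothetical meter keys
k_init, forwarded, pk = establish_path_key(keys)
assert pk == sum(keys) % M
```

Because every forwarded value is offset by the random $k_{init}$, an intermediate meter that sees $K_{i-1}$ learns only a masked sum of the keys of its predecessors.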

Data Encryption
For $1 \le i \le n$, $s_i$ encrypts its measurement $m_i$ as $c_i = Enc_{k_i}(m_i)$ and forwards $L_i = (L_{i-1}, (ID_i, c_i))$ (for the first smart meter, i.e., if $i = 1$, $L_0$ is defined as null) to the subsequent smart meter $s_{i+1}$ together with the aggregate signature $\sigma$ (which will be generated below) until it reaches the concentrator.

Signing
For $1 \le i \le n$, $s_i$ generates the aggregate signature by running the Sign algorithm on the input of $sk_{ID_i}$, $c_i$, $L_{i-1}$, and $\sigma$, which is the aggregate-so-far signature (for the first smart meter $s_1$, $\sigma$ is defined as $(1_G, 1_G, 1_G)$). $s_i$ first parses $\sigma$ as $(\sigma_1, \sigma_2, \sigma_3)$ and computes the updated values of the three components. It then sets $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ using the updated values and forwards it to the next smart meter $s_{i+1}$ together with $L_i$.

Verification
When the concentrator receives $L_n$ and $\sigma$ from $s_n$, the aggregate signature $\sigma$ has a form determined by $r = \sum_i r_i$ and $x = \sum_i x_i$. The concentrator then parses $\sigma$ as $(\sigma_1, \sigma_2, \sigma_3)$ and runs the Vf algorithm to check a pairing equation on $e(\sigma_1, g)$. If the equation holds, the algorithm returns 1; otherwise, it returns 0.

Data Decryption
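If the verification algorithm returns 1, the concentrator runs the Dec algorithm on the input of $L_n$ and $PK (= k_1 + \cdots + k_n)$, and outputs decrypted data by computing

$$Dec_{PK}\Big(\sum_{i=1}^{n} c_i\Big) = \sum_{i=1}^{n} c_i - PK \pmod{M}.$$

We can easily check the correctness of the decryption in the following manner:

$$\sum_{i=1}^{n} c_i - PK = \sum_{i=1}^{n} (m_i + k_i) - \sum_{i=1}^{n} k_i = \sum_{i=1}^{n} m_i \pmod{M}.$$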
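The encryption and aggregate decryption steps can be illustrated with a short Python sketch. It assumes the additive cipher $Enc_k(m) = m + k \pmod{M}$ used in this paper; the modulus value, the meter identifiers, and the sample readings are hypothetical, and signature handling is omitted.

```python
import secrets

M = 2**32  # assumed modulus; it must exceed the largest possible sum of readings


def enc(m, k, modulus=M):
    """Additive symmetric encryption: c = m + k (mod M)."""
    return (m + k) % modulus


def dec_aggregate(pairs, path_key, modulus=M):
    """Decrypt the sum of all ciphertexts in L_n with the path key.

    pairs is the list L_n of (meter_id, ciphertext) tuples.
    """
    total = sum(c for _, c in pairs) % modulus
    return (total - path_key) % modulus


readings = [120, 75, 310, 42]                    # hypothetical measurements m_i
keys = [secrets.randbelow(M) for _ in readings]  # per-meter keys k_i
L_n = [(f"meter-{i + 1}", enc(m, k))
       for i, (m, k) in enumerate(zip(readings, keys))]
PK = sum(keys) % M                               # path key held by the concentrator

# The concentrator recovers only the total, never an individual reading.
assert dec_aggregate(L_n, PK) == sum(readings)
```

The result equals the true total only while the sum of the readings stays below the modulus, which is why $M$ is chosen to be large.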

Discussion
In the proposed scheme, smart meters forward their own encrypted measurements, as well as those received from their child meters in the aggregation tree. Upon receipt of all the encrypted measurements for the aggregation path, the concentrator decrypts the aggregated data with the corresponding path key. In this approach, smart meters that are closer to the concentrator send and receive up to several orders of magnitude more bits than those on the periphery of the spanning tree. Thus, there is an imbalance between the smart meters in terms of the amount of data communicated.
One solution to this is to construct the spanning tree using a breadth-first traversal of the graph, starting at the concentrator. In this way, the height of the tree is the same as the shortest distance from the furthest meter to the concentrator. This reduces the maximum number of hops for the longest aggregation path, thereby reducing the end-to-end aggregation time and partially relieving (but not perfectly solving) the imbalance problem in terms of communication overhead.
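A breadth-first construction of the spanning tree can be sketched as follows. This is only an illustration of the idea; the function name `bfs_spanning_tree` and the six-node topology are hypothetical and do not correspond to Figure 2.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return (parent, depth) maps of a BFS spanning tree rooted at `root`."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:  # first visit gives the shortest hop count
                parent[neighbor] = node
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return parent, depth


# Hypothetical neighborhood: "C" is the concentrator, s1..s5 are smart meters.
links = {
    "C": ["s1", "s2"],
    "s1": ["C", "s2", "s3"],
    "s2": ["C", "s1", "s4"],
    "s3": ["s1", "s5"],
    "s4": ["s2", "s5"],
    "s5": ["s3", "s4"],
}
parent, depth = bfs_spanning_tree(links, "C")
tree_height = max(depth.values())  # hops from the furthest meter to the concentrator
```

Following `parent` from any meter back to `"C"` yields that meter's aggregation path, and `tree_height` bounds the number of hops on the longest path.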
Another solution to the problem is for the smart meters to aggregate their own encrypted measurement with that received from the child meter. That is, smart meter $s_i$ computes $c_j + c_i$, where $c_j$ is the ciphertext received from its child meter $s_j$, and forwards the sum, rather than forwarding $c_j$ and $c_i$ individually. This approach completely solves the imbalance in the smart grid network; however, it does not allow the concentrator to verify the measurements due to the malleability property of homomorphic encryption. Therefore, there is a trade-off between communication overhead and data authentication.
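The trade-off can be made concrete with a small Python sketch of in-network aggregation under the additive cipher $c = m + k \pmod{M}$. The keys, readings, and the offset `delta` are hypothetical values chosen for illustration; the point is only that a relay can shift the aggregate without knowing any key.

```python
M = 2**32  # assumed modulus for illustration


def enc(m, k, modulus=M):
    """Additive symmetric encryption: c = m + k (mod M)."""
    return (m + k) % modulus


keys = [1111, 2222, 3333]   # hypothetical per-meter keys
readings = [100, 200, 300]  # hypothetical measurements
ciphertexts = [enc(m, k) for m, k in zip(readings, keys)]

# In-network aggregation: each meter adds its ciphertext to the running sum,
# so every hop forwards a single value of constant size.
aggregate = 0
for c in ciphertexts:
    aggregate = (aggregate + c) % M

# A dishonest relay adds an offset to the aggregate without knowing any key.
delta = 50
tampered = (aggregate + delta) % M

path_key = sum(keys) % M
honest_total = (aggregate - path_key) % M
tampered_total = (tampered - path_key) % M

assert honest_total == 600
assert tampered_total == 650  # decrypts to a plausible but false total
```

Both values decrypt to well-formed totals, so the concentrator cannot distinguish them unless the individual ciphertexts are forwarded and covered by the aggregate signature.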

Efficiency
In this section, we first analyze and compare the efficiency of the proposed scheme with the previous aggregation schemes, that is, Garcia et al.'s scheme (GJ) [3], Li et al.'s scheme (LLL) [5,6], and Li et al.'s scheme (LLYLLS) [21], from a theoretical perspective. We also discuss its efficiency when implemented with specific parameters.

Security Property
Table 1 shows the security properties that each scheme supports. The previous schemes [3,5,6] guarantee data confidentiality in that a passive eavesdropper cannot obtain plain metering data during the aggregation process. Thus, they provide user privacy against curious smart meters or the concentrator. However, they have an inherent weakness against active attacks due to the malleability property of the homomorphic encryption algorithm. Given the ciphertext and public key, an adversary could generate another ciphertext which decrypts to another meaningful plaintext in the same domain as the original plaintext.
Table 1. Security property comparison.

| Scheme | Data Confidentiality (User Privacy) | Integrity | Sender Authentication |
|---|---|---|---|
| GJ [3] | yes | no | no |
| LLL [5,6] | yes | no | no |
| LLYLLS [21] | yes | yes | yes |
| Proposed | yes | yes | yes |

Unlike the previous schemes [3,5,6], the proposed scheme guarantees aggregated data integrity and smart meter authentication through the aggregate signature mechanism, as well as data confidentiality. Thus, the proposed scheme is secure against active attacks such as data manipulation by a dishonest or fake smart meter, or false data injection by outside adversaries, in addition to passive attacks. Li et al.'s scheme [21] also guarantees all the security properties, but it allows the verifier to check the integrity of individual metering data, not of the aggregated data. This incurs much higher communication and computation costs compared to the proposed scheme, as we will analyze in the next section. A more formal security analysis of the proposed scheme will be given in Section 6.

Theoretical Analysis
The theoretical efficiency comparison of the schemes is summarized in Table 2. The notations used in the table are:
$C_{ID}$: bit size of an identity string
$C_t$: bit size of a timestamp
$C_m$: bit size of a message (= symmetric key size)
$N$: the number of smart meters participating in the aggregation
Each scheme is analyzed with regard to communication, message size, signature size, private key size, and public key size. Communication represents the number of message exchanges between smart meters and the concentrator that are required for a metering data aggregation in a neighborhood. Message represents the total size of the ciphertext and the identity information that each smart meter is required to send for an aggregation. Signature represents the size of the signatures that are delivered to the concentrator for an aggregation. Private and public key represent the number of private keys each smart meter stores and the number of public keys in the system for data aggregation in a neighborhood, respectively.
In Garcia et al.'s scheme [3], every smart meter in the neighborhood encrypts its own metering data with the other meters' public keys and performs a three-way handshake with the concentrator in order to aggregate metering data in the neighborhood. Thus, the scheme needs $O(N)$ pairs of public keys of all smart meters participating in the aggregation, and $O(N^2)$ communications. In Li et al.'s scheme [5,6], each smart meter encrypts its measurement with the aggregate-so-far message under the concentrator's public key in the hop-by-hop approach following the aggregation tree. Since the smart meters do not need to decrypt messages during aggregation, they are not given private keys. In Li et al.'s scheme [21], each meter in the same area network sends its data and signature to the local gateway directly, which is then aggregated with those from the other meters and delivered to the final concentrator.
In the proposed scheme, $N + 1$ more communications are needed than in Li et al.'s scheme [5,6] in order to establish a path key for an aggregation path. The signature size is constant regardless of the number of signers because each signature is aggregated. Since every smart meter concatenates its data to the list of identity-ciphertext pairs and forwards it to the next smart meter until it reaches the concentrator, the message size a smart meter is required to send increases in proportion to its proximity to the concentrator. As we discussed in Section 4.3, this communication overhead is caused by security requirements that ensure data integrity and sender authentication.
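The growth of the per-meter message size along a path can be illustrated numerically. The sketch below is an assumption-laden accounting rather than a reproduction of Table 2: it considers a single linear path, counts only identity-ciphertext pairs plus one aggregate signature, and assumes that the signature consists of three elements of $G$. The bit sizes are those used in the simulation of this section; the function names are hypothetical.

```python
C_ID = 64  # bits per identity string
C_M = 32   # bits per ciphertext (same as the message size)
C_G = 512  # bits per element of G at the 80-bit security level

# Assumption: sigma = (sigma_1, sigma_2, sigma_3) is three elements of G.
SIGNATURE_BITS = 3 * C_G


def bits_sent_by_meter(i):
    """Bits forwarded by the i-th meter on a linear path (i = 1 is the leaf)."""
    # The list L_i holds i identity-ciphertext pairs; the signature size is constant.
    return i * (C_ID + C_M) + SIGNATURE_BITS


def total_bits(n):
    """Total bits sent by all n meters on the path for one aggregation."""
    return sum(bits_sent_by_meter(i) for i in range(1, n + 1))


sizes = [bits_sent_by_meter(i) for i in (1, 10, 100)]
```

Under these assumptions the list portion grows linearly with the position on the path while the signature portion stays fixed, which matches the imbalance described above.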
If we do not consider data integrity and sender authentication, as in the previous schemes [3,5,6], the message size and private key size in the proposed scheme would be reduced to C ID + C m and C m , respectively, without any signature and public key overhead because each meter s i can perform c i = Enc k i (m i ) and forward the aggregated result C + c i to the next meter until it reaches the concentrator, where C is the ciphertext aggregated so far.This approach would thus be the most efficient of the schemes and resolve the imbalance problem (but would not be secure against active attacks, as with the previous schemes).
Figures 3 and 4 illustrate the same experimental results on different scales for clear comparison. They show the total message size for metering data aggregations in relation to the number of participating smart meters. (For the proposed scheme, the signature is included in the result.) The message sizes in Figures 3 and 4 are represented in bits on a linear scale and a log scale, respectively. In the simulation, we set C G = 512, C p = 160, C n = 160 2 , C n 2 = 160 4 , C ID = 64, C t = 32, and C m = 32. Each element size in the groups is chosen for an 80-bit security level. As Figure 3 demonstrates, the proposed scheme requires the least communication overhead since the message size is determined only by the number of meters in the aggregate group and the size of an identity and a message (that is, the metering data), which are much smaller than the size of an element in $Z^*_{n^2}$. Therefore, efficiency is much improved in the proposed scheme in terms of communication overhead.

Implementation
Next, we analyze the computation cost to aggregate metering data and verify their authenticity in a neighborhood. In the proposed scheme, the stream cipher is used to encrypt metering data from each smart meter and decrypt the aggregated metering data at the concentrator. A block cipher like AES is typically 100 times faster than RSA encryption and 2000 times faster than RSA decryption, with about 60 MB per second on a modest platform [38]. Stream ciphers are even faster, and our symmetric encryption is able to encrypt and decrypt 100 MB per second or more. Therefore, the computation overhead for encryption and decryption operations in the proposed scheme is negligible compared to the previous schemes [3,5,6] using Paillier's asymmetric homomorphic encryption [9,11]. The most time-consuming operations in the proposed scheme are signature generation (by smart meters) and verification (by the concentrator), which are based on pairing computations. Thus, we measure only the asymmetric computations required for signature generation and verification in the proposed scheme.
We used a Type A curve (in the pairing-based cryptography (PBC) library [39]) providing groups in which bilinear map e : G × G → G T is defined.Although such curves provide good computational efficiency (especially for pairing computation), the same does not hold from the point of view of the space required to represent group elements.Indeed, each element of G requires 512 bits at an 80-bit security level and 1536 bits when 128-bit security is chosen.
Table 3 shows the computational times for signature generation by a smart meter and signature verification by a concentrator. For each operation, we include benchmark timing. Each cryptographic operation was implemented using the PBC library version 0.4.18 [39] on a PC with a 3.0 GHz processor. The public key parameters were selected to provide 80-bit security. The implementation uses a 160-bit elliptic curve group based on the supersingular curve $y^2 = x^3 + x$ over a 512-bit finite field. The computational cost is analyzed in terms of the pairing and exponentiation operations in $G$ and $G_T$. The comparatively negligible hash operations are ignored. As shown in Table 3, each smart meter needs a constant time period to generate an aggregate signature for its data, which is independent of the number of smart meters in the neighborhood. The concentrator, on the other hand, needs a time period that grows linearly with the number of meters in order to verify the aggregate signature. However, this cost could be alleviated by reducing the aggregate group size in practice (though the size should be set at more than two to provide user privacy).
In practice, the size of the aggregate group would be determined on the basis of the size of the smart grid system. For example, the aggregate group size can be determined by considering the hop counts (between the smart meter and the collection unit) and the grid network delay [5,6]. When an aggregate group consists of $N$ smart meters and the aggregation tree is constructed as a complete binary tree (the root node is the concentrator in the tree), the maximum hop count from a meter to the concentrator is $\log_2(N + 2) - 1$. Since the average hop count is 15, and the network delay for each hop is currently 20 ms for the Internet [14], $N$ could be set to at least $2^{16} - 2 \approx 65500$ in grid networks using the Internet. When it comes to the number of collectors in the grid network, it can be set to $S/N \approx S/65500$, where $S$ is the total number of nodes in the grid network.
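The relationship between the group size and the hop count can be checked with a few lines of Python. The sketch assumes that the logarithm is taken to base 2, as implied by the complete binary tree; the function names are hypothetical.

```python
import math


def max_hops(n_meters):
    """Maximum hop count for N meters in a complete binary tree rooted at the concentrator."""
    # The tree has N + 1 nodes (meters plus the concentrator).
    return math.log2(n_meters + 2) - 1


def group_size_for_hops(hops):
    """Number of meters in a complete binary tree whose height equals `hops`."""
    return 2 ** (hops + 1) - 2


n = group_size_for_hops(15)  # group size matching a 15-hop path
assert n == 65534
assert max_hops(n) == 15.0
```

The exact value is 65534, which the text rounds to 65500.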
As seen in Table 3, it takes $(0.2N + 11.6)$ ms for the concentrator to verify the integrity of the received data, while the decryption time is comparatively negligible. Since the current network delay for each hop is 20 ms and the signature generation by each smart meter requires 5 ms, the total time overhead would be at most $(20 + 5)(\log_2(N + 2) - 1) + (0.2N + 11.6)$ ms. Thus, as long as the metering interval is longer than this bound, that is, the metering frequency is less than $1/(25(\log_2(N + 2) - 1) + (0.2N + 11.6))$ readings per ms, the overhead of the proposed security mechanism would be tolerable in a smart grid network.
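The overhead bound can be evaluated for concrete group sizes with the following sketch. It assumes a base-2 logarithm and uses the per-hop delay, signing time, and verification cost quoted in this section; the function names and the sample group sizes are hypothetical.

```python
import math

HOP_DELAY_MS = 20.0  # network delay per hop
SIGN_MS = 5.0        # signature generation time per smart meter


def verify_ms(n_meters):
    """Verification time at the concentrator, linear in the group size."""
    return 0.2 * n_meters + 11.6


def total_overhead_ms(n_meters):
    """Upper bound on the time overhead of one aggregation, in milliseconds."""
    hops = math.log2(n_meters + 2) - 1
    return (HOP_DELAY_MS + SIGN_MS) * hops + verify_ms(n_meters)


def max_metering_frequency_hz(n_meters):
    """Highest metering frequency (readings per second) the bound allows."""
    return 1000.0 / total_overhead_ms(n_meters)


for n in (14, 62, 254, 1022):
    print(n, round(total_overhead_ms(n), 1), "ms")
```

For small groups the hop term dominates, whereas for large groups the linear verification term dominates, which is why reducing the aggregate group size lowers the overall cost.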

Security
In this section, we prove the security of the proposed scheme with regard to the security requirements discussed in Section 2.

Data Confidentiality (User Privacy)
Symmetric homomorphic encryption, which is used for metering data aggregation, is unconditionally secure, that is, perfectly secure [40]. Thus, the knowledge of a ciphertext does not provide any information about either the corresponding plaintext or the key. The path key, which is used as the decryption key for an aggregated ciphertext on the aggregation path, can be obtained only by the concentrator as long as the smart meters and the concentrator do not collude with each other, and every smart meter participates in the path key establishment procedure on the aggregation paths to which it belongs. In addition, under the same assumption, no intermediate smart meter can obtain another meter's secret key $k_i$ during the path key establishment procedure, since the path key aggregation and distribution are done through a secure channel such as SSL/TLS. Without $k_i$, it is computationally infeasible to decrypt $c_i = Enc_{k_i}(m_i)$. Therefore, each $c_i$ does not leak information about $m_i$ without $k_i$. Thus, neither curious smart meters on the path nor passive eavesdroppers can obtain any information about the plaintext.
In addition, the concentrator cannot guess individual metering data from each encrypted metering measurement sent from smart meters on the routing path because it cannot determine each individual symmetric key from the combined path key. Specifically, the security of the path key establishment protocol is guaranteed by the randomly generated initial key $k_{init}$ of the leaf smart meter in the aggregation tree. Since the randomly generated initial key is added to the real secret keys, the possibility of guessing $k_{init}$ and the real secret path keys depends on the length of the secret keys. Thus, it is impossible to determine either $k_{init}$ or the path keys from their composed values when $M$ is large. When an aggregation path consists of $n$ intermediate meters to the concentrator and the attacker can compromise $m$ meters, the attacker can guess neither the path key nor the secret keys of the non-compromised meters as long as $m < n$.
Therefore, user privacy can be preserved in the proposed scheme against passive adversaries.

Data Integrity and Sender Authentication
We now prove that our scheme guarantees aggregated data integrity and smart meter authentication for those that participate in the aggregation.
Theorem 2 implies that the aggregate signature scheme is secure against active attacks such as message forgery or false data injection. Thus, smart meters can attest to their own encrypted metering data. The concentrator can verify the integrity of each ciphertext generated by smart meters on the aggregation path, and be assured of the identities of the meters participating in the aggregation. Now, we prove that if each ciphertext is authentic and not forged during the aggregation (by Theorem 2), the integrity of the final aggregation result is also guaranteed. Without loss of generality, we suppose smart meters $s_1, \ldots, s_n$ participate in the aggregation and report $c_1 (= Enc_{k_1}(m_1) \pmod{M}), \ldots, c_n (= Enc_{k_n}(m_n) \pmod{M})$ to the concentrator. Let the path key obtained by the concentrator be $PK = k_1 + \cdots + k_n$ for the aggregation path, and $C = c_1 + \cdots + c_n$. If $\forall i$, $c_i$ is not forged, $m_i = c_i - k_i \pmod{M}$ is also authentic for a given $k_i$. Then, we can observe that $\sum_{i=1}^{n} m_i = C - PK \pmod{M}$ is also authentic for a given $PK$. If any adversary manipulates the distributed encryption key after the initial path key setup phase and alters metering data (or injects false data) during the aggregation procedure, this may affect the path key and result in a ciphertext different from the authentic one, which can be detected during the verification procedure by the concentrator. Therefore, aggregated data integrity and sender authentication can be achieved in the proposed scheme.
However, if some authenticated meters are compromised subsequently and then launch the same attack using the valid encryption keys and identities (that is, inside adversary model), the proposed scheme cannot detect such an active attack by the compromised meters, because the attack could be done using the valid encryption keys and valid identities.Thus, dealing with the inside attack requires additional security techniques, which is one of the challenging future research issues.

Conclusions
Transmitting power consumption levels of individual customers to an electricity supplier or utility over short intervals has advantages for supplier planning and management purposes in smart grids. However, it threatens user privacy by disclosing fine-grained consumption data and usage behavior to the supplier. In this study, we proposed a novel smart metering scheme that features a mechanism that (1) preserves user privacy using symmetric homomorphic encryption so that only the data collection unit can decrypt the aggregated metering data without knowing individual meter measurements; and (2) enables the collection unit to verify the integrity of aggregated data and authenticate the identities of the smart meters participating in the aggregation by exploiting the aggregate signature. Thus, the proposed scheme provides a secure metering solution for smart grids against passive and active attacks. The proposed scheme is also efficient in terms of computation due to symmetric encryption. Because the aggregate signature is constant in size regardless of the number of smart meters in a neighborhood, the additional communication overhead for the signature is also alleviated.

1. Encryption. Given an integer message $m$ and a randomly generated key stream $k$, for $m, k \in [0, M-1]$ where $M$ is a large integer, the algorithm computes $c = Enc_k(m) = m + k \pmod{M}$.
2. Decryption. Given ciphertext $c$ and key $k$, the algorithm computes $Dec_k(c) = c - k \pmod{M}$.

Figure 3. Total message size for an aggregation (linear scale).

Figure 4. Total message size for an aggregation (log scale).

Table 3. Computation cost for signing and verification.