# Data Privacy Preservation and Security in Smart Metering Systems

## Abstract

## 1. Introduction

- Establishing a reference point for interested researchers on the importance of SMs;
- Quantifying the existing problems regarding operation stability;
- Reviewing the common defense mechanisms employed to protect SMs’ data privacy;
- Providing a comprehensive comparison of the trust models involved in SMs’ data privacy;
- Highlighting the role of SMs’ security in disaster management;
- Recommending future directions in SM privacy for interested scholars.

## 2. Existing Problems of Smart Meters

#### 2.1. Optimal Power Flow

#### 2.2. Privacy Problems

## 3. Defense Mechanisms of Smart Meters

#### 3.1. Differential Privacy

#### 3.2. Machine Learning

#### 3.3. Kullback–Leibler Divergence

#### 3.4. Game Theory

- A game is the strategic interaction between competing or cooperating interests in which the constraints on, and the payoffs for, actions are taken into account.
- A player is the fundamental component of a game. Each player, indexed by $i$, is assumed to act rationally and chooses its action from ${A}_{i}$. A player may be a human, a machine, or a team; together the players form the set $N$.
- The utility (payoff) is the reward or punishment a player receives for a given action in the game. It is given by ${u}_{i}:A\to \mathbb{R}$, which computes the outcome for the $i$th player and is determined by the actions of all participating players, $A={\times}_{i\in N}{A}_{i}$, where the symbol × denotes the Cartesian product.
- A strategy is a player's plan of action throughout the game. A strategic game is written as $\langle N,\left({A}_{i}\right),\left({u}_{i}\right)\rangle$.

- NE is defined as the optimal action profile ${a}^{*}\in A$ from which no player $i\in N$ can gain by unilaterally deviating to a different action [77,80]. In terms of the utility function, ${u}_{i}\left({a}_{i}^{*},{a}_{-i}^{*}\right)\ge {u}_{i}\left({a}_{i},{a}_{-i}^{*}\right)$ for every player $i\in N$ and all ${a}_{i}\in {A}_{i}$, where ${a}_{i}$ is the strategy of player $i$ and ${a}_{-i}$ represents the strategies of all players except $i$ [75].
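
The NE condition can be checked by brute force for a small finite game. The following is a minimal sketch, not a method taken from the cited works: the function name `pure_nash_equilibria`, the two-player example, and its payoff values are all hypothetical and chosen only for illustration. It enumerates every action profile in $A={\times}_{i\in N}{A}_{i}$ and keeps those from which no player can improve ${u}_{i}$ by deviating alone.

```python
from itertools import product


def pure_nash_equilibria(action_sets, utilities):
    """Return all pure-strategy Nash equilibria of a finite strategic game.

    action_sets[i] is the list of actions A_i available to player i.
    utilities[i] is a callable mapping an action profile (tuple) to u_i.
    """
    equilibria = []
    # Enumerate the Cartesian product A = A_1 x ... x A_n.
    for profile in product(*action_sets):
        stable = True
        for i, actions_i in enumerate(action_sets):
            current = utilities[i](profile)
            # Check every unilateral deviation a_i of player i.
            for a_i in actions_i:
                deviation = profile[:i] + (a_i,) + profile[i + 1:]
                if utilities[i](deviation) > current:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical two-player game with prisoner's-dilemma payoffs
# (illustrative values only, not taken from the paper).
actions = ["cooperate", "defect"]
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
utilities = [lambda profile, i=i: payoffs[profile][i] for i in range(2)]

equilibria = pure_nash_equilibria([actions, actions], utilities)
print(equilibria)  # [('defect', 'defect')]
```

Exhaustive enumeration grows exponentially with the number of players, so this sketch is only practical for very small games; it also finds pure-strategy equilibria only and ignores mixed strategies.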

#### 3.5. Generative Adversarial Privacy

#### 3.6. Data Aggregation

#### 3.7. Pseudonyms

#### 3.8. Clustering

#### 3.9. Entropy

#### 3.10. Fuzzy

#### 3.11. Bayesian

## 4. Existing Solutions Comparison

## 5. Recommendations for Future Directions

- Analyzing the impact an attacker could have on the data release framework if they have prior knowledge about their victim that was obtained without using smart meter data;
- Combining the data release algorithm with physical distortion methods, such as renewable energy sources or batteries, to further increase a consumer’s privacy by shaping their demand profile;
- Observing the privacy impact of the DP algorithms when the resolution of the model, or the data collection interval, exceeds a specific time threshold;
- Investigating the effects of privacy preservation against non-intrusive load monitoring techniques;
- Providing more inference privacy techniques;
- Considering the modern technologies and adaptive protocols of the Internet of Things to attain acceptable disaster management and risk mitigation.

## 6. Conclusions

| Abbreviation | Description |
|---|---|
| SMs | Smart Meters |
| PG | Power Grid |
| KL | Kullback–Leibler |
| ANN | Artificial Neural Network |
| RNN | Recurrent Neural Network |
| LSTM | Long Short-Term Memory |
| DNN | Deep Neural Network |
| ML | Machine Learning |
| DL | Deep Learning |
| DP | Differential Privacy |
| ETD | Electrical Theft Detection |
| AdaBoost | Adaptive Boosting |
| AC | Alternating Current |
| PGN | Power Grid Network |
| HPPs | High-Priority Packets |
| HPT | High-Priority Data Trustworthiness |
| GAP | Generative Adversarial Privacy |
| NE | Nash Equilibrium |
| HC | Hierarchical Clustering |
| KM | K-Medoids |
| FPGA | Field Programmable Gate Array |
| VHDL | VHSIC Hardware Description Language |
| DPMDs | Distribution-level Phasor Measurement Devices |
| MAC | Medium Access Control |
| BDST | Dempster–Shafer theory |

| Notation | Description |
|---|---|
| ${L}_{\mathrm{releaser}}$ | Releaser loss function |
| ${L}_{\mathrm{attacker}}$ | Attacker loss function |
| $\alpha $ | The releaser parameter |
| $\beta $ | The attacker parameter |
| $\gamma $ | A parameter that controls the privacy-utility trade-off |
| ${R}^{T}$ | The released data |
| ${U}^{T}$ | The useful data |
| ${D}^{T}$ | The private data |
| ${\mathbf{C}}_{t}$ | Cell state |
| ${\mathbf{h}}_{t}$ | Hidden state |
| ${\mathbf{f}}_{t}$ | Forget gate |
| ${\mathbf{g}}_{t}$ and ${\mathbf{i}}_{t}$ | Input gates |
| ${\mathbf{O}}_{t}$ | Output gate |
| b | The biases |
| K | Input weights |
| V | Recurrent weights |
| T | Total period |
| ${u}_{i}$ | The utility of the $i\text{th}$ player |
| ${A}_{i}$ | The action of the $i\text{th}$ player |
| A | The actions of all players |
| N | The number of players |
| ${a}^{*}$ | The optimal action |
| ${a}_{-i}$ | The strategies of all players except i |

| Security Model | Advantages | Disadvantages | Recommendation |
|---|---|---|---|
| Game theory | The utility of the nodes is calculated when the association between nodes is analyzed as a cooperative or non-cooperative game. The game model addresses the logical issue involving the rational participants. | Complexity of implementation. | Medium |
| Clustering | Supports rearrangement of a cluster’s nodes and network scalability. Easy to put into action. | Very significant control overheads. Certain protocols have a long transmission latency. Sophisticated algorithms. | Low |
| Bayesian | The degree of confidence is taken into account while making decisions. | Scalable network design cannot be taken into account since assessment is solely focused on the node’s QoS. | High |
| Entropy | Inspired by the theory of thermodynamics, it deals with the degree of uncertainty in a signal or random occurrence; employed for ad hoc networks. | Handles attacks individually. | Low |
| Fuzzy | To address a control issue, it inserts a number of if–then rules. | Memory overflow results from adding more if–else statements. | Medium |
| Differential privacy | Can interactively support machine learning models. | Expensive computation; does not deliver sufficient performance on high-complexity problems. | High |
| Machine learning | Easy to pinpoint patterns, supports full automation, supports several applications. | Long training time, high probability of error, big datasets are needed. | High |
| Kullback–Leibler Divergence | It depicts the information loss between the expected and ground truth distributions. | Some samples produced by the model may not fit the data distribution. | Medium |
| Generative Adversarial Privacy | Easy to combine with machine learning, easy to interpret its generated data. | Oscillation of the model’s parameters leads to non-convergence; the generator can collapse. | Low |
| Data Aggregation | Data aggregation aids in condensing information from many varied and dissimilar sources. | If data are not gathered and organized meaningfully, they are difficult to identify and analyze. | Low |
| Pseudonyms | Anonymity. | Twice the identities. | Low |

