Data Privacy Preservation and Security in Smart Metering Systems

Smart meters (SMs) can play a key role in monitoring vital aspects of different applications such as smart grids (SG), alternating current (AC) optimal power flows, adversarial training, time series data, etc. Several practical privacy implementations of SM have been made in the literature, but more studies and testing may be able to further improve efficiency and lower implementation costs. The major objectives of cyberattacks are the loss of data privacy on SM-based SG/power grid (PG) networks and threatening human life. As a result, losing data privacy is very expensive and gradually hurts the national economy. Consequently, employing an efficient trust model against cyberattacks is strictly desired. This paper presents a research pivot for researchers who are interested in security and privacy and sheds light on the importance of the SM. We highlight the involved SMs' features in several applications. Afterward, we focus on the SMs' vulnerabilities. Then, we consider eleven trust models employed for SM security, which are among the common methodologies utilized for attaining and preserving the privacy of the data observed by the SMs. Following that, we propose a comparison of the existing solutions for SMs' data privacy. In addition, valuable recommendations are introduced for the interested scholars, taking into consideration the vital effect of SM protection on disaster management, whether on the level of human lives or the infrastructure level.


Introduction
Implementing smart meters (SMs) for the observation of different metrics has become inevitable. Power grids (PG) are among the most vulnerable applications that need to be accurately monitored. The PG provides many benefits for both consumers and power utility providers. In this regard, SMs can reduce electricity costs, create a more reliable flow of power, and provide efficient viewing of information based on power consumption levels [1][2][3][4][5][6][7][8]. Additionally, SMs promote renewable energy and dynamic pricing for consumers [9][10][11][12][13][14][15]. Despite the logistical advantages, the increased flow of sensitive information for SMs comes with a higher opportunity for the loss of privacy for users and power providers. The undesired loss of data from a breach in SMs can prove to be catastrophic not only to power supply companies but also to individual households [16][17][18]. Accordingly, engineers are constantly looking for methods to increase scalable privacy in SMs without sacrificing the efficiency and cost of implementation [19][20][21][22][23][24].
In [25,26], the authors presented possible solutions to implementing privacy in SMs in a smart grid (SG) environment. Those efforts focused on the use of differential privacy (DP) as a means of introducing privacy into SMs. DP essentially compares information in two different datasets with groups rather than individuals. This spares individuals from sharing their private information over a wide area while still giving enough information to power producers to make SG technology work smoothly. The motivation is that this method has the potential to keep information flow fast and efficient while keeping additional costs at a minimum level. In [26], the proposed model examined a technique called data distortion, in which companies can use algorithms to distort individual SM data while still finding correct cumulative results when combined with other SM data. In [25], the authors argued that privacy in SMs requires significant research and mitigation because the concern over privacy remains a leading factor in the reluctance to implement SMs into the PG. Many believe that methods for ensuring privacy would either be too costly, too weak, or would make information exchange too inefficient.
The SG can assist utilities in reducing expenses, improving dependability and transparency, and streamlining procedures. However, as SM-based electric power systems are used more often, cyber security risks are growing, which raises the need for security methods. In other words, SMs can be the key player in the SG and PG networks, in which the SMs' sensitivity to the power metrics (voltage and current) is reflected by stepping up or stepping down of these metrics. For example, in such networks, SMs represent control and protection devices. Based on their measures, an action is taken to reduce or increase the electric current level to preserve the power network against destructive failures, such as in electricity production, transfer, distribution, and street lighting systems. This process is conducted to mitigate normal power consumption or as a defense model against cyberattacks. Therefore, for SG security, SM protection is crucial. It supports any security architecture and technology created and used in the SG, as well as enhances the system's reliability and resilience to cyberattacks. There are several methods to protect the privacy of SM data in the SG environment, such as DP, machine learning (ML), Kullback-Leibler (KL) divergence, game theory, generative adversarial privacy, data aggregation, pseudonyms, clustering, entropy, fuzzy, and Bayesian models. However, they have their drawbacks, such as limiting the usefulness of the data or proving computationally taxing.
In addition, one method that has been used is the down-sampling of the electric usage data collected by the SM and sent to the third-party service provider [27][28][29][30][31][32]. However, the problem with this method of privacy preservation is that down-sampling the data might delay the detection of critical events, which will negatively affect other processes [33][34][35].
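The trade-off described above can be illustrated with a minimal sketch, assuming that down-sampling is done by averaging fixed-size windows (the cited works may use other schemes). The function name `downsample` and the readings are hypothetical.

```python
from statistics import mean

def downsample(readings, factor):
    """Average each consecutive group of `factor` readings into one value."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [mean(readings[i:i + factor]) for i in range(0, len(readings), factor)]

# Hypothetical 1-minute readings in kW; index 4 holds a short critical spike.
readings = [0.5, 0.5, 0.5, 0.5, 4.5, 0.5, 0.5, 0.5]
coarse = downsample(readings, 4)  # one value per 4-minute window
```

In the coarse series the spike is diluted into a window average and only becomes visible once the whole window has been collected, which is the detection delay noted in the text.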
ML has proved beneficial for different research directions in several research trends, including the security front and preserving human lives [36][37][38][39][40][41][42][43][44]. More particularly, ML has also played a key role in mitigating the SMs' security issues. The model proposed in [45] is flexible: it adapts to univariate and multivariate situations, conducts the selection of features, and captures static or dynamic impacts. In addition, it contains an intrinsic application for measurement and fulfillment, solving numerous basic shortcomings. Deep learning (DL) has also found its way into this area. To address the drawbacks of existing methods, the authors in [46] introduced a novel class-balancing mechanism using the oversampling of the interquartile minority methodology and an integrated electrical theft detection (ETD) model. Long short-term memory (LSTM), UNet, and adaptive boosting (Adaboost) make up the combined ETD model, also known as LSTM-UNet-Adaboost. In this way, LSTM-UNet-Adaboost combines the benefits of ensemble learning (Adaboost) and deep learning (LSTM-UNet) for ETD. Additionally, the State Grid Corporation of China provided real-time SM information, which was used to simulate and assess the efficiency of the proposed method.
Another method relies on physical resources, in this case rechargeable batteries and renewable energy sources, e.g., a solar panel [47][48][49][50][51]. This method was used because it does not need to distort the SM data, which means there will not be any time delays such as in the down-sampling method. However, the problem with this solution is that incorporating these physical resources makes the privacy problem more complex and limited in scope while also generating massive costs, since the resources need frequent replacement due to the wear and tear caused by the increased charging and discharging rate.
It is noted from the literature that there are many practical implementations of privacy in SMs, but further research and experimentation can potentially further increase efficiency and decrease the cost of implementation. By comparing and contrasting strengths and weaknesses within the presented models in the literature, this paper builds on this information to provide more robust ideas for privacy implementation in SMs and highlight any shortcomings. In addition, this survey examines the limitations of the exerted research works and solutions to their limitations. Table 1 lists the abbreviations utilized in the manuscript. It is worth mentioning that effective mitigation of the attack manipulations affecting SMs using modern technologies can alleviate the catastrophic consequences, whether on the level of human lives or on the infrastructure level [37][52][53][54][55][56]. The rest of the paper is organized as follows. Section 2 discusses the common existing problems facing the SMs' operation stability. Section 3 addresses the defense mechanisms of SMs. Then, a comparison of the existing solutions to SM problems is presented in Section 4. Afterward, significant recommendations for future directions to the interested scholars are provided in Section 5. Finally, the paper is concluded in Section 6.

Existing Problems of Smart Meters
In this section, we discuss the problems affecting SMs concerning several applications. Figure 1 shows some of the prominent applications supported by SMs. In this regard, SMs are prone to different attacks that falsify the reports delivered to decision makers. Such attacks can lead to catastrophic situations on both the level of human lives and utilities [57]. Hereafter, we discuss the common vulnerabilities that SMs suffer from.

Optimal Power Flow
The objective of the optimum power flow model is to provide the optimum power output to meet consumer demand at the lowest possible cost of operation for the utility provider. This model can be used to determine locational marginal prices. Locational marginal prices show the generator costs, line costs, and the costs of specific nodes at specific times. Therefore, using the optimum power flow model, utility providers can calculate the exact cost of the noise that is being added to a particular regional area. In [58], the authors presented a mixed integer programming model to optimize the packet size and the inter-node distance in SG. The model presented the optimal distance that can be used between the communicating nodes to prolong the wireless-nodes-based SG lifetime.

Privacy Problems
Privacy issues have long been one of the main problems of SMs [59][60][61][62][63]. These SM devices send power consumption data back and forth between the consumer's home and the electricity company thousands of times per day, enabling real-time applications and analysis of data, e.g., energy theft prevention, monitoring of power quality, timely detection of faults, and demand response. This level of power consumption monitoring raises serious concern among consumers, despite being a considerable benefit to the power companies in terms of the usability and reliability of SMs. It is not surprising that people may be concerned about privacy with these SMs around their homes monitoring their daily activities.
To make sure the data from a consumer's SM is secure, the innovative DP-compliant algorithm was developed. In this privacy-preserving method, the consumer is guaranteed that by choosing between including or excluding their SM data from a data set, their chances of suffering negative impacts will not be disproportionately increased. How is this achieved? Differentially private systems are characterized by the fact that the result of a query on one data set should be nearly indistinguishable from the result of the same query on a neighboring data set. Therefore, the consumer should be indifferent about whether their data are included in the database or not.

Defense Mechanisms of Smart Meters
This section addresses the proposed defense mechanisms of the SMs' security issues along with their limitations. The frequently used notations are summarized in Table 2.

Table 2. Frequently used notations.
Symbol      Description
$u_i$       The utility of the ith player
$A_i$       The action of the ith player
$A$         The actions of all players
$N$         The number of players
$a^*$       The optimal action
$a_{-i}$    The strategies of all players except i

Differential Privacy
Energy retailers need to collect data about users' energy consumption to accurately meet load demand while minimizing costs.The problem is that energy consumption is highly personal and can be used to infer a lot of private information about a consumer.DP is a method of adding noise to individual consumers' load profiles to protect their privacy.
Here is how it works: In a large data set containing many users' load profiles, noise is added to every consumer's load profile based on the largest change that they could have on the result of a given query. The noise level to add to the consumer's load is calculated by first recording the result of a query to a data set that includes the consumer's information. Then, the same query is made to the same data set, except this time, the consumer's load is not included. Finally, the results of these two queries are compared to determine how much of an impact the consumer's data had on the aggregated result of the query. Noise is then added to the consumer's load profile based on its impact. This means that the same query for two separate data sets, one with the specified user's data and one without the user's data, should yield statistically indistinguishable results. Therefore, even if an adversary knew that a specific consumer's data was included in a data set, they would be unable to determine the user's consumption. Alternatively, given a data set and a user's load profile, an adversary would not be able to determine if that user was included in the data set. DP ensures privacy in that data sets can only be used to infer information about a group at large, but they cannot be used to infer information about any one individual. Therefore, consumers should theoretically be indifferent about volunteering their personal information.
That said, DP does not ensure complete privacy; it only guarantees a maximum amount of privacy loss that an individual can suffer. Generalizations of the whole data set can still be used by an adversary to make statistical inferences about the user's consumption. However, the more users that are added to a data set, the less of an impact that any one individual can have on the data set. Therefore, statistical inferences about a user are harder to make on larger data sets. In addition, large data sets mean that less noise is required to anonymize an individual consumer's load profile.
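As a minimal sketch of the idea, the standard Laplace mechanism can be applied to an aggregate query. This is an illustrative simplification: the text describes noise added per load profile, whereas here noise is added once to the total, and the names `laplace_noise`, `private_total`, and `max_load_kwh`, as well as the load values, are assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_total(loads_kwh, epsilon, max_load_kwh, rng):
    """Release the total load with epsilon-DP using the Laplace mechanism.

    Each load is clipped to [0, max_load_kwh], so adding or removing one
    consumer changes the total by at most max_load_kwh (the sensitivity).
    """
    clipped = [min(max(x, 0.0), max_load_kwh) for x in loads_kwh]
    scale = max_load_kwh / epsilon  # smaller epsilon -> larger noise
    return sum(clipped) + laplace_noise(scale, rng)

rng = random.Random(42)  # fixed seed for repeatability only
loads = [1.2, 0.8, 3.1, 2.4, 0.5]  # hypothetical hourly loads in kWh
noisy_total = private_total(loads, epsilon=0.5, max_load_kwh=5.0, rng=rng)
```

Because the noise scale depends only on the sensitivity and epsilon, not on the number of consumers, the relative error of the released total shrinks as more consumers are included, in line with the observation about larger data sets.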
The DP algorithm should be assessed on its impact across four operating levels: no privacy, low privacy, medium privacy, and high privacy. A consumer who opts in to the algorithm selects one of these levels and is charged accordingly. The lower the epsilon value, the higher the privacy protection; therefore, each successive level uses a lower epsilon, and the lower the epsilon, the more complex the calculation, requiring more computational power and creating a higher price baseline to equalize profit margins. The Laplacian noise added to the load profile makes the profile fluctuate significantly. When isolated, the fluctuations are very obvious and telling; however, when the load profiles of multiple buses or consumers are combined, a smoothing effect appears in the data without any loss of privacy. In some cases, smoothing functions may be applied because a load profile protected with DP is immune to post-processing [64]. The smoothing functions are applied to a time series to remove the fine-grained variations between time steps. If a smoothing effect is applied, the noise sum is periodically stored in its respective server, held either by a third party or directly by the retailer. As the raw SM data are sent out at whatever reporting rate the retailer sees fit, the noise signal is added to the raw data, which then becomes the secure SM reading. That secure reading and the database of noise for the individual consumers are then sent to the retailer, which calculates the additional cost using the recorded noise. From there, the allocation of extra costs to the various consumers is processed and carried out by the retailer.
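The smoothing step can be sketched with a simple moving average, assuming this is an acceptable stand-in for the smoothing functions mentioned (the source does not name a specific one). The function `moving_average` and the sample series are hypothetical.

```python
def moving_average(series, window):
    """Smooth a time series with a sliding mean over `window` consecutive steps."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    smoothed = [running / window]
    for i in range(window, len(series)):
        running += series[i] - series[i - window]  # slide the window by one step
        smoothed.append(running / window)
    return smoothed

# Hypothetical noisy (already DP-protected) profile alternating between two levels.
noisy_profile = [1, 5, 1, 5, 1, 5]
smooth_profile = moving_average(noisy_profile, 2)
```

Since the function only reads the already-noised values, it is pure post-processing and does not touch the raw readings.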

Machine Learning
In [65], the proposed model concentrates on the short-term forecasting of residential buildings' air conditioning usage utilizing information from a traditional SM. Through energy disaggregation methods, the air conditioning load is isolated from the SMs' aggregate consumption at each time step. The calculated air conditioning demand and the associated historical weather information are then used as input characteristics for the forecast process. Different ML methods, such as artificial neural networks (ANNs), Support Vector Machines, and Random Forests, are employed in the prediction stage to make hourly and daily forecasts. In [25], the researchers addressed the SM privacy problem by modeling both the releaser (the entity releasing the data) and the attacker as Recurrent Neural Networks (RNNs). This choice is appropriate for the time series of the SMs data, as depicted by Figure 2. An RNN is a type of neural network that is capable of handling sequential data by modeling temporal correlations. As such, the output of an RNN at one time step affects its output at later time steps. More concretely, the authors exploited deep neural networks (DNNs) to adopt a DL model in which inputs are processed and passed into another layer of nodes. These nodes communicate their results with each other to form new layers until a final output is reached. In other words, the privacy-preserving model utilizes an RNN with Long Short-Term Memory. Training is generally performed by gradient descent utilizing the backpropagation algorithm. One issue with RNNs is that learning long-term dependencies in time series data can lead to many problems; hence, the LSTM cell was added. An LSTM cell creates four gating units to control and limit the flow of information, which all subsequently have sigmoid activation [66,67]. Furthermore, the attacker is used to train the releaser mechanism while deploying two loss functions, one for each RNN, as defined in [25].

In the releaser's loss function, γ ≥ 0 controls the privacy-utility trade-off, α denotes the parameters of the "releaser," and β denotes the parameters of the "attacker." There are some special cases this formulation can run into, such as the trade-off parameter being 0. However, the system is still expected to hold, with the attacker failing to do better than random estimation performance. For the adversary, the loss function is defined through an expectation taken with respect to $D_t$ and $R_t$. It is important to note that this loss function is approximated by evaluating the expectation empirically, and the loss functions of both the releaser and the attacker are approximated in this way. Since RNNs are usually trained by gradient descent relying on the backpropagation algorithm [68], training can suffer from vanishing/exploding gradient problems that cause it to be unsuccessful [69]. The resolution of this problem is the use of an LSTM cell that includes four gating units to supervise the information flow. The parameter b is for biases, K is for input weights, and V is for recurrent weights, as depicted in Figure 3. This lets the user train models using sequences with several hundreds of time steps, which a plain RNN struggles to do [70]. In that model, an adversarial modeling framework is then created that consists of the two previously mentioned RNNs, in which a random variable is appended to the observed signal $W_t$ to randomize the released data. For both networks, an LSTM architecture is selected; training then proceeds with the proposed algorithm, and the attacker's attempts to obtain the data are tested for accuracy, from 80% (almost no privacy) to 50% (privacy).
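To make the roles of b, K, and V concrete, the following minimal sketch implements one step of a scalar LSTM cell. It assumes the commonly used formulation with three sigmoid gates (forget, input, output) and a tanh candidate, which may differ in detail from the cell drawn in Figure 3; the dictionary keys and the function name `lstm_step` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, K, V, b):
    """One step of a scalar LSTM cell.

    K, V, and b map each unit name ('f', 'i', 'o', 'g') to its input weight,
    recurrent weight, and bias, respectively.
    """
    f = sigmoid(K["f"] * x + V["f"] * h_prev + b["f"])    # forget gate
    i = sigmoid(K["i"] * x + V["i"] * h_prev + b["i"])    # input gate
    o = sigmoid(K["o"] * x + V["o"] * h_prev + b["o"])    # output gate
    g = math.tanh(K["g"] * x + V["g"] * h_prev + b["g"])  # candidate value
    c = f * c_prev + i * g   # additive cell update eases gradient flow
    h = o * math.tanh(c)     # hidden state passed to the next time step
    return h, c

zeros = {"f": 0.0, "i": 0.0, "o": 0.0, "g": 0.0}
h1, c1 = lstm_step(x=0.3, h_prev=0.0, c_prev=2.0, K=zeros, V=zeros, b=zeros)
```

With a large forget-gate bias, the previous cell state is carried forward almost unchanged, which is the mechanism that lets the cell retain information across many time steps.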
These data will then be used for the addition of distortion to the released data based on the demand load and sensitive information because of the learned noise distribution to fit specific data based on the framework. This is a much better way than what is usually performed, which is to use a fixed noise distribution for all the SMs data. In the context of SMs, an adversary with access to distorted SM data as an input can utilize DNNs to accurately guess a homeowner's non-distorted SM readings.
In [71], measurements were categorized as secure or under attack using ML techniques. To overcome limitations brought on by the problem's sparse nature and to make use of any prior information about the system that may be accessible, an attack detection framework was offered in the suggested method. To represent the attack detection issue, well-known batch and online learning methods (supervised and semi-supervised) were combined with decision- and feature-level fusion. For meter data encryption, the authors in [72] suggested a localization-based key management system. Data were encrypted using a random key index and the key assigned to the meter's location. A dependable third party was in charge of maintaining and distributing the encryption keys. A technique based on received signal strength and a maximum likelihood estimator was suggested for the localization of the meter. At the control center, the packets were decrypted using a key that was mapped to the key index and meter coordinates. A demand-side management engine that was in charge of maintaining the effective use of energy based on priorities has been presented in [73]. The control of intrusions in the SG was suggested using a unique resilient paradigm. Using the ML classifier, the resilient agent predicts dishonest entities. ML has also proved beneficial in the detection of SG data falsification due to attack injection, as presented in [74].

Kullback-Leibler Divergence
Kullback-Leibler (KL) divergence is used to quantify the information loss between an expected distribution and the ground-truth distribution. On the other hand, under the KL divergence, some samples produced by the model may not fit the data distribution. The attacker's aim is to reduce the KL divergence between the related predictors. In [25], the goal is to use DNNs to minimize the accuracy of the attacker's potential inferences while also minimizing the distortion rate between the released data ($R^T$) and the useful data ($U^T$). The attacker's goal is to infer the private data ($D^T$) as accurately as possible. In that work, Shateri et al. developed a sample output for each SM reading parameter before simulating an attacker's guessed DNN output, $P(D^T \mid R^T)$. After this, the goal is to determine the level of distortion needed and to utilize the KL distance to measure the attacker's accuracy. The KL distance is a mathematical formula employed to calculate the gap between two distributions; for two discrete distributions $P$ and $Q$, it is $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$. It is noted that the closer the output of this equation is to 0, the less privacy the DNN offers.
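For discrete distributions, the KL divergence can be computed directly. The sketch below uses the standard definition in nats; the function name `kl_divergence` and the two example distributions are illustrative assumptions and do not come from [25].

```python
import math

def kl_divergence(p, q):
    """Return D_KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)) in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0.0:
            continue          # terms with P(x) = 0 contribute nothing
        if q_x == 0.0:
            return math.inf   # P puts mass where Q has none
        total += p_x * math.log(p_x / q_x)
    return total

# Hypothetical example: a uniform prior versus an attacker's confident posterior.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]
gap = kl_divergence(prior, posterior)
```

The divergence is zero only when the two distributions coincide, and it is not symmetric, so the order of the arguments matters when it is used as a privacy measure.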

Game Theory
Game theory can be used to incentivize cooperation between parties to determine a fair cost allocation for the noise that is added to a system. Game theory is thus a sophisticated subset of intelligent optimization. The game theory model depicts a contest between groups of players who might choose to work cooperatively or antagonistically to advance their outcomes/payoffs via the strategy or strategies carried out by successive player actions. The essential game parameter definitions, as in [75,76], can be summed up as follows:
• A game is the strategic interaction between competing or cooperative interests when the limitations and compensation for actions are taken into account.
• A player is a fundamental component of a game. Each player, indexed by $i$, is assumed to act rationally, and the player's action is denoted by $A_i$. In a game, a player may take the place of a human, a machine, or a team of players $N$.
• The utility/payoff is the reward or punishment a player receives for a given action throughout the game. It is given by $u_i : A \to \mathbb{R}$, which calculates the outcome for the ith player and is determined by the participating players' actions $A = \times_{i \in N} A_i$, where the symbol $\times$ represents a Cartesian product.
• A strategy is an action plan that a player can adopt throughout a strategic game $\langle N, (A_i), (u_i) \rangle$.
Game theory may be utilized in the security area to identify rogue nodes, mitigate the impact of foreign intrusions, and uncover nodes which act egotistically and overload the entire network. Nash equilibrium (NE), in general, is an intelligent approach to social issues that has emerged as a viable idea for wireless networks, and more particularly for wireless node security [75][76][77][78][79][80].
• NE is defined as the optimal action profile $a^* \in A$ from which no player $i \in N$ can gain by unilaterally changing its course and opting for a different course of action [77,80]. This is reflected by the utility function as $u_i(a_i^*, a_{-i}^*) \ge u_i(a_i, a_{-i}^*)$ for all $a_i \in A_i$, where $a_i$ is the strategy of player $i$ and $a_{-i}$ represents the strategies of all players except $i$ [75].
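The NE condition can be checked by brute force for small games with finitely many actions. The sketch below enumerates every action profile and keeps those from which no player gains by deviating alone; the two-node packet-forwarding game and its payoffs are hypothetical and only serve to exercise the definition.

```python
from itertools import product

def pure_nash_equilibria(payoffs, actions):
    """Return all pure-strategy Nash equilibria of a finite game.

    payoffs maps each action profile (a tuple) to a tuple of utilities,
    one per player; actions lists the available actions of each player.
    """
    equilibria = []
    for profile in product(*actions):
        stable = True
        for i, own_actions in enumerate(actions):
            for alternative in own_actions:
                deviated = profile[:i] + (alternative,) + profile[i + 1:]
                if payoffs[deviated][i] > payoffs[profile][i]:
                    stable = False  # player i gains by deviating unilaterally
        if stable:
            equilibria.append(profile)
    return equilibria

# Hypothetical game: each node either forwards or drops the other's packets.
actions = [["forward", "drop"], ["forward", "drop"]]
payoffs = {
    ("forward", "forward"): (3, 3),
    ("forward", "drop"): (0, 5),
    ("drop", "forward"): (5, 0),
    ("drop", "drop"): (1, 1),
}
equilibria = pure_nash_equilibria(payoffs, actions)
```

In this example the only equilibrium is mutual dropping even though mutual forwarding pays more to both nodes, which illustrates why the selfish behavior mentioned above needs an external incentive or trust mechanism.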
In [26], the authors assumed that consumers all have different privacy preferences and different values for how much they are willing to pay for their privacy. The researchers tested three different game theory functions to fairly allocate costs. The cost that must be distributed amongst a group is determined by the optimum power flow model, which can determine the overall energy consumption for a given region. Therefore, the groups that will split the costs are formed around their locational proximity to one another. A consumer may be responsible for more of the shared cost of noise depending on their chosen class of privacy. The first cost allocation function tested is the Shapley value. The marginal cost is determined by calculating the additional cost incurred to a system by a consumer at the time that they join a given group. This cost is different depending on the consumer's entrance order to the group, so the cost the consumer pays is the average of all costs from all entrance orders. The second cost allocation function is the Vickrey-Clarke-Groves mechanism. This model incentivizes consumers to truthfully report their SM consumption after it has already been processed by the DP masking algorithm. Consumers are allocated costs based on their marginal contribution of noise to the system. The third cost allocation function is the nucleolus mechanism. This model finds the degree to which a group is dissatisfied with a given price allocation, and then attempts to minimize the dissatisfaction. The algorithm first calculates the most inequitable price allocation and then attempts to minimize the overall consumer dissatisfaction with it. Then, the next most inequitable price allocation is subsequently minimized. This process continues until a suitable vector is created, which can minimize consumer dissatisfaction with overall price allocations.
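The Shapley allocation described above, the average marginal cost over all entrance orders, can be sketched as follows. The cost function is a made-up stand-in (the square of the summed privacy weights) rather than the optimum-power-flow cost used in [26], and the names `shapley_values`, `noise_cost`, and `weights` are hypothetical.

```python
from itertools import permutations

def shapley_values(players, cost):
    """Average each player's marginal cost over all entrance orders.

    cost takes a frozenset of players and returns the cost of that group.
    """
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        group = frozenset()
        for p in order:
            joined = group | {p}
            totals[p] += cost(joined) - cost(group)  # marginal cost of joining
            group = joined
    return {p: totals[p] / len(orders) for p in players}

# Hypothetical privacy classes with weights; a higher weight means more noise.
weights = {"low": 1, "medium": 2, "high": 3}

def noise_cost(group):
    """Assumed cost model: noise cost grows with the square of the total weight."""
    return sum(weights[p] for p in group) ** 2

allocation = shapley_values(list(weights), noise_cost)
```

The allocations always add up to the cost of the full group, and in this toy model consumers who choose a stronger privacy class carry a proportionally larger share.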
In [81], the authors studied how to use game theory models for protecting wireless nodes from selfish or malicious nodes. That study surveyed the different game-theoretic defense strategies for wireless nodes and presented a classification of the game theory approaches according to the nature of the attack. Then, a trust model using game theory for decision making was presented, in which the significant role of evolutionary games for wireless nodes' security confronting intelligent attacks was identified. Finally, several prospective game theories were proposed to promote the data trustworthiness and cooperation of different wireless nodes. To prevent interrupting the reported data in clustered sensor networks, a Stackelberg game was created to counteract external attack manipulations using the energy defensive budget versus the corresponding attack budget, as proposed in [82]. In [83], the proposed work can effectively resolve the hardware problem that occurs in the presence of the attack impact in sensor-network-based cognitive radio. In addition, the consumed energy is well managed using the proposed model. In this regard, [84] presented a Stackelberg game in order to implement a security model for sensor-network-based cognitive radio to confront the data falsification attack. This approach was developed for two different attack-defense scenarios, which rely on the threshold level used to determine the interference power. In [85], an effective Stackelberg game was proposed in order to achieve data trustworthiness in PGN. The attack scenario considered regularly manipulates sets of the deployed nodes in the PGN, which cannot be managed using the previously provided technique. The proposed model was offered to combat this attack scenario, which is more serious than that taken into account in prior work. In [86], a game-theoretic protection approach was proposed for clustered wireless sensor networks based on a repeated game. The suggested approach was created to find rogue nodes that drop high-priority packets (HPPs) to increase the reliability of high-priority data (HPT). The findings of this study show that the suggested protection model's HPT is enhanced over a non-cooperative defensive mechanism, achieving the Pareto optimum HPT. In [87], a game-theoretic approach based on a non-zero-sum game is developed to build a robust trust model against intelligent attacks facing IoT applications. The obtained results show enhanced performance in detecting the malicious nodes with low model complexity.

Generative Adversarial Privacy
Shateri et al. suggest that SM data releasers can implement DNNs to combat this in a "minimax" game. This is a reference to a game theory concept in statistics in which one bases one's own moves or decisions on an opponent's possible decisions with the goal of minimizing one's own worst possible loss. Using DNNs to implement this minimax game is called generative adversarial privacy (GAP). In the case of SM privacy, a data releaser would use GAP to strategically add noise wherever necessary so that an adversary using DNNs obtains the least accurate inference of the non-distorted SM data. The data releaser must decide where to add noise in an SM data set and the extent of noise to add while also minimizing the loss of privacy and utility (privacy/utility tradeoff). This can be difficult without the use of DNNs because adding too much noise to a data set sacrifices utility, yet adding too little compromises its privacy against an attacker. With the addition of DNNs and GAP, however, one can use as little noise addition as possible and still ensure adequate privacy [25]. This study assumed that an attacker can eavesdrop on signals sent from a homeowner to a utility company to obtain released data and that the utility company knows what SM data need to be kept private to a consumer. The releaser refers to the actor that is authorized to read SM data, and an attacker refers to an eavesdropping adversary who attempts to infer the useful data based on the released signal.

Data Aggregation
Data aggregation, which takes multiple data from multiple SMs and organizes it into one big comprehensive medium, is another solution to the privacy problem. The limitation of this solution is that as more data are aggregated, the data become less accurate, thus degrading the positioning of the power measurements. Through the implementation of data aggregation, an additional privacy issue comes to the surface: each participating SM sees intermediate plain-text aggregation results routed through itself. This is because the intermediate SMs are authorized to decrypt the collected data and apply the mathematical operations for aggregation tasks that cannot be executed over encrypted data. Data aggregation can resolve this issue by employing homomorphic encryption to enable secure in-network encryption and privacy protection. The electricity usage data from child SMs are encrypted with a semantically secure encryption methodology [88]. At the same time, to enable aggregation functions, the algebraic operations of the plaintext are permitted to be carried out on the cipher domain. The main limitation of this data aggregation method is its computation overhead, which remains reasonable [27]. In [89], the authors used a linear integer program to represent the data aggregation scheduling problem. The problem was split into two different variations. The first model presupposed that the routing tree was established, for instance, through a network layer protocol, and that node broadcasts should be timed to minimize latency. The second variation required combined optimization of the node scheduling and routing topology creation.
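Aggregation in the cipher domain can be sketched with a toy Paillier cryptosystem, an additively homomorphic and semantically secure scheme. Paillier is assumed here for illustration only, since the text does not name the scheme used in [88]; the primes are far too small for real use, and all function names and readings are hypothetical.

```python
import math
import random

def keygen(p, q):
    """Build a toy Paillier key pair from two primes (insecurely small)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)

def encrypt(n, m, rng):
    """Encrypt an integer reading m (0 <= m < n) under public key n."""
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n * n) * pow(r, n, n * n)) % (n * n)

def aggregate(n, ciphertexts):
    """Multiply ciphertexts; the product decrypts to the sum of the plaintexts."""
    total = 1
    for c in ciphertexts:
        total = (total * c) % (n * n)
    return total

def decrypt(n, private_key, c):
    lam, mu = private_key
    ell = (pow(c, lam, n * n) - 1) // n
    return (ell * mu) % n

n, private_key = keygen(1009, 1013)
rng = random.Random(7)  # fixed seed for repeatability; not a secure random source
readings_wh = [120, 340, 275]  # hypothetical child-meter readings in Wh
ciphertexts = [encrypt(n, m, rng) for m in readings_wh]
total_wh = decrypt(n, private_key, aggregate(n, ciphertexts))
```

An intermediate meter only needs the public value `n` to run `aggregate`, so it can combine its children's readings without ever seeing them in plain text; only the holder of the private key recovers the total.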

Pseudonyms
Additional existing solutions include the use of pseudonyms rather than the real names of homeowners [90], standard encryption methods [91], the use of deep neural networks (without the use of GAP as proposed by Shateri et al.) [92], and random noise addition [93]. Pseudonyms in smart meter data do not require substantial computational overhead, but many consider them a weak form of privacy because attackers may be able to link identities to their pseudonyms [90]. Standard encryption methods have issues similar to those of data aggregation: the computational overhead of implementing strong encryption sacrifices utility [91]. The use of DNNs requires low overhead and provides adequate privacy to homeowners, but Shateri et al. attempt to build on this because privacy may be compromised if an attacker also uses DNNs [92]. With random noise addition, a data releaser may add noise to SM readings in order to make these readings unintelligible to an attacker. A utility company can receive these readings and remove the randomly added noise to view a homeowner's raw SM data. This, however, introduces the dilemma of adding either too much noise and sacrificing utility, or too little and sacrificing privacy [93].
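The noise addition and removal step can be sketched as follows. The text does not specify how the utility company removes the noise, so this sketch assumes that the releaser and the utility share a secret seed from which both can regenerate the same noise sequence. All names and readings are hypothetical, and Python's `random` module is not cryptographically secure.

```python
import random


def noise_stream(seed, count, max_noise):
    """Regenerate the same integer noise sequence from a shared seed."""
    rng = random.Random(seed)
    return [rng.randint(-max_noise, max_noise) for _ in range(count)]


def mask(readings, seed, max_noise=50):
    """Releaser side: add seeded noise to every reading."""
    noise = noise_stream(seed, len(readings), max_noise)
    return [r + n for r, n in zip(readings, noise)]


def unmask(masked, seed, max_noise=50):
    """Utility side: subtract the identical noise to recover the raw data."""
    noise = noise_stream(seed, len(masked), max_noise)
    return [m - n for m, n in zip(masked, noise)]


raw = [312, 298, 451, 120]        # hypothetical readings in watt-hours
released = mask(raw, seed=2024)   # what an eavesdropper would observe
recovered = unmask(released, seed=2024)
```

The `max_noise` parameter exposes the dilemma described above: a larger value hides more of the consumption pattern, while a smaller value leaves the released signal closer to the raw readings.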

Clustering
This technique adopts a hierarchical paradigm to confront attack manipulations. In [94], SM data were disaggregated and used to understand patterns of energy usage with clustering techniques such as Fuzzy C-Means. In [95,96], the authors used a cluster ensemble of various clustering techniques to create an automated system for phishing detection. The hierarchical clustering (HC) algorithm, which employed cosine similarity (using the TF-IDF metric to assess the similarity between two points), and the K-Medoids (KM) clustering method were used as feature selection algorithms for extracting different phishing email attributes. To mitigate intrusions in wireless nodes, a hybrid method was proposed in which a clustering approach is used specifically to reduce the amount of information to analyze and the energy needed [97].
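The similarity measure mentioned above can be sketched in a few lines. This is a minimal illustration and not the pipeline of [95,96]: it assumes whitespace tokenization and a smoothed TF-IDF variant (term frequency multiplied by ln((1 + N) / (1 + df)) + 1), and the example emails are invented.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF term."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


emails = [
    "verify your account password",
    "verify your account password now",
    "monthly meter reading schedule",
]
vectors = tfidf_vectors(emails)
similar = cosine_similarity(vectors[0], vectors[1])
unrelated = cosine_similarity(vectors[0], vectors[2])
```

A hierarchical clustering step would then merge the pairs with the highest similarity first, so the two near-identical emails would be grouped before the unrelated one.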

Entropy
The information entropy theory is an objective weighting approach that determines the weight value by examining the information order of each evaluation index. Entropy theory has been widely used in many fields, which has steadily raised the question of applying entropy to the power system. The research in [98] investigates information entropy data mining-based smart terminal security technology for the PG perception layer. That article examined similar techniques and created a smart PG terminal. A safety strategy was created, and a platform for data analysis was constructed on this foundation. By contrasting several different energy demand time series, the authors in [99,100] showed how to use a multidimensional anomaly detection technique for the early identification of manipulated electricity meters. The technique can improve and supplement current monitoring systems, which typically assess only a single time series. Since DET was the goal, there are obvious outliers as a result. To emphasize the requirements and fine-tuning procedures for the aggregation and comparison of various data sources, the model offered three data preprocessing techniques to generate outliers in the event of energy theft.
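The objective weighting idea can be illustrated with the standard entropy weight method. This is a sketch under stated assumptions rather than the procedure of [98]: the rows are evaluated items, the columns are evaluation indices, all values are non-negative with positive column sums, and the example matrix is invented.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: indices that vary more receive larger weights.

    Assumes at least two rows and at least one non-constant column.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    k = 1.0 / math.log(n_rows)          # scales entropy into [0, 1]
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total           # share of this row in index j
            if p > 0:
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Hypothetical scores: index 0 is identical for every item,
# index 1 differs strongly between items.
scores = [
    [1.0, 10.0],
    [1.0, 20.0],
    [1.0, 70.0],
]
weights = entropy_weights(scores)
```

The constant index carries no information for ranking the items, so its weight is approximately zero, and almost all of the weight goes to the index that discriminates between them.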

Fuzzy
To demonstrate the benefits of the fuzzy logic trust model for identifying untrusted nodes in SG networks, it was compared against an existing model in [101]. The suggested approach can increase the routing effectiveness and the detection rate for all classes of activities regarded as malicious [101]. A fuzzy logic controller is also used to manage the power converter DC bus voltage. The information gathered in [102] was cross-disciplinary, and none of the methodologies used had previously been presented in their entirety. To increase efficiency and take advantage of parallelism and high speed, all control functions were incorporated into an FPGA device using VHDL. The capability of the control functions was first demonstrated using an FPGA-in-the-loop simulation approach, and then experimental assessments were carried out to demonstrate equipment dependability and operation.
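A fuzzy trust evaluation can be sketched as follows. This is not the model of [101]; it assumes a single hypothetical input (the fraction of packets a node forwards correctly), three triangular membership functions, and a weighted-average defuzzification with invented representative scores.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_trust(forwarding_ratio):
    """Map a forwarding ratio in [0, 1] to a trust score in [0.1, 0.9]."""
    x = min(1.0, max(0.0, forwarding_ratio))   # clamp to the valid range
    # Fuzzification: degree to which the node looks untrusted/uncertain/trusted.
    untrusted = triangular(x, -0.5, 0.0, 0.5)
    uncertain = triangular(x, 0.0, 0.5, 1.0)
    trusted = triangular(x, 0.5, 1.0, 1.5)
    # Defuzzification: weighted average of hypothetical representative scores.
    total = untrusted + uncertain + trusted
    return (untrusted * 0.1 + uncertain * 0.5 + trusted * 0.9) / total


score = fuzzy_trust(0.75)
```

A node that forwards 75% of its packets is half "uncertain" and half "trusted" under these membership functions, so its score lies midway between the two representative values.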

Bayesian
Utilizing two different types of readings, from SMs and from distribution-level phasor measurement devices (DPMDs), the authors in [103] created a method for harmonic state estimation. Regression analysis was used to calculate the power flow, recurrent neural networks were used to anticipate demand, and sparse Bayesian learning was used to estimate the state. The suggested method was more compatible with current distribution grids since it needed fewer DPMDs than nodes. The efficiency of the suggested estimator was demonstrated by in-depth numerical simulations on an IEEE test feeder. In [104], the Bayesian theory and the Dempster-Shafer theory (BDST) were merged to address the metrics of the physical layer (node transmission rate) and the MAC layer (node buffering capacity) and to determine trust at the node level for packet delivery. The fuzzy theory was also integrated with BDST to manage the metrics of the MAC layer (link capacity) and the network layer (distance and link quality) for computing confidence at the link level for protected routing.
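Node-level trust for packet delivery can be illustrated with a common Bayesian formulation, the mean of a Beta posterior over the delivery probability. This is a generic sketch and not the BDST scheme of [104]; the function name, the uniform Beta(1, 1) prior, and the counts are assumptions.

```python
def beta_trust(delivered, dropped, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of the delivery probability under a Beta prior.

    delivered / dropped are the observed counts of packets that a node
    forwarded correctly or failed to forward.
    """
    alpha = prior_alpha + delivered
    beta = prior_beta + dropped
    return alpha / (alpha + beta)


no_evidence = beta_trust(0, 0)      # falls back to the prior mean
observed = beta_trust(8, 2)         # 8 packets delivered, 2 dropped
```

With no observations the trust value equals the prior mean, and each new observation moves it toward the empirical delivery rate, so the degree of confidence grows with the amount of evidence.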

Existing Solutions Comparison
The main benefits of DP are its cost and benefit analysis, sequential composition, accurate billing, and easy implementation, which show its strength and viability in real-world applications. First, because the consumer can choose a privacy plan from the four aforementioned levels (None, Low, Medium, High), a consumer can decide how much they are willing to pay for privacy and how important privacy is to them. Secondly, the use of sequential composition puts DP ahead of neural networks in terms of speed, because sequential composition can consult the data more than once and run parallel composition on broken-down pieces for higher security without losing processing speed. Thirdly, because the data are sent out periodically when the DP perturbation is reversed, the retailer and the consumer can expect very accurate billing rather than an estimated monthly or annual lump sum. In addition, consumers can see accurately what is running up their bill, so they can adjust their lifestyles to reduce billing expenses. Lastly, it is very easy to implement this system in modern-day homes, offices, etc. To integrate such an algorithm, all that is needed is communication and trust between the retailer and the consumer, which keeps the business side simple, and the algorithm is cheaper to implement and compute than neural networks.
While DP has its pros, it also carries its share of cons, the first being that it offers no protection against inference attacks. Based on the data seen, even with noise, an attacker could infer which appliances are running in a household. In addition, because the algorithm allows a lower epsilon, higher security levels can introduce errors and uncertainties that cause further computation errors, which can throw the billing prices off slightly or even create an opening for an attacker. Lastly, as everything now runs through the DP algorithm or, in some cases, a third-party station, a greater energy loss is unavoidable.
The main benefits of ANNs are their high precision with respect to the amount of data returned and their strong protection against inference attacks: the data that can be seen by potential attackers are less than 50 percent, which guarantees privacy. The issue with this approach is that training the recurrent neural network can be very slow and complex. This problem becomes even worse because long short-term memory requires more memory to train, which makes an already slow training process even slower. It also makes training the neural network more expensive, since longer training costs more money, especially with all of the different technical parameters involved in this solution. Table 3 lists a comparison of the eleven considered defense strategies examined in the literature.
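The relationship between epsilon and the size of the injected error can be sketched with the Laplace mechanism, a standard way to achieve DP. The text does not state which mechanism the compared DP scheme uses, so this is an assumption, as are the function names, the sensitivity value, and the two epsilon settings. The noise is sampled as the difference of two exponential variates, which follows a Laplace distribution.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(readings, epsilon, sensitivity, rng):
    """Add Laplace noise with scale sensitivity / epsilon to each reading."""
    scale = sensitivity / epsilon
    return [r + laplace_noise(scale, rng) for r in readings]


def mean_abs_error(original, released):
    return sum(abs(a - b) for a, b in zip(original, released)) / len(original)


rng = random.Random(0)
readings = [0.5] * 2000              # hypothetical readings in kWh
high_privacy = privatize(readings, epsilon=0.1, sensitivity=1.0, rng=rng)
low_privacy = privatize(readings, epsilon=1.0, sensitivity=1.0, rng=rng)
error_high = mean_abs_error(readings, high_privacy)
error_low = mean_abs_error(readings, low_privacy)
```

The smaller epsilon yields a noise scale ten times larger, which is the source of the billing errors and uncertainties noted above for the higher security levels.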

Game theory
Advantages: The utility of the nodes is calculated when the association between nodes is analyzed as a cooperative and non-cooperative game. The game model addresses the logical issue involving the rational participants.
Limitations: Complexity of implementation.
Complexity: Medium

Clustering
Advantages: The cluster's node rearrangement and network scalability. Easy to put into action.
Limitations: Very significant overheads for control. Certain protocols have a long transmission latency. Sophisticated algorithms.
Complexity: Low

Bayesian
Advantages: The degree of confidence is taken into account while making decisions.
Limitations: Scalable network design cannot be taken into account since assessment is solely focused on the node's QoS. Some samples produced by the model may not fit the data distribution.

Generative Adversarial Privacy
Advantages: Easy to combine with machine learning, easy to interpret its generated data.
Limitations: Oscillation of model's parameters leads to non-convergence, the generator can collapse.
Complexity: Low

Data Aggregation
Advantages: Data aggregation aids in condensing information from various, dissimilar, and many sources.
Limitations: If data are not gathered and organized meaningfully, they are difficult to identify and analyze.

Pseudonyms
Advantages: Anonymity.
Limitations: Twice the identities.
Complexity: Low

Recommendations for Future Directions
Real-time data release for SMs with privacy protection in different SG environments has become an inevitable concern that still needs a more reliable and adaptive solution. Accordingly, more extensive efforts are still desired to open the door for new intelligent solutions. We therefore propose the following recommendations for interested scholars:
• Analyzing the impact an attacker could have on the data release framework if they have prior knowledge about their victim that was obtained without using smart meter data;
• Combining privacy-preserving algorithms with physical distortion methods, such as renewable energy sources or batteries, which could further increase a consumer's privacy by shaping their demand profile;
• Observing the privacy impact of the DP algorithms when the resolution of the model, or the data collection interval, is over a specific time threshold;
• Investigating the effects of privacy preservation against non-intrusive load monitoring techniques;
• Providing more inference privacy techniques;
• Considering the modern technologies and adaptive protocols of the Internet of Things to attain acceptable disaster management and risk mitigation.

Conclusions
Time series data, adversarial training, SG, alternating current (AC) optimal power flows, and other applications can all benefit from the monitoring that SMs provide. There have been several real-world SM privacy implementations in the literature, but further research and testing may be needed to increase effectiveness and decrease implementation costs. It is significant to mention that modern technology may effectively mitigate attack manipulations that affect SMs and thereby lessen the devastating impacts on both infrastructure and human life. This article sheds light on the significance of the SM and offers academics interested in security and privacy a research pivot. We draw attention to the relevant SM characteristics in many applications. Electricity service interruptions are quite expensive: for example, the annual cost of the harm done to the U.S. economy by these interruptions is estimated to be between USD 104 billion and USD 164 billion. In addition, the data privacy loss of SM-based SG/PG networks is among the main targets of cyberattacks, leading to the deterioration of the networks. Accordingly, losing data privacy is very costly and rapidly degrades the national economy. Therefore, we concentrate on the SMs' weaknesses. Then, we take a look at eleven trust models used for SM security, which are among the widely used techniques for safeguarding the privacy of the data observed by SMs in SG and PG networks. Next, we compare the current methods for protecting the data privacy of SMs. In addition, insightful suggestions are made for interested researchers, taking into account the critical role that SM protection plays in catastrophe management, whether on the level of infrastructure or human life.

Figure 1. Framework of smart meter applications with attack existence.

Figure 3. LSTM diagram at a time t.

Table 1. List of abbreviations.