Article

Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration

1
State Grid Economic and Technological Research Institute Co., Ltd., Beijing 102200, China
2
School of Geomatics and Urban Spatial Informatics, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
*
Author to whom correspondence should be addressed.
Energies 2024, 17(17), 4380; https://doi.org/10.3390/en17174380
Submission received: 19 June 2024 / Revised: 17 August 2024 / Accepted: 30 August 2024 / Published: 1 September 2024
(This article belongs to the Section K: State-of-the-Art Energy Related Technologies)

Abstract

This paper addresses the problem of multi-source survey data sharing in power system engineering by proposing two improved methods: a survey data sharing method combined with differential privacy and a permission change method based on attribute encryption. The survey data sharing method integrated with differential privacy achieves effective cross-professional and cross-departmental data sharing while ensuring data security by introducing a multi-discriminator architecture and dynamic noise adjustment. To reduce the computational and communication overhead when user permissions change during survey data sharing, the attribute encryption-based permission change method supports dynamic changes in user permissions. The effectiveness of the proposed methods has been validated through targeted experiments in different scenarios. The work in this paper provides a new solution for dynamic sharing of survey data in power network engineering and contributes to the digital transformation of power network projects.

1. Introduction

Data sharing facilitates the integration of disparate datasets, thereby enhancing the accessibility and utility of information. It accelerates the dissemination of knowledge and augments the interoperability of data. By standardizing data models and unifying service interfaces, dynamic and real-time sharing of multi-source survey data in power grid engineering can be achieved. This approach not only elevates the performance of data services but also diminishes the costs associated with collaborative communication. Consequently, it fosters the fluidity of data and professional collaboration within the realm of power grid surveying.
However, the dynamic sharing of data often encounters issues such as poor data security and high computational and communication costs. Scholars both domestically and internationally have conducted research and practice to address these challenges. In response to the issue of data security, papers [1,2,3] propose a searchable encryption algorithm that enables data search capabilities while maintaining encryption. However, cloud servers are often semi-trusted, presenting the problem of untrustworthy cloud services. Papers [4,5] introduce decentralized storage methods, effectively alleviating reliance on a single centralized server, but this leads to increased storage space requirements and heightened computational costs. To counter the lack of privacy in shared data, papers [6,7,8,9] propose a federated learning privacy protection algorithm. Nevertheless, federated learning necessitates multiple rounds of parameter transmission between clients and servers, resulting in high communication costs, especially under conditions of limited network bandwidth or a large number of participants. To address the high computational and communication costs in data sharing, Chen et al. [10] employ gradient compression techniques to mitigate the communication costs associated with gradient transmission. Papers [11,12,13] propose a CP-ABE algorithm to enhance the efficiency of data sharing, yet the downside is that a satisfactory balance between computational and communication costs and security has not been achieved.
Currently, research on data sharing in the field of power grid engineering predominantly focuses on the sharing of data related to power equipment [14,15,16,17], electricity consumption [18,19], power line losses [20], electricity trading [21], and the supply chain of power materials [22]. However, there is a notable absence of targeted research on data sharing in the domain of power grid surveying. The data sharing systems in this area are still at a rudimentary stage, merely applying pre-existing sharing algorithms without tailored improvements that align with the surveying business processes. This approach results in several issues, including weak algorithmic adaptability, data latency, and inadequate data security.
In power grid surveying, the locations of substations, pylons, and line routes must be surveyed and mapped, and planning must take topographic, geomorphological, and other geographical information into account, so meteorological, geotechnical, and hydrological experts are needed to collect and provide data. In current practice, surveyors mostly carry out mapping on site and upload the data to the system via a terminal device only after all data collection has been completed. As a result, data are invisible across disciplines and are collected in duplicate. Because the survey data are diverse in content and format and the delivery process is poorly secured, data also go missing. This makes it impossible for design, operations, and maintenance departments to understand the overall situation of the area in real time, and forces the construction cycle to be extended.
Furthermore, as the survey industry continues to evolve, privileges change dynamically between the survey, operations, and design departments, and existing systems struggle to cope with the ever-increasing computational and communication overhead that these changes cause. For example, when the design department uses the data shared by the surveying department to perform a comprehensive analysis of topography, vegetation cover, geological hazards, etc. in combination with the parameters of electrical power routing, the data operation rights of the users belonging to the surveying department must be changed to read-only. With a large number of users and a large amount of survey data, re-encrypting the data for such a change brings huge computational and communication overhead.
This paper analyses the actual workflow of the survey business, combines the content and characteristics of the survey data, and puts forward corresponding solutions to the problems of data sharing in cross-disciplinary and cross-departmental business collaboration for power grid survey data. The contribution of this paper is summarized as follows:
(1)
Addressing the security of cross-professional inter-survey data sharing, we propose a novel method that integrates differential privacy with a generative adversarial network (GAN). Specifically, our approach extends the discriminator in the original GAN to multiple discriminators. Furthermore, a dynamic noise adjustment algorithm is designed to mitigate the impact of differential privacy noise on the utility of the shared data. Empirical results demonstrate that the proposed method effectively facilitates the sharing of surveying data while maintaining high data utility.
(2)
Addressing the problem of excessive computational and communication overheads due to dynamic changes in cross-departmental permissions, we introduce an attribute-based encryption (ABE) method for managing permission alterations. Within the ciphertext-policy ABE framework, flexible management of user permissions is achieved by updating user private keys, thereby resolving the issue of excessive computational and communication costs associated with dynamic user access permissions in the secure sharing of surveying data.

2. Grid Engineering Survey Data and Its Characteristics

In the realm of power grid engineering, surveying constitutes a pivotal preparatory phase for the design and construction of the grid. It encompasses a comprehensive investigation of key elements such as the alignment of transmission lines and the positioning of tower foundations, integrating geographical, geological, environmental, land, and resource assessments. The data procured through surveying in the context of power grid projects are referred to as power grid surveying data. These data are instrumental in comprehending, analyzing, and planning various facets of power grid engineering to ensure the safety, stability, and efficiency of the grid’s operation. Table 1 delineates the specific content of the power grid surveying data.
Power grid surveying data are primarily characterized by diversity in expressed content, complexity of data formats, and multiple scales and resolutions.
(1)
Diversity in Expressed Content: The data from power grid surveying encompasses a wide array of information from various domains such as geography, geology, environment, land, and resources. It includes not only the alignment of transmission lines, precise coordinates of tower bases, and accurate elevations of the terrain, which are critical factors, but also data from geological exploration points.
(2)
Data Format Complexity: The surveying data for power grid engineering comprise a multitude of data formats, originating from various surveying methods and technologies. For instance, remote sensing imagery data is often stored in formats such as TIFF, PNG, JPG, GeoTIFF, IMG, GIF, and BMP. In contrast, geotechnical data may utilize formats like XML, HTML, JSON, YAML, and CSV. Furthermore, structured data, including sensor data and fundamental control survey information, might be present in formats such as TXT, DAT, BRN, CSV, or structured formats like XML, HTML, JSON, and YAML. Unstructured data, such as three-dimensional model data, could be in formats like CGR, DWG, DXF, DWF, DGNPLN, and RVT.
(3)
Multi-scale and Multi-resolution: From the perspective of data resolution, at the macro scale, data typically exhibit lower resolution, which is suitable for large-scale power grid layout and planning. For example, satellite remote sensing data is utilized to assess the topography and land use over vast areas. At the meso scale, data with higher resolution can display more detailed geographical features and environmental elements, which is applicable for the specific route selection and preliminary design of power grid lines. The micro scale, on the other hand, provides data with the highest resolution, such as high-precision topographic data obtained through ground surveys and LiDAR scanning, which aids in the precise positioning, design, and determination of construction details for tower bases.
In conclusion, the multiplicity of surveying data content and formats introduces considerable challenges to data sharing, profoundly impeding the circulation and dissemination of data across various departments and disciplines. Furthermore, the potential disclosure of sensitive geographical information poses significant security risks and could precipitate a crisis of confidence, thereby rendering data security and privacy protection one of the paramount challenges in the realm of power grid surveying for data sharing.

3. Research on Grid Survey Data Sharing Algorithm

The diversity and complexity of multi-source power network survey data across different disciplines and departments pose new challenges for data security and user rights management in data sharing. On the one hand, from the perspective of enhancing privacy and security, advanced encryption techniques are needed to protect the confidentiality and integrity of data in the sharing process. On the other hand, from the perspective of saving computational overhead, when the user's demand for data sharing changes, the user's original privileges should be revoked and updated to new privileges in a timely manner to meet the user's specific requirements. Based on the above analysis, this paper conducts further research on the differential privacy mechanism and the ciphertext-policy attribute-based encryption mechanism, tailored to the characteristics of grid survey data, to achieve cross-discipline and cross-sector sharing of grid survey data. The flow of algorithmic research in this paper is shown in Figure 1.

3.1. Survey Data Sharing Methods Combining Differential Privacy

Differential privacy technology was proposed by Dwork [23] in 2004. Its fundamental idea is that when outputting information from a dataset, through rigorous mathematical proof, a method of random response is adopted. When the data is affected by a single record, it ensures that the impact is always below a certain threshold, thus preventing third parties from determining changes or additions of individual records based on changes in the output. Currently, in privacy protection methods, this approach is considered to be the most secure. For a random algorithm M, PM is the set of all possible outputs of algorithm M. If for any pair of neighboring datasets X and X’, and for any subset T of PM, the algorithm M satisfies:
$$\Pr[M(X) \in T] \le e^{\varepsilon}\,\Pr[M(X') \in T] \qquad (1)$$
Then the algorithm M is said to satisfy differential privacy, where the parameter ε is the privacy preserving budget.
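As a minimal illustration of this definition (not part of the proposed method), the sketch below uses binary randomized response, a textbook ε-differentially private mechanism, and checks the inequality numerically for every output subset T. The function names are hypothetical, and the two neighboring datasets are assumed to be single records holding the bits 0 and 1.

```python
import math


def rr_output_probs(true_bit: int, epsilon: float) -> dict:
    """Output distribution of binary randomized response.

    The true bit is reported with probability e^eps / (1 + e^eps)
    and flipped otherwise.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return {true_bit: p_truth, 1 - true_bit: 1.0 - p_truth}


def satisfies_dp(epsilon: float, tol: float = 1e-12) -> bool:
    """Check Pr[M(X) in T] <= e^eps * Pr[M(X') in T] for every subset T.

    X and X' are neighboring single-record datasets holding bit 0 and bit 1;
    the check is done in both directions.
    """
    p0 = rr_output_probs(0, epsilon)
    p1 = rr_output_probs(1, epsilon)
    bound = math.exp(epsilon)
    for subset in [(), (0,), (1,), (0, 1)]:
        a = sum(p0[o] for o in subset)
        b = sum(p1[o] for o in subset)
        # tol absorbs floating-point rounding where the bound is tight.
        if a > bound * b + tol or b > bound * a + tol:
            return False
    return True
```

A smaller ε forces the two output distributions closer together, which is why a smaller privacy budget corresponds to stronger protection and noisier outputs.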
Differential privacy, by introducing an appropriate amount of noise, effectively protects the privacy of the original data without significantly affecting the accuracy of the data. This has been fully confirmed in the literature [24,25,26,27], demonstrating its outstanding performance in the protection of data security. Therefore, in response to the security issues of survey data sharing, this paper proposes a survey data sharing method combined with differential privacy. Considering that the noise introduced by differential privacy may have a negative impact on the validity of the data, this paper designs a dynamic noise adjustment algorithm, which reduces the impact of noise disturbance on shared data by dynamically adjusting the size of the noise during the training process.

3.1.1. Overview of the Methodology

The overall architecture of the method is shown in Figure 2. Data providers who each hold a discriminator D jointly train the generator G located on a semi-trusted cloud server. The survey data sharing method combined with differential privacy includes two stages: the model training phase and the shared data generation phase. In the model training phase, the cloud server and K data providers take turns to train the discriminator and the generator G. Specifically, each data provider uses a dynamic noise adjustment algorithm to update the discriminator weights after gradient disturbance, and the cloud server updates the generator weights using the discriminator feedback from the data providers that meet the requirements of differential privacy. In the shared data generation phase, latent vectors sampled from a Gaussian distribution are input into the well-trained generator, and the generator outputs shared data to the shared database for use by personnel from various departments.

3.1.2. Discriminator Feedback Construction Combining Differential Privacy

In the proposed method, the generator G produces shared data that can be indistinguishable from the actual surveying data, while the discriminator D is utilized to differentiate between the surveying data provided by the K data sharers and the shared data generated by the generator G. Based on the WGAN-GP model [28] and the Minimax strategy [29], the loss functions LG for the generator G and LD for the discriminator D, which account for the different data handled by G and D, are derived as follows:
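$$L_G = -\frac{1}{K}\sum_{k=1}^{K}\mathbb{E}_{Z \sim P(Z)}\left[D_k(G(Z))\right] \qquad (2)$$
$$L_D = \mathbb{E}_{Z \sim P(Z)}\left[D_k(G(Z))\right] - \mathbb{E}\left[D_k(x_k)\right] + \mathbb{E}_{Z \sim P(Z)}\left[\left(\lVert \nabla D_k(x_k) \rVert_2 - 1\right)^2\right] \qquad (3)$$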
In these expressions, E denotes the expectation, Z represents the latent vector sampled from a Gaussian distribution, P(Z) denotes the probability density function of the prior distribution from which Z is sampled, Dk indicates the output of the discriminator held by the k-th data sharer, and xk denotes the survey data shared by the k-th sharer. The cloud server obtains the generator weights TG that minimize LG, and the data sharers obtain the discriminator weights TD that minimize LD. The generator G and the discriminator D are implemented by the WGAN-GP network, and the minimization of LG and LD is solved using stochastic gradient descent.
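To make the two losses concrete, the following sketch evaluates them on a toy batch. It assumes the usual WGAN-GP reading of the loss terms (a negated critic score for the generator, and a penalty on the deviation of the input-gradient norm from 1), and it assumes linear critics D_k(x) = w_k · x, so that the gradient of D_k with respect to its input is simply w_k. The helper names and numbers are hypothetical; the method itself uses neural networks.

```python
import math
from statistics import fmean


def critic(w, x):
    """Linear critic D(x) = w . x (an assumption made for this toy example)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def generator_loss(critics, fake_batch):
    """L_G: negated critic score of generated samples, averaged over K critics."""
    per_critic = [fmean([critic(w, g) for g in fake_batch]) for w in critics]
    return -fmean(per_critic)


def discriminator_loss(w, real_batch, fake_batch):
    """L_D for one data provider: E[D(G(Z))] - E[D(x)] + (||grad_x D|| - 1)^2."""
    fake_term = fmean([critic(w, g) for g in fake_batch])
    real_term = fmean([critic(w, x) for x in real_batch])
    # For a linear critic the input gradient is w itself, for every input.
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return fake_term - real_term + (grad_norm - 1.0) ** 2
```

With critics `[[1.0, 0.0], [0.0, 1.0]]` and generated samples `[[1.0, 2.0], [3.0, 4.0]]`, the two critics score the batch 2.0 and 3.0 on average, so the generator loss is their negated mean.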
Considering that the intermediate results of model training can also leak private information, the method first decomposes the gradient calculation of the generator G, then desensitizes the factor obtained from the decomposition that depends on the discriminator D and therefore on the shared survey data.
The gradient of generator G is:
$$\nabla_{T_G} L_G = -\frac{1}{K}\sum_{k=1}^{K}\sum_{i=1}^{K}\nabla_{G(Z_i)} D_k(G(Z_i)) \times J_{T_G} G(Z_i) \qquad (4)$$
In Equation (4), the first factor, $\nabla_{G(Z_i)} D_k(G(Z_i))$, represents the feedback from the discriminator D, which is correlated with the discriminator weights TD. These weights, TD, are derived from the survey data provided by the data contributor and, as such, possess sensitive attributes. The second factor, $J_{T_G} G(Z_i)$, can be computed based solely on the generator G and is independent of the discriminator weights TD, thereby lacking sensitive attributes. To clarify the exposition, this section elaborates on the desensitization process applied to the discriminator held by the data contributor. According to the post-processing property of differential privacy [30], if the discriminator's weights are updated under the constraints of differential privacy, then the feedback $\nabla_{G(Z_i)} D_k(G(Z_i))$ carries the same differential privacy guarantee, and the generator weights TG updated by the cloud server using the discriminator's feedback also meet the requirements of differential privacy. The Adam optimizer is commonly used with the WGAN-GP model. Consequently, this section introduces a dynamically noise-adjusted Adam optimizer designed to update the discriminator weights TD under the constraints of differential privacy. The details are shown in Algorithm 1.
Algorithm 1. Discriminator Weight Updates Combined with Differential Privacy
Inputs: discriminator weights TD, discriminator loss function LD, survey data data_x, generator-synthesized shared data data_g, learning rate l_r, privacy budget for differential privacy ep, differential privacy sensitivity delta, first-order momentum estimate m, second-order momentum estimate v, threshold C, and Gaussian noise standard deviation S.
Initialization: m = 0, v = 0, sigma = sqrt(delta/(2 × ep))
for each iteration in training: //Each iterative step in the training process
    //Calculate the loss for real and generated data
    loss_real = LD(data_x, TD)
    loss_fake = LD(data_g, TD)
    //Gradient calculation; the gradient function computes the gradient of the loss function with respect to the model parameters
    grad_real = gradient(loss_real, TD)
    grad_fake = gradient(loss_fake, TD)
    //Merge the gradients and compute the average gradient
    grad = (grad_real + grad_fake)/2
    //Update the first- and second-order momentum estimates; beta1 and beta2 are the first- and second-order momentum parameters of the Adam optimizer
    m = beta1 × m + (1 − beta1) × grad
    v = beta2 × v + (1 − beta2) × grad^2
    //Calculate the adaptive learning rate; t denotes the current number of iterations
    adaptive_lr = l_r × sqrt(v/(1 − beta2^t))
    //Add noise according to differential privacy requirements; the normal_noise function generates noise from a Gaussian distribution
    noise = normal_noise(mean = 0, std = S)
    //Update the discriminator weights while accounting for the privacy-preserving noise
    TD = TD − adaptive_lr × (m + noise)
    //Update the privacy budget
    ep = ep − delta
    //Stop updating if the privacy budget is depleted or less than the threshold C
    if ep < 0 or ep < C:
        break
return TD
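A minimal runnable sketch of one noisy discriminator update in the spirit of Algorithm 1 is given below. It deliberately differs from the listing in three labeled ways: it uses the standard Adam step (bias-corrected moments, division by the square root of the second moment), it clips the gradient before perturbation as is conventional in differentially private SGD, and it adds the noise to the gradient instead of to the momentum. Privacy accounting is omitted. All names and default values other than the Adam rates and learning rate reported in the experiments are hypothetical.

```python
import math
import random


def dp_adam_step(weights, grad, m, v, t, lr=5e-4, beta1=0.5, beta2=0.99,
                 clip_norm=1.0, noise_std=1.0, eps=1e-8, rng=random):
    """One Adam update on a clipped, Gaussian-perturbed gradient.

    weights, grad, m and v are equal-length lists of floats; t is the
    1-based step counter. Returns the updated (weights, m, v).
    """
    # Assumption: clip so that the gradient L2 norm is at most clip_norm.
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    # Perturb each clipped coordinate with zero-mean Gaussian noise.
    noisy = [g * scale + rng.gauss(0.0, noise_std) for g in grad]
    # Standard Adam moment estimates with bias correction.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, noisy)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, noisy)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    weights = [w - lr * mh / (math.sqrt(vh) + eps)
               for w, mh, vh in zip(weights, m_hat, v_hat)]
    return weights, m, v
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, and setting `noise_std=0.0` reduces the step to plain Adam on the clipped gradient, which is convenient for checking the arithmetic.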

3.1.3. Dynamic Noise Regulation

To safeguard the privacy of survey data, deep neural networks can be trained with Gaussian noise added to the gradients at each iteration, thereby ensuring differential privacy [31]. This approach meets the privacy protection needs of shared data to a certain extent. However, although keeping the Gaussian noise fixed throughout training achieves differential privacy, the gradient perturbation it introduces reduces the utility of the shared data. Consequently, this section introduces a dynamic noise adjustment algorithm that mitigates this loss of utility by adjusting the noise at each iteration of the training process.
The dynamic noise adjustment algorithm is inspired by the observed phenomenon that, at the initial stage of training, the L2 norm of the discriminator gradients is relatively large. As training progresses and the discriminator weights TD approach their optimal values, the L2 norm of the discriminator gradients gradually decreases, making them more sensitive to noise perturbations. To address this, the dynamic noise adjustment algorithm mitigates the impact of noise on the utility of shared data by adjusting the noise magnitude, denoted as Noise, according to a decay function that is specifically designed to reduce the noise-related perturbations as training advances:
$$\mathrm{Noise}_t = \frac{\sigma_0}{1 + \gamma t} \qquad (5)$$
Here, $\gamma \in (0, 1)$ represents the decay rate, t denotes the current iteration count of the discriminator, and $\sigma_0$ is the initial scale of the noise. The dynamic noise adjustment algorithm reduces the likelihood of the discriminator's weights being perturbed in the wrong direction as they approach their optimal values. Consequently, within a given privacy budget, it enhances the utility of the shared data ultimately obtained. Algorithm 2 provides the specific details of the dynamic noise adjustment algorithm.
Algorithm 2. Dynamic Noise Conditioning Algorithm
Input: attenuation rate γ, initial noise size σ0, survey data data_sources.
Initialization: noise_scales = {source: σ0 for source in data_sources}
for source in data_sources: //Iterate through each data source
    //Sample data from the current data source
    batch_data = sample_data(source)
    //Calculate the loss and gradient of the model on the current data
    loss = calculate_loss(model, batch_data)
    grad = calculate_gradient(loss, model.params)
    //Dynamically adjust the noise scale of the current data source according to the attenuation rate
    noise_scales[source] = noise_scales[source] × γ
    //Add noise for differential privacy
    noise = normal_noise(mean = 0, std = noise_scales[source])
    noisy_grad = grad + noise
return noisy_grad //Output the gradient after adaptive perturbation
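The decay schedule can be sketched in a few lines. This version follows the decay function stated in the text, σ0 / (1 + γt), which is not the same as the multiplicative update written in the Algorithm 2 listing. The defaults use the initial noise scale (2) and decay rate (1.1 × 10^-4) reported in the parameter settings, while the function names are hypothetical.

```python
import random


def noise_scale(t: int, sigma0: float = 2.0, gamma: float = 1.1e-4) -> float:
    """Noise standard deviation at discriminator iteration t."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return sigma0 / (1.0 + gamma * t)


def perturb(grad, t, rng, sigma0=2.0, gamma=1.1e-4):
    """Add zero-mean Gaussian noise with the iteration-dependent scale."""
    std = noise_scale(t, sigma0, gamma)
    return [g + rng.gauss(0.0, std) for g in grad]
```

With these defaults the scale starts at 2.0 and decays slowly, so late iterations, where gradients are small, receive less perturbation than early ones.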

3.2. Attribute Encryption Based Permission Change Method

In order to solve the problem of excessive computation and communication overhead caused by the dynamic change of user access privileges, this paper proposes an attribute encryption-based privilege change method. The key is generated and managed by the authorization center (AC), which allows the data owner (DO) to define the access policy and encrypt the data based on the set of user privileges. When a user’s privileges change, the authorization center (AC) updates the user key to reflect the new set of privileges. After the user is authenticated, the encrypted data is decrypted using his or her key, thereby dynamically adjusting access rights without re-encrypting the entire shared dataset. This section describes the construction details of the attribute encryption-based permission change method.
The attribute encryption-based permission change method mainly uses the CP-ABE attribute encryption algorithm. The method is divided into six phases, which are system initialization phase, user key generation phase, data encryption phase, permission change phase, user authentication phase, and data decryption phase. Figure 3 illustrates the general framework of user privilege change, and the specific design of the above phases is discussed in the following.
(1)
System initialization phase: (PK,MSK)←Initialize(ɑ).
The system initialization phase is performed by the authorization center (AC). The authorization center generates a system master key and a corresponding attribute key for each permission used in the system. A security parameter, ɑ, is used to determine the security of the encryption algorithm. Execution steps: the AC selects the security parameter ɑ; the AC runs the system initialization algorithm Initialize(ɑ) to generate the global public key PK and the global private key MSK.
(2)
User key generation phase: USK←KeyGen (MSK,U).
The user key generation phase is performed by the authorization center (AC). Parameter setting: user's attribute set U. Execution steps: the AC determines the attribute set U of each user. The AC generates the user key USK using the global private key MSK and the user's attribute set U.
(3)
Data encryption phase: ct←Encrypt (PK,M,Σ).
The data encryption phase is executed by the data owner (DO). Parameter settings: plaintext M; access policy Σ, which defines the logical relationship that a set of attributes must satisfy. Execution steps: the DO defines the access policy Σ for the data. The DO encrypts the plaintext M using the global public key PK and the access policy Σ.
(4)
Permission change phase: USK1←KeyGen (MSK,U1).
The permission change phase is executed by the authorization center (AC). Parameter setting: the user’s new set of privileges U1. Execution steps: When the user’s privileges are changed, the AC determines the user’s new set of attributes U1. The AC uses the global private key MSK and the user’s new set of attributes U1 to generate a new user key USK1.
(5)
User authentication phase: Auth←Verify (ID,U or U1).
The user authentication phase is executed by the authorization center (AC). Parameter settings: user's identity ID, user's set of privileges U or U1. Execution steps: the user provides the identity ID and the set of attributes U or U1. The AC verifies the user's identity and privileges to ensure that the user has the right to access the requested data.
(6)
Data decryption phase: M1←Decrypt (USK or USK1,ct).
The data decryption stage is executed by the data user (DU). Parameter settings: user key USK or updated user key USK1, ciphertext ct. Execution steps: The DU tries to decrypt the ciphertext ct using its user key USK or USK1. Decryption succeeds if the set of attributes of the user matches the encryption policy Σ, otherwise it fails.
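The six phases can be walked through with a small mock. This is a control-flow illustration only and provides no cryptographic protection: the payload is stored unencrypted, the access policy Σ is simplified to an AND over required attributes, and an HMAC tag stands in for real CP-ABE key material. The `user_id` argument, the class names, and the attribute strings are hypothetical additions. The point it shows is that a permission change reissues only the user key and leaves the ciphertext untouched.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class UserKey:
    user_id: str
    attributes: frozenset
    tag: bytes  # AC's MAC over (user_id, attributes); placeholder for key material


@dataclass(frozen=True)
class Ciphertext:
    policy: frozenset  # simplified AND-policy: every listed attribute is required
    payload: bytes     # NOT encrypted in this mock


def _tag(msk, user_id, attributes):
    msg = (user_id + "|" + ",".join(sorted(attributes))).encode()
    return hmac.new(msk, msg, hashlib.sha256).digest()


def initialize(alpha=256):
    """Phase 1 (AC): derive a mock (PK, MSK) pair from a security parameter."""
    msk = os.urandom(alpha // 8)
    return hashlib.sha256(msk).digest(), msk


def keygen(msk, user_id, attributes):
    """Phases 2 and 4 (AC): issue a key for the current attribute set."""
    attrs = frozenset(attributes)
    return UserKey(user_id, attrs, _tag(msk, user_id, attrs))


def encrypt(pk, message, policy):
    """Phase 3 (DO): bind the message to an access policy."""
    return Ciphertext(frozenset(policy), message)


def verify(msk, usk):
    """Phase 5 (AC): check that the key was really issued by the AC."""
    return hmac.compare_digest(usk.tag, _tag(msk, usk.user_id, usk.attributes))


def decrypt(usk, ct):
    """Phase 6 (DU): succeed only if the key's attributes satisfy the policy."""
    return ct.payload if ct.policy <= usk.attributes else None
```

For example, a ciphertext created with policy `{"survey", "write"}` is readable with a key for `{"survey", "write"}`; after the AC reissues the same user's key for `{"survey", "read"}`, `decrypt` returns `None` for that ciphertext, and the ciphertext object itself was never modified.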

4. Experiment

4.1. Experimental Configuration and Data Sources

The experiments were conducted on a computer with 16 GB RAM, AMD Ryzen 7 PRO 5845 CPU, NVIDIA T1000 graphics card, and Windows 10 operating system environments, and all algorithms were implemented in Python, with a programming environment of Python 3.9.

4.2. Experimental Situation

4.2.1. Parameter Settings

For the survey data sharing method incorporating differential privacy: the number of experimental cycles was 500, the batch size was set to 20, and the model was optimized for training using the Adam optimizer with a learning rate of 5 × 10^-4. The two exponential decay rates of the Adam optimizer were set to 0.5 and 0.99, respectively. The gradient penalty coefficient was set to 10, the clipping threshold C was set to 1, the noise in the adaptive noise perturbation was set to 1, the noise initial scale was set to 2, and the decay rate was set to 1.1 × 10^-4.
For the attribute encryption-based permission change method, all compared methods were tested with access policies of increasing size, where Att denotes the number of attributes in the access policy and was incremented by 10 each time. In the simulation experiments, 10 access policies were selected and each was tested independently several times; the highest and lowest values were removed, and the average of the remaining values was taken as the experimental result for that policy.
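The averaging rule for the timing runs (drop the highest and lowest value, then average the rest) can be written as a small helper. The function name and the requirement of at least three runs are assumptions made for this sketch.

```python
from statistics import fmean


def trimmed_mean(run_times):
    """Average the run times after removing one highest and one lowest value."""
    if len(run_times) < 3:
        raise ValueError("need at least three runs to drop the extremes")
    ordered = sorted(run_times)
    return fmean(ordered[1:-1])
```

For instance, for runs of 5.0, 1.0, 3.0, 100.0, and 4.0 (in any unit), the values 1.0 and 100.0 are discarded and the reported result is the mean of 3.0, 4.0, and 5.0.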

4.2.2. Evaluation Index

The survey data sharing method combined with differential privacy: The performance of the method was evaluated using accuracy [31,32]. After multiple iterations, accuracy is a key indicator of the effectiveness of the method. By comparing the accuracy of different methods under the same conditions, the performance of the method can be judged intuitively.
The attribute-based encryption method for permission changes: Simulation experiments were conducted to test and compare the computation running time. The running time of the method is an important indicator for measuring computational overhead; the longer the running time, the greater the computational cost.

4.2.3. Experimental Results against Survey Data Sharing Methods Combining Differential Privacy

Comparison of Algorithm Performance with Different Number of Sharers

This section shows three typical training models trained on the shared dataset: Linear Regression (LR), Support Vector Machines (SVM), and Random Forest (RF).
The experimental data were obtained from four datasets collected during the actual survey operations for the 3-1 bidding section of the Ningxia section of the Ningxia-Hunan ± 800 kV UHV DC transmission line project in 2023: a meteorological dataset, a geotechnical dataset, a hydrological dataset, and a measurement dataset. The meteorological dataset contains air temperature data, wind direction and speed data, humidity data, etc.; the geotechnical dataset contains topographic and geomorphological data, rock mineral data, stratigraphic structure data, etc.; the hydrological dataset contains rainfall data, water quality data, river level and flow data, etc.; and the measurement dataset contains airborne LiDAR raw point cloud data and digital aerial photography panchromatic image data.
In this experiment, the privacy budget was set to 1, and the number of data providers was set to 3. Three sharing algorithms, ATLAS [33], DP-CGANS [34], and DPGDAN [35], represent the current mainstream approaches to sharing data. As shown in Table 2, the shared data generated by the survey data sharing method combined with differential privacy better supported various predictive tasks. This is because the dynamic noise adjustment algorithm proposed in this paper allows for better generator weights when the discriminator weights are close to the optimal values, resulting in less noise disturbance in the discriminator feedback, thereby further enhancing the utility of the shared data.
In the power grid survey business, the surrounding terrain features and environmental conditions of a station site may change rapidly during the survey phase, for example while the site area of a substation is being delineated. To ensure that all team members have access to the most up-to-date data and to maintain data consistency, all working group members must work with the same dataset and be able to make data-based judgments quickly by combining remotely sensed imagery with existing substation site and line data. It is therefore necessary to consider the impact of different numbers of data providers on the stability of the method, by comparing the accuracy of the support vector machine model on shared datasets with different numbers of data providers. As shown in Figure 4, the survey data sharing method incorporating differential privacy consistently outperforms ATLAS [33], DPGDAN [34], and DP-CGANS [35] over the entire test range for different numbers of data providers.

Comparison of Algorithm Performance under Sharing between Different Professionals

As power grid survey cycles become shorter, real-time sharing of meteorological, geotechnical, and other professional data allows potential engineering risks, such as geological hazards and extreme weather, to be detected in time and corresponding measures to be taken. This helps to avoid rework and delays caused by lagging information, and thus to control the project cost.
This experiment selected the geotechnical specialty and the hydrological specialty under the survey department of the 3-1 bidding section of the Ningxia section of the Ningxia-Hunan ±800 kV UHV DC transmission line project for inter-professional data sharing. The shared dataset of the geotechnical specialty was 6.75 GB in size, described the geological situation near a preset tower, and was in XML and CSV formats. The shared dataset of the hydrological specialty was 5.4 GB in size, described the hydrological situation near a preset tower, and was in PNG, XML, and JSON formats. Eight people participated in the data sharing: four from the hydrological specialty and four from the geotechnical specialty. The initial value of the privacy budget was 1. The security of the survey data sharing method combined with differential privacy was examined by adjusting the privacy budget value. The results are shown in Figure 5. Under the different privacy budget settings, the survey data sharing method combined with differential privacy outperforms ATLAS [33], DPGDAN [34], and DP-CGANS [35] over most of the tested range.
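To make the role of the privacy budget concrete, the following minimal sketch shows why a smaller budget costs accuracy. It uses the textbook Laplace mechanism on a bounded mean, which is a much simpler mechanism than the GAN-based method evaluated in this experiment and is swapped in here purely for illustration. The rainfall readings, the clipping bounds, and all function names are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of bounded values with epsilon-differential privacy."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record moves the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    return sum(clipped) / len(clipped) + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Average absolute error of the private mean over repeated releases."""
    rng = random.Random(seed)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    errors = (abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
              for _ in range(trials))
    return sum(errors) / trials


# Hypothetical daily rainfall readings in mm (not taken from the paper's datasets).
rainfall = [0.0, 2.5, 11.0, 4.2, 0.0, 7.8, 19.5, 3.1]
budgets = [0.1, 0.5, 1.0, 2.0, 5.0]
errors = {eps: mean_abs_error(rainfall, 0.0, 50.0, eps) for eps in budgets}
for eps, err in errors.items():
    print(f"epsilon={eps:<4} mean |error| = {err:.3f} mm")
```

Because the noise scale is inversely proportional to the privacy budget, the error shrinks as the budget grows. This is the same utility and privacy trade-off that the budget sweep in this experiment probes, although the method under test injects its noise during adversarial training rather than into a single released statistic.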

Comparison of Algorithm Performance under Sharing between Different Departments

As large-scale power grid construction proceeds, fewer and fewer line corridors with good natural conditions remain available. Many long-distance ultra-high voltage (UHV) transmission lines under construction or in planning have to pass through the central and western regions, where the geological, meteorological, and other natural environmental conditions are very complex. The Ningxia-Hunan ±800 kV UHV DC transmission line project, for example, faced various complex engineering problems during construction, including high winds, icing, landslide and mudflow hazards, and underground hollowed-out areas, which made survey and engineering management difficult. If efficient data sharing between departments is realized, the design department can make more accurate design decisions based on real-time survey data and avoid design errors caused by outdated data.
This experiment selected real-time data sharing between the survey department of the 3-1 bidding section of the Ningxia section of the Ningxia-Hunan ±800 kV UHV DC transmission line project and the design department of the Zhongnan Architectural Design Institute in Wuhan, Hubei Province. The survey department shared the geotechnical and hydrological datasets, with a dataset size of 27.6 GB and data formats of XML, HTML, JSON, YAML, and CSV. Nine people were involved in the data sharing: six in the survey department and three in the design department. The initial value of the privacy budget was 1. The security of the survey data sharing method combined with differential privacy was examined by adjusting the privacy budget value. The results are shown in Figure 6. Under the different privacy budget settings, the survey data sharing method incorporating differential privacy outperforms ATLAS [33], DPGDAN [34], and DP-CGANS [35] over the vast majority of the tested range.

4.2.4. Experimental Results of Attribute Encryption Based Permission Change Method

To verify the effectiveness of the attribute encryption-based permission change method, the KMS-CP-ABE method of reference [11], the MH-CP-ABE method of reference [12], and the CP-ABE-CPRE method of reference [13] were compared with the method of this paper in terms of time overhead. The experimental scenario was the design department using the dataset shared by the survey department to measure sections and optimize the scheduling work, a process that requires adjusting the data modification permissions of the survey department and its subordinate geotechnical, hydrological, surveying, and meteorological specialties. The experimental results are shown in Figure 7. They indicate that the attribute encryption-based permission change method meets the practical need for dynamic permission changes in terms of computational performance and is efficient enough to handle the frequent changes of user permissions in the power grid survey business.
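For readers less familiar with attribute-based access control, the sketch below illustrates only the policy logic behind a permission change. In ciphertext-policy attribute-based encryption, decryption succeeds only when the attributes bound to a user's key satisfy the access policy attached to the ciphertext. The code performs no encryption at all and does not reproduce the scheme of this paper or of references [11,12,13]; the policy, the attribute strings, and the `User` and `satisfies` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A data user holding a set of attribute strings (hypothetical naming)."""
    name: str
    attributes: set = field(default_factory=set)


def satisfies(policy, attributes) -> bool:
    """Return True if `attributes` satisfy `policy`.

    A policy is either an attribute string (leaf) or a tuple
    (gate, children) with gate "AND" or "OR".
    """
    if isinstance(policy, str):
        return policy in attributes
    gate, children = policy
    results = [satisfies(child, attributes) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate!r}")


# Hypothetical policy guarding modification of the geotechnical dataset.
policy = ("OR", [
    ("AND", ["dept:survey", "specialty:geotechnical"]),
    ("AND", ["dept:design", "task:section-measurement"]),
])

designer = User("designer_01", {"dept:design"})
before = satisfies(policy, designer.attributes)

# Permission change: grant, then revoke, one attribute.
designer.attributes.add("task:section-measurement")
after_grant = satisfies(policy, designer.attributes)
designer.attributes.discard("task:section-measurement")
after_revoke = satisfies(policy, designer.attributes)

print(before, after_grant, after_revoke)  # False True False
```

In a real attribute-based encryption deployment, granting or revoking an attribute also involves key or ciphertext updates, and that cost is what the time-overhead comparison in Figure 7 measures. The plain set operations here stand in for that step only conceptually.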

5. Conclusions

Addressing the pain points in the actual work process of power grid engineering surveys, this paper has designed a survey data sharing method integrated with differential privacy and a permission change method based on attribute-based encryption. The survey data sharing method with differential privacy leverages generative adversarial networks to extract data distribution characteristics. Because the noise disturbance introduced by differential privacy inevitably reduces the utility of the shared data, this paper has also designed a dynamic noise disturbance algorithm that adjusts the scale of the differential privacy noise during training to mitigate this impact. Experimental results show that the survey data sharing method with differential privacy can effectively achieve multi-source survey data sharing. The permission change method based on attribute-based encryption addresses the excessive computational and communication overhead caused by dynamic changes in user access permissions in the secure sharing of power grid survey data, and it supports outsourced decryption to reduce the decryption overhead for users. Compared with existing similar schemes, the newly constructed scheme has advantages in the application scenario of the power grid survey business, and the experimental results also demonstrate its efficiency in practical applications.
In the future, we plan to further improve the generalizability of the algorithms presented in this paper so that other types of survey business data can be shared better or more cost-effectively. We also plan to work with the various construction units of power grid projects to study cross-unit data sharing and permission changes, further expanding the digital methods of power grid survey.

Author Contributions

J.Z.: Conceptualization, methodology, writing—original draft preparation; B.H.: methodology, writing—original draft preparation; J.L.: supervision, writing—review; C.Z.: data curation, writing—editing; G.Y.: data curation, writing—editing; D.L.: validation, funding acquisition, data curation, writing—editing. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Science and Technology Project of State Grid Corporation of China: Research and Application of Multi-source Collaborative and Dynamic Data Sharing Technology for Power Grid Engineering Survey Data (5700-202356317A-1-1-ZN).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Jiyong Zhang, Chunhui Zhao, Gao Yu and Donghui Liu were employed by the company State Grid Economic and Technological Research Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, M.; Rui, L.; Xu, S.; Gao, Z.; Liu, H.; Guo, S. A multi-keyword searchable encryption sensitive data trusted sharing scheme in multi-user scenario. Comput. Netw. 2023, 237, 110045.
  2. Liu, Z.; Li, T.; Li, P.; Jia, C.; Li, J. Verifiable searchable encryption with aggregate keys for data sharing system. Futur. Gener. Comput. Syst. 2018, 78, 778–788.
  3. Niu, S.; Yang, P.; Xie, Y.; Du, X. Cloud-assisted ciphertext policy attribute-based data sharing encryption scheme on blockchain. J. Electron. Inf. 2021, 43, 1864–1871.
  4. Jiang, L.; Qin, Z. An efficient decentralized mobile groupwise data sharing scheme based on attribute hiding. J. Univ. Electron. Sci. Technol. 2023, 52, 915–924.
  5. Tian, G.; Hu, Y.; Wei, J.; Liu, Z.; Huang, X.; Chen, X.; Susilo, W. Blockchain-based secure deduplication and shared auditing in decentralized storage. IEEE Trans. Dependable Secur. Comput. 2021, 19, 3941–3954.
  6. Xu, Y.; Mao, Y.; Li, S.; Li, J.; Chen, X. Privacy-Preserving Federal Learning Chain for Internet of Things. IEEE Internet Things J. 2023, 10, 18364–18374.
  7. Yin, L.; Feng, J.; Xun, H.; Sun, Z.; Cheng, X. A privacy-preserving federated learning for multiparty data sharing in social IoTs. IEEE Trans. Netw. Sci. Eng. 2021, 8, 2706–2718.
  8. Huang, L.; Yi, W.; Wang, Y.; Cha, D. Research on secure data sharing method for sea-rail transportation based on federated learning and multi-party secure computing. Railw. Transp. Econ. 2024, 46, 58–67.
  9. Chen, J.; Peng, C.; Tan, W. A design scheme for user profiling based on federated learning with multi-source data. J. Nanjing Univ. Posts Telecommun. (Nat. Sci. Ed.) 2023, 43, 83–91.
  10. Chen, L.; Xiao, D.; Yu, Z.; Huang, H.; Li, M. Efficient federated learning for communication based on secret sharing and compressed sensing. Comput. Res. Dev. 2022, 59, 2395–2407.
  11. Ren, Z.; Yan, E.; Chen, T.; Yu, Y. Blockchain-based CP-ABE data sharing and privacy-preserving scheme using distributed KMS and zero-knowledge proof. J. King Saud Univ. - Comput. Inf. Sci. 2024, 36, 101969.
  12. Zhang, X.; Yao, Y.; Fu, J.; Xie, H. Policy-hiding efficient multi-authorized organization CP-ABE data sharing scheme for Internet of Things. Comput. Res. Dev. 2023, 60, 2193–2202.
  13. Zhao, K.; Kang, P.; Liu, B.; Guo, Z.; Feng, C.; Qing, Y. A CP-ABE scheme supporting cloud proxy re-encryption. J. Electron. 2023, 51, 728–735.
  14. Liu, C.; Zhang, Q.; Li, Y.; Zhang, H. Efficient storage and sharing algorithm for power information based on fog computing. J. Shenyang Univ. Technol. 2024, 46, 1–6.
  15. Guo, F.; Liu, S.; Wu, X.; Chen, B.; Zhang, W.; Ge, Q. Fault diagnosis of power transformer with unbalanced sample data based on federated learning. Power Syst. Autom. 2023, 47, 145–152.
  16. Qin, S.; Dai, W.; Zeng, H.; Gu, X. Research on secure data sharing of electric power application based on blockchain. Inf. Netw. Secur. 2023, 23, 52–65.
  17. Deng, S.; Hu, Q.; Wu, D.; He, Y. BCTC-KSM: A blockchain-assisted threshold cryptography for key security management in power IoT data sharing. Comput. Electr. Eng. 2023, 108, 108666.
  18. Yang, X.; Liao, Z.; Liu, L.; Wang, C. Power data sharing scheme based on blockchain and attribute-based encryption. Power Syst. Prot. Control 2023, 51, 169–176.
  19. Zhang, H.; Ding, P.; Peng, Y.; Sun, C. State Grid Electricity Data Sharing Program Based on CKKS and CP-ABE. Inf. Secur. Res. 2023, 9, 262–270.
  20. Xiang, Y.; Yang, L.; Chen, B.; Li, G. Research on power line loss data sharing based on differential privacy protection. Comput. Appl. Softw. 2023, 40, 333–336+341.
  21. Wang, B.; Guo, Q.; Yu, Y. Mechanism design for data sharing: An electricity retail perspective. Appl. Energy 2022, 314, 118871.
  22. Song, J.; Yang, Y.; Mei, J.; Zhou, G.; Qiu, W.; Wang, Y.; Xu, L.; Liu, Y.; Jiang, J.; Chu, Z.; et al. Proxy re-encryption-based traceability and sharing mechanism of the power material data in blockchain environment. Energies 2022, 15, 2570.
  23. Erlingsson, Ú.; Pihur, V.; Korolova, A. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, 3–7 November 2014; pp. 1054–1067.
  24. Jiang, W.; Chen, Y.; Han, Y.; Wu, Y.; Zhou, W.; Wang, H. A privacy-preserving approach for mix-and-shuffle differentials during K-Modes clustering data collection and distribution. J. Commun. 2024, 45, 201–213.
  25. Fan, H.; Xu, W.; Fan, X.; Wang, Y. Analysis and outlook of the application of privacy computing in new power systems. Power Syst. Autom. 2023, 47, 187–199.
  26. Yu, H.; Liang, Y.; Song, J.; Li, H.; Xi, X.; Yuan, J. Overview of the development of data security sharing technology and its application in the field of energy and electric power. Inf. Secur. Res. 2023, 9, 208–219.
  27. Sadeghi, P.; Korki, M. Offset-symmetric Gaussians for differential privacy. IEEE Trans. Inf. Forensics Secur. 2022, 17, 2394–2409.
  28. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems; NeurIPS: La Jolla, CA, USA, 2017; p. 30.
  29. Wang, Y.; Ren, T.; Fan, Z. UAV air combat maneuver decision making based on bootstrap Minimax-DDQN. Comput. Appl. 2023, 43, 2636–2643.
  30. Zhao, Y.; Yang, M. A review of progress in differential privacy research. Comput. Sci. 2023, 50, 65–276.
  31. Xie, L.; Lin, K.; Wang, S.; Wang, F.; Zhou, J. Differentially private generative adversarial network. arXiv 2018, arXiv:1802.06739.
  32. Xu, L.; Skoularidou, M.; Cuesta-Infante, A.; Veeramachaneni, K. Modeling Tabular data using Conditional GAN. arXiv 2019, arXiv:1907.00503.
  33. Wang, Z.; Cheng, X.; Su, S.; Liang, J.; Yang, H. ATLAS: GAN-Based Differentially Private Multi-Party Data Sharing. IEEE Trans. Big Data 2023, 9, 1225–1237.
  34. Wang, Z.; Cheng, X.; Su, S.; Wang, G. Differentially private generative decomposed adversarial network for vertically partitioned data sharing. Inf. Sci. 2023, 619, 722–744.
  35. Sun, C.; van Soest, J.; Dumontier, M. Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy. J. Biomed. Inform. 2023, 143, 104404.
Figure 1. Research flowchart of grid survey data sharing algorithm.
Figure 2. Schematic diagram of survey data sharing methods incorporating differential privacy.
Figure 3. Attribute encryption-based permission change methodology framework.
Figure 4. Algorithm performance under different data sharers on shared dataset.
Figure 5. Performance of the algorithm under sharing among different specialties on the shared dataset.
Figure 6. Algorithm performance under sharing between different departments on a shared dataset.
Figure 7. Computation overhead comparison.
Table 1. Data content of electrical network engineering surveys.

| Name | Content | Format | Structured vs. Unstructured | Real-Time vs. Non-Real-Time |
|---|---|---|---|---|
| Image data | Including remote sensing data, aerial data, laser point cloud data, etc. | TIFF, PNG, JPG, GeoTiff, IMG, GIF, BMP | Unstructured | Real-time |
| Sensors data | Includes pressure sensor data, radar sensor data, humidity sensor data, etc. | TXT, DAT, BRN, CSV | Structured, unstructured | Real-time |
| Basic control measurement data | Basic control measurement information element attribute information | XML, HTML, JSON, YAML, CSV | Structured | Real-time |
| Geotechnical data | Attribute information of exploration data elements of exploration points, etc. | XML, HTML, JSON, YAML, CSV | Structured | Real-time |
| 3D modeling data | Three-dimensional modeling data of power grid engineering facilities and the surrounding environment | CGR, DWG, DXF, DWF, DGN, PLN, RVT | Unstructured | Non-real-time |
Table 2. Comparison of scheme accuracy (values reported as fractions).

| Model | ATLAS [33] | DP-CGANS [35] | DPGDAN [34] | Ours |
|---|---|---|---|---|
| LR | 0.7888 | 0.7303 | 0.7262 | 0.8547 |
| SVM | 0.7762 | 0.7235 | 0.7061 | 0.8426 |
| RF | 0.7748 | 0.7312 | 0.7133 | 0.8219 |
| AVG | 0.7799 | 0.7283 | 0.7152 | 0.8397 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, J.; He, B.; Lv, J.; Zhao, C.; Yu, G.; Liu, D. Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration. Energies 2024, 17, 4380. https://doi.org/10.3390/en17174380
