1. Introduction
Satellite images of Earth are generated by imaging satellites, which may be operated by governments or enterprises. These images are captured through remote sensing (RS) technologies. In general, RS can be described as the process of collecting and analyzing information about an entity, region or event without being in physical contact with it [
1]. RS data is considered a very useful source of information for many applications, such as land use classification, especially when integrated with artificial intelligence technology [
2,
3,
4]. The size of satellite images is increasing because of the growing demand for better resolution of images, and along related lines, the growing amount of RS data has enabled the study of various complex research topics [
5,
6,
7]. However, analyzing RS images adequately typically requires emerging deep learning (DL) techniques with complex architectures and heavy computational workloads. To meet these requirements, many researchers apply DL techniques on cloud computing platforms to extract insights and useful information. In such cases, data workflows are subject to privacy concerns because of the public nature of the data processing and of the tools used to manage it: data privacy cannot be ensured, and data leakage may occur. Nevertheless, cloud computing offers several benefits, such as hiding architectural complexity, cost savings, flexibility, and scalability. The optimal solution is therefore one that retains these benefits while mitigating the downsides. In particular, several privacy challenges must be addressed in cases where satellite images are transmitted or stored using public DL techniques [
8]. Al-Rubaie and Chang provide an overview of the privacy-preserving deep learning (PPDL) techniques that can be adopted to safeguard the privacy of either individual or business users [
9]. From such work it becomes apparent that PPDL techniques can be used to benefit from public data analytics while also preventing data leakage and keeping sensitive information private from unauthorized access and illegal usage.
DL is commonly used to build predictive models for image processing and for both speech and text recognition applications. These models are more precise when trained on large datasets. Prediction consists of studying available data and then using the acquired knowledge to produce information that was not previously available. In many cases, though, these data also contain sensitive information that requires protection. An important challenge is therefore to preserve the privacy of such data when they are sent to the public cloud for processing and analysis. In most cases, personal computers lack the performance needed to process massive satellite images. Consequently, extracting useful knowledge and insights from such RS data increasingly requires big data analysis on public cloud servers. With this growing reliance on cloud services, the privacy of data collected and processed by cloud service providers during DL training is also becoming a more challenging concern [
10]. Satellite images could contain sensitive information, such as oilfield, airport, and military locations, that can be stolen and misused. Likewise, if such images are processed without protection, then this makes it easy for sensitive information to be extracted and used for illegal purposes. This indicates the need to find a reliable privacy method that will ensure that big satellite images are encrypted over cloud servers in ways that cannot be compromised. Thus, exploring PPDL techniques applied to satellite images becomes both a challenging topic and a potentially rewarding one.
The motivation driving this research project is to link two cutting-edge research topics: DL and privacy. Indeed, the progress of machine learning (ML) and of its subfield, deep learning (DL), need not come at the expense of privacy or data security. Therefore, this research work proposes an approach based on PPDL, applied to big satellite images, in order to maintain anonymity and safeguard data privacy. Our main contribution is to apply PPDL to satellite image data, which, to the best of our knowledge, is an approach that has not been proposed or attempted anywhere in the literature. In particular, the contributions of this study are:
Proposing a PPDL technique, namely partially homomorphic encryption (PHE), for privacy-preserving satellite image classification.
Applying the PHE technique on a proposed DL-based CNN model and existing transfer learning models. (As our results later will demonstrate, this results in promising performance for two test cases.)
Testing the performance of the proposed PPDL technique on a real-world satellite image dataset.
The remainder of this paper is organized as follows:
Section 2 describes the theoretical background, specifically by introducing DL and PPDL as techniques. A review of related work is presented in
Section 3, while
Section 4 details the proposed method of using PPDL for satellite image classification. Experimental results are reported and discussed in
Section 5. Concluding remarks and directions for future work are presented in
Section 6.
3. Related Works
This section reviews relevant research concerning PPDL. It also provides a comparison between previous research works and the current research.
Phong et al. [
39] have proposed an HE-based approach. In their work, the authors considered the main issue of the Shokri and Shmatikov system, which tends to leak users’ local data to the cloud server when many users perform neural network-based DL. Phong et al. address this by building an enhanced DL system using additive HE. This system can prevent information leakage while still preserving accuracy. The results of Phong et al.’s work demonstrate that the system does not leak any user information to an honest-but-curious cloud service, while the use of HE adds only a reasonable amount of overhead to the DL system.
Another solution based on data encryption using FHE was proposed by Vizitiu et al. [
40]. Here, the authors proposed to encrypt the input data and send it to the server for prediction. Their approach took advantage of the MORE framework, which does not reveal patient records. The model performance was evaluated using medical imaging and the MNIST dataset. Compared to the plain form, the experimental results indicated that Vizitiu et al.’s proposed solution achieves similar accuracy over the clinical dataset and approximately identical precision over the MNIST dataset. As for performance, the encrypted approach improved latency relative to unencrypted approaches. As for privacy, the authors noted that while the MORE framework gives a certain degree of privacy, it is still susceptible to chosen-plaintext attacks.
Wang and Chang [
41] also developed an approach based on DP techniques. The authors considered a two-party image classification problem in which the data owners retain the images, and unreliable data users train the ML model with any of these images as input. Wang and Chang aimed to preserve data usability for image classification while at the same time maintaining data privacy. The authors proposed to use randomized response to perturb the image locally, which satisfies local DP. They also introduced DCA-Conv, a supervised image feature extractor, to manage the trade-off between usability and privacy. Their results demonstrated that DCA-Conv can achieve a high degree of data usability while still maintaining privacy.
Abadi et al. [
42] have proposed another work based on DP. Here, the authors considered the problem of model training, which may expose private and sensitive information in image datasets. Within the framework of DP, Abadi et al. introduced techniques to improve computational performance, including efficient algorithms for computing gradients, dividing the work into smaller batches, and applying a differentially private principal projection at the input layer. The experiments were conducted with the TensorFlow ML framework, and the results indicated that the trained model can achieve relatively high efficiency, privacy protection, and model consistency.
Huang et al. [
33] have proposed a framework based on the SS technique. These authors considered mobile sensing data protection and response time in cloud computing. Using an encryption-based secret exchange strategy, they developed a privacy-preserving CNN for feature extraction. The heavy computation was moved from cloud servers to edge servers in order to reduce delays between the cloud server and the mobile device. Their results demonstrated the safety, efficacy, and reliability of the scheme through both theoretical analysis and empirical studies.
Another work based on the SS technique has been proposed by Ma et al., who considered the privacy of facial image data on cloud servers [
43]. They proposed an AdaBoost-based system for face recognition (POR), which was designed to protect users’ facial characteristics and the service providers’ privacy based on additive secret sharing techniques. This system consisted of two edge servers assigned to the complex POR computing operation. The authors enhanced the additive secret sharing-based technique features by increasing the efficient input domain. Through theoretical analysis, they demonstrated the consistency and security of the technique. The results of their experiment indicated a decrease in computational error as compared to the current differential privacy-based framework.
Xia et al. [
44] have proposed a scheme inspired by additive secret sharing techniques. They considered the problem that encryption restricts the effective use of images. They also proposed a set of more efficient additive secure computation protocols on numbers and equations. With the assistance of their protocols, including the total operation of image classification in the unencrypted domain, they extracted CNN features, reduced the dimension of these features, and generated the index safely. They also evaluated the proposed scheme in terms of encryption reliability, recovery precision, and recovery efficiency using the Corel image dataset. The experimental results showed the higher reliability and efficiency of this new scheme.
Erkin et al. [
45] have proposed a framework based on the SMPC technique. The authors considered a case in which one party provides a facial image while the other party has access to a facial database, then introduced an extreme privacy-enhanced facial recognition framework that effectively protects both the input data and the server output that is running the matching function. The experiment’s results proved that the privacy-preserving framework is accurate, and also that it is possible to perform the protocol on modern hardware platforms.
According to the literature summarized here, we can conclude that many studies have used different PPDL techniques as well as diverse image datasets. However, as shown in
Table 2, none of these works addresses PPDL for satellite image data.
Nevertheless, all of the previous studies demonstrate the ability to maintain data privacy while preserving data usability when processing DL models. Building on those works, our research is evaluated on a proposed DL model, namely a CNN, with a new dataset of real-world satellite images. Additionally, several experiments are conducted using state-of-the-art transfer learning models to evaluate the performance of the method we propose.
4. Proposed Method
This section will describe the proposed method for PPDL in the specific case of satellite image classification. The proposed workflow, which is based on PHE and CNN, is illustrated in
Figure 3. During pre-processing, the data are encrypted on the client side with a public key and cannot be decrypted without the private key. Therefore, only the encrypted data (cipher-images) are accessible to the CNN-based model. Direct training on cipher-image data is achieved through the partially homomorphic property of the Paillier encryption scheme; more details about PHE and the Paillier scheme are presented in
Section 4.1. After the training process is completed, the model undergoes the testing phase with new cipher-image data, which are encrypted using the same public key as in the training process. The proposed CNN-based model architecture is presented in
Section 4.2. Finally, since the cloud server operates directly on cipher-images data, data privacy is maintained during both training and testing. As a result, satellite image data processing is carried out securely and unauthorized parties cannot decipher the data.
4.1. The Proposed Encryption Method for Satellite Images
As previously mentioned in
Section 2.2.1, the HE technique has three sub-types that can be used for preserving data privacy. In this paper, the technique used for encrypting the data is PHE, which we selected because it incurs less computational overhead than the alternatives. FHE, by contrast, relies on a lattice-based cryptosystem and is not yet a realistic scheme, either conceptually or technically. In particular, its bootstrapping step (the intermediate refreshing of a processed ciphertext) is computationally expensive and considerably more complicated than most other options. SHE, on the other hand, supports only a limited number of sequential ciphertext multiplications and additions, whereas PHE supports an unbounded number of operations of a single type [
16,
46].
The phases of the PHE technique will be described in
Section 4.1.1.
4.1.1. Partially Homomorphic Encryption Schemes
There are many PHE schemes, each of which enhances a particular aspect of PHE. The first realization of a public-key (PK) cryptosystem was Rivest-Shamir-Adleman (RSA). Public-key encryption is asymmetric-key encryption, a type of algorithm that requires two different keys: a private key known only to its owner and a public key [
47]. The RSA scheme was established by Rivest, Shamir and Adleman (1977) as the first public-key cryptosystem for asymmetric-key encryption with a homomorphic property, namely multiplicative HE. However, it does not fulfill strong security requirements: to achieve semantic security, RSA must pad a message with random bits before encryption, which destroys the homomorphic property [
48]. For this reason, the implementation in this research is based on the Paillier scheme. The Paillier cryptosystem was created by Pascal Paillier (1999) as a probabilistic asymmetric algorithm for public-key cryptography [
47]. Unlike RSA, whose homomorphic property is multiplicative, the Paillier scheme is homomorphic with respect to addition; it therefore realizes additive HE. Electronic voting is a typical application of additive HE: each vote is encrypted, and only the total is decrypted [
32].
Paillier Scheme
The encryption scheme of Paillier is composed of three phases: key generation, encryption, and decryption, as depicted in Algorithms 1, 2, and 3 respectively [
32,
47].
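The decryption of a Paillier scheme requires a ciphertext that is generated by the encryption process. The public key for encryption is $(n, g)$ and the private key for decryption is $(\lambda, \mu)$.
Algorithm 1: Key generation (p, q)
Input: two large primes $p$ and $q$ such that $\gcd(pq, (p-1)(q-1)) = 1$. Output: public key $(n, g)$ and private key $(\lambda, \mu)$.
1. Compute $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$.
2. Select an integer $g \in \mathbb{Z}^{*}_{n^2}$.
3. Compute $\mu = \left(L(g^{\lambda} \bmod n^2)\right)^{-1} \bmod n$, where $L(x) = (x-1)/n$.
Algorithm 2: Encryption (m, pk)
Input: message $m$ with $0 \le m < n$ and public key $pk = (n, g)$. Output: ciphertext $c$.
1. Choose a random integer $r$ between 1 and $n$ such that $\gcd(r, n) = 1$.
2. Compute the ciphertext as: $c = g^{m} \cdot r^{n} \bmod n^2$
Algorithm 3: Decryption
Input: $c$. Output: $m$.
1. Calculate the plaintext message as: $m = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$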
The Paillier cryptosystem is characterized by the following homomorphic properties [
47]:
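Addition of plaintexts: multiplying two ciphertexts yields a ciphertext that decrypts to the sum of their respective plaintexts, as described in the following formula, where $E$ and $D$ denote encryption (Algorithm 2) and decryption (Algorithm 3):
$$D\left(E(m_1, r_1) \cdot E(m_2, r_2) \bmod n^2\right) = m_1 + m_2 \bmod n$$
Multiplying a ciphertext by $g$ raised to a plaintext yields a ciphertext that decrypts to the sum of the two plaintexts, as described in the following formula:
$$D\left(E(m_1, r_1) \cdot g^{m_2} \bmod n^2\right) = m_1 + m_2 \bmod n$$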
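The three phases of the Paillier scheme and its additive property can be illustrated with a minimal Python sketch. This is not the implementation used in this study: the function names are hypothetical, the primes are toy values chosen only for readability, and the choice $g = n + 1$ is a common simplification assumed here rather than taken from the text. Keys of this size offer no security.

```python
import math
import secrets


def L(x, n):
    """L(x) = (x - 1) / n, defined for x congruent to 1 modulo n."""
    return (x - 1) // n


def keygen(p, q):
    """Algorithm 1: derive a key pair from two primes p and q."""
    n = p * q
    if math.gcd(n, (p - 1) * (q - 1)) != 1:
        raise ValueError("unsuitable primes")
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1  # assumption: a common simple choice for g
    mu = pow(L(pow(g, lam, n * n), n), -1, n)  # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(m, pk):
    """Algorithm 2: c = g^m * r^n mod n^2 with a fresh random r."""
    n, g = pk
    r = secrets.randbelow(n - 1) + 1  # uniform in [1, n-1]
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(c, pk, sk):
    """Algorithm 3: m = L(c^lambda mod n^2) * mu mod n."""
    n, _ = pk
    lam, mu = sk
    return (L(pow(c, lam, n * n), n) * mu) % n


# Toy primes for illustration only; real keys use far larger primes.
pk, sk = keygen(1009, 1013)
n = pk[0]

pixel_a, pixel_b = 200, 37  # two 8-bit pixel intensities
c_a, c_b = encrypt(pixel_a, pk), encrypt(pixel_b, pk)

# Multiplying ciphertexts adds the underlying plaintexts.
c_sum = (c_a * c_b) % (n * n)
print(decrypt(c_a, pk, sk), decrypt(c_sum, pk, sk))  # 200 237
```

Because encryption is probabilistic, encrypting the same pixel value twice generally gives different ciphertexts, yet both decrypt to the same value.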
4.2. Proposed Convolutional Neural Network (CNN)
As previously discussed, the main goal of this study is to ensure the privacy of satellite images when using public DL methods. Thus, we propose a custom CNN model and test its performance on satellite images encrypted using the proposed technique. The proposed CNN model is composed of the following layers: three convolution layers, three pooling layers, a dropout layer, a flattening layer, and two fully connected layers, with ReLU and Softmax activation functions.
Table 3 presents the proposed CNN architecture.
Here the CNN was trained on encrypted data, which passed through a stack of three convolution layers with 32 filters for the first and second convolutions and 64 filters for the third. A max-pooling layer that subsamples the feature maps with a 2 × 2 window was placed after each convolution layer. The last two fully connected layers have 64 and 4 nodes, respectively, and ReLU activation functions were used throughout the network, except for the last layer. For regularization, a dropout layer was placed after the last pooling layer to prevent overfitting. A flattening layer was placed after the dropout layer and before the first fully connected layer in order to reshape the pooled feature maps into a single column vector. A Softmax activation function was used in the last fully connected layer (output layer) to provide a class prediction.
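The layer stack described above can be sketched with the Keras API as follows. Several values are assumptions made only for illustration because the text does not state them: the 3 × 3 kernel size, the dropout rate of 0.5, the 256 × 256 × 3 input shape, and the categorical cross-entropy loss. The SGD learning rate follows the value reported in Section 5.2. The helper `flattened_size` is a hypothetical pure-Python function that traces the feature-map size under these assumptions (Keras defaults: unpadded convolutions, pooling stride equal to the window size).

```python
CONV_FILTERS = (32, 32, 64)   # from the text
POOL_SIZE = 2                 # from the text (2 x 2 max pooling)
DENSE_UNITS = 64              # from the text
NUM_CLASSES = 4               # from the text
KERNEL_SIZE = 3               # assumption: not stated in the text
DROPOUT_RATE = 0.5            # assumption: not stated in the text
INPUT_SHAPE = (256, 256, 3)   # assumption: not stated in the text


def flattened_size(input_shape=INPUT_SHAPE):
    """Number of features reaching the first fully connected layer."""
    height, width, _ = input_shape
    for _ in CONV_FILTERS:
        # Unpadded convolution shrinks each side by (kernel - 1).
        height, width = height - KERNEL_SIZE + 1, width - KERNEL_SIZE + 1
        # Max pooling with stride = window size floors the division.
        height, width = height // POOL_SIZE, width // POOL_SIZE
    return height * width * CONV_FILTERS[-1]


def build_cnn(input_shape=INPUT_SHAPE):
    """Build and compile the sketch model (requires TensorFlow)."""
    from tensorflow.keras import layers, models, optimizers

    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for filters in CONV_FILTERS:
        model.add(layers.Conv2D(filters, (KERNEL_SIZE, KERNEL_SIZE),
                                activation="relu"))
        model.add(layers.MaxPooling2D((POOL_SIZE, POOL_SIZE)))
    model.add(layers.Dropout(DROPOUT_RATE))
    model.add(layers.Flatten())
    model.add(layers.Dense(DENSE_UNITS, activation="relu"))
    model.add(layers.Dense(NUM_CLASSES, activation="softmax"))
    model.compile(
        optimizer=optimizers.SGD(learning_rate=0.001),
        loss="categorical_crossentropy",  # assumption: loss not stated
        metrics=["accuracy"],
    )
    return model


print(flattened_size())  # 57600 under the assumptions above
```

Under these assumptions the spatial size evolves as 256 → 254 → 127 → 125 → 62 → 60 → 30, so the flattening layer outputs 30 × 30 × 64 = 57,600 features.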
Data Augmentation
Data augmentation is a technique for increasing the amount of data available for training the proposed CNN model without actually acquiring new data [
49]. This technique is used to expand the dataset into a larger one that is more appropriate for DL model training. Various strategies are used for data augmentation, including rotation, zoom, and horizontal and vertical shifts. These techniques help to improve the performance of CNNs [
50].
Different data augmentation strategies have been utilized in this research. Specifically, we applied a 90-degree rotation range, a zoom and shear range of 20%, a brightness scale between 0.2 and 1.0, and a shift range of 20% in height and width. Finally, horizontal and vertical flips have also been applied.
Table 4 shows the detailed parameters of our data augmentation processes.
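The augmentation settings listed above map naturally onto the keyword arguments of the Keras `ImageDataGenerator` class. The sketch below is one plausible mapping, not the authors' exact configuration. In particular, Keras documents `shear_range` as a shear angle in degrees, so passing 0.2 for the "20%" shear is an assumption, and `make_generator` is a hypothetical helper name. `ImageDataGenerator` is marked as deprecated in recent TensorFlow releases.

```python
# Keyword arguments for keras ImageDataGenerator, following the text.
AUGMENTATION_KWARGS = {
    "rotation_range": 90,             # degrees
    "zoom_range": 0.2,                # zoom in or out by up to 20%
    "shear_range": 0.2,               # assumption: see note above
    "brightness_range": (0.2, 1.0),   # brightness scale
    "width_shift_range": 0.2,         # fraction of image width
    "height_shift_range": 0.2,        # fraction of image height
    "horizontal_flip": True,
    "vertical_flip": True,
}


def make_generator():
    """Create the augmenting generator (requires TensorFlow)."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**AUGMENTATION_KWARGS)
```

Keeping the parameters in a plain dictionary makes it easy to log them alongside each experiment or to reuse them for every model being compared.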
5. Experiments
This section describes the dataset we utilized as well as the environment in which our experiments took place. It also presents the image encryption results, assesses the encryption scheme’s efficiency, and examines the CNN model’s performance with both encrypted and plain data in order to evaluate the efficiency of the proposed encryption method. Experiments with pre-trained models are also presented as another means of evaluating the efficiency of the proposed encryption method.
5.1. Dataset Description
In this study, experiments are conducted using satellite images produced by the French Satellite pour l’Observation de la Terre (SPOT) satellites. These satellite images were acquired using SPOT 6 and SPOT 7 with a spatial resolution of 2.5 m. They have been corrected both radiometrically and geometrically using ortho-rectification and spatial registration with sub-pixel accuracy and through close comparisons against a global reference system. Four land-cover types are identified in these regions, namely: urban, bare soil, vegetation, and road.
The dataset used in this paper is comprised of 37,774 images, as further illustrated in
Table 5.
The considered dataset is further divided into three datasets, resulting in 22,666 images for training the model (training set), 7554 images for validating the trained model (validation set), and 7554 images for assessing the model performance (testing set).
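The reported set sizes correspond to a 60/20/20 partition of the 37,774 images. The text does not state how the partition was drawn, so the following sketch assumes a seeded random shuffle; `split_indices` is a hypothetical helper that reproduces only the set sizes, not the authors' actual assignment of images.

```python
import random


def split_indices(n_images, seed=0):
    """Shuffle image indices and split them 60/20/20."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # assumption: random split
    n_val = n_test = n_images // 5        # 20% each, rounded down
    n_train = n_images - n_val - n_test   # remainder goes to training
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(37774)
print(len(train), len(val), len(test))  # 22666 7554 7554
```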
To obtain this dataset, a semantic segmentation is conducted using our previous works [
51,
52,
53]. The four classes—namely, urban, bare soil, vegetation and road—are extracted from the satellite images, meaning that the resulting images each contain both the real value of pixels of the extracted class and zero for the values of the other classes. Then, each image containing a given class is divided into non-overlapping blocks of 256 × 256 pixels and saved into folders, each with the name of the corresponding class. A sample of this dataset is depicted in
Figure 4, wherein the white represents a given land cover class and the black represents the values of other classes.
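The division into non-overlapping 256 × 256 blocks can be sketched as below. The function name `tile_boxes` is hypothetical, and dropping partial blocks at the right and bottom edges is an assumption, since the text does not say how image borders were handled.

```python
def tile_boxes(height, width, tile=256):
    """Return (top, left, bottom, right) boxes of non-overlapping tiles.

    Partial tiles at the right and bottom edges are dropped (assumption).
    """
    return [
        (top, left, top + tile, left + tile)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


boxes = tile_boxes(1000, 600)
print(len(boxes))  # 3 rows x 2 columns = 6 tiles
print(boxes[0], boxes[-1])  # (0, 0, 256, 256) (512, 256, 768, 512)
```

Each box can then be used to crop the class-masked image and save the block into the folder of the corresponding class.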
5.2. Experimental Set-Up
The hardware configuration and software used for the encryption process are:
Processor (CPU): Intel Core i5-3210M (2.50 GHz).
Memory: 4 GB.
Operating system: Windows 10 Professional.
Visual Studio Code with Python 3.9 extension.
The CNN experiments were conducted using the Google Colab platform, which allowed us to execute Python code through the browser and also provided access to NVIDIA graphics processing units (GPUs). The libraries used to conduct these experiments are the Keras DL library with the TensorFlow backend. The proposed model was trained using the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.001, a batch size of 32, and 100 epochs.
5.3. Experimental Results
5.3.1. Images Encryption
This section presents the results of image encryption we obtained using the Paillier scheme, or PHE. In addition, it also evaluates the efficiency of the encryption in terms of its reliability.
The Paillier encryption scheme enables researchers to train and test the CNN model without exposing visual information.
Figure 5 shows the results of image encryption for four samples representing the four land-cover classes, namely urban, vegetation, bare soil, and road. The original images were encrypted using the public key and decrypted using the private key. Measuring encryption efficiency and security is an important part of evaluating an image encryption scheme. Visual inspection is appropriate in certain situations, but it does not indicate the amount of information hidden. Therefore, the correlation coefficient has been used to evaluate the Paillier encryption scheme’s efficiency.
Security Evaluation
In this section, different security measurements are used to evaluate the Paillier encryption scheme’s efficiency.
The correlation coefficient (CC) measures the degree to which two variables are linearly related. Its value ranges between −1.0 and 1.0: a value of −1.0 represents a perfect negative correlation, while a value of 1.0 represents a perfect positive correlation. The correlation coefficient is calculated using the following equation [54,55,56,57]:
$$CC = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \, \sum_{i=1}^{N} (y_i - \bar{y})^2}}$$
where $x$ represents a plain image, $y$ represents an encrypted image, $\bar{x}$ and $\bar{y}$ are the means of the plain image and the encrypted image, respectively, and $N$ is the number of pixels.
An image cryptosystem is considered efficient if the encryption scheme conceals all features of the plaintext image, so that the encrypted image is unpredictable and strongly uncorrelated with it. Therefore, the encryption scheme is considered efficient when the correlation coefficient between the encrypted image and the plain image is close to zero.
The CC between a plain image and its corresponding encrypted image from the satellite image dataset is −0.0041, which is very close to zero and thus indicates almost no linear relationship. This result indicates the efficiency of the encryption scheme, as no visual features can be identified in the encrypted image.
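As an illustration of how such a value can be obtained, the following sketch computes the Pearson correlation coefficient between two flattened images in plain Python. The function name is hypothetical, and the "encrypted" data here are seeded pseudo-random bytes standing in for a cipher-image, not actual Paillier output.

```python
import math
import random


def correlation_coefficient(x, y):
    """Pearson correlation between two equal-length pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) when either image is constant.
    return cov / math.sqrt(var_x * var_y)


# A smooth 64 x 64 gradient stands in for a plain image.
plain = [(row + col) % 256 for row in range(64) for col in range(64)]
# Seeded random bytes stand in for a cipher-image (illustration only).
rng = random.Random(0)
noise = [rng.randrange(256) for _ in plain]
inverted = [255 - p for p in plain]

print(correlation_coefficient(plain, plain))     # close to 1
print(correlation_coefficient(plain, inverted))  # close to -1
print(correlation_coefficient(plain, noise))     # close to 0
```

The inverted image shows why a value near −1 is not a sign of good encryption: the image is perfectly anti-correlated with the original and can be recovered trivially.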
Moreover, a number of other security parameters are evaluated, and the results are shown in
Table 6. From
Table 6, it is clear that the encrypted images are secure and that an intruder cannot infer any information from the encrypted data. Ideally, the entropy of an 8-bit image should be close to 8, and we can see from
Table 6 that the entropy of the ciphertext image is 7.9596, much higher than the 3.12 of the plaintext image. The high contrast value (10.57) indicates a secure image. Low energy and homogeneity values show that an encryption scheme is robust and highly secure. In our case, the energy and homogeneity values are 0.0156 and 0.3884, respectively, which are lower than those of the plain images. For the mean squared error (MSE), key sensitivity, and unified average change intensity (UACI), higher values are required. From
Table 6, higher values are evident. Additionally, the low values of the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) confirm the security of the encrypted image.
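Two of these measures, entropy and PSNR, are simple enough to sketch in plain Python. The functions below follow the standard definitions for 8-bit images; their names are hypothetical, and they are not the code used to produce Table 6.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits per pixel; at most 8 for 8-bit data."""
    total = len(pixels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(pixels).values()
    )


def psnr(original, other, max_value=255):
    """Peak signal-to-noise ratio in dB between two pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, other)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


uniform = list(range(256))  # every intensity appears exactly once
constant = [7] * 256        # a single intensity

print(shannon_entropy(uniform))   # 8.0, the ideal value for a cipher-image
print(shannon_entropy(constant))  # zero: no uncertainty at all
print(psnr(uniform, uniform))     # inf
```

A cipher-image whose histogram is nearly uniform approaches 8 bits of entropy, while a low PSNR between plain and encrypted images means the encrypted image retains little of the original signal.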
5.3.2. CNN Performance
This section will present the performance of the custom CNN proposed in this study over both encrypted and plain data.
Data Augmentation
The first step in presenting the performance of this CNN model is to describe the results of our data augmentation processes. Augmentation is performed using the ImageDataGenerator class from the Keras deep learning library. The results obtained from our selected data augmentation techniques, including rotation, zoom, shear, height and width shift, brightness, and horizontal and vertical flip, are illustrated in
Figure 6. A sample image from the satellite dataset was used to demonstrate these results. As shown in
Figure 6a, rotation moves some pixels outside the image frame, leaving blank areas with no pixel details, while zoom augmentation in
Figure 6b makes the objects in the image appear larger. As shown in
Figure 6c, shearing has been used to shift one part of the images, resulting in a parallelogram shape.
Figure 6d,e shows that all images’ pixels have been moved in one direction, either horizontally or vertically, while maintaining the original images’ dimensions. Furthermore, as shown in
Figure 6f, images have been randomly darkened or brightened to further augment the dataset. Varying the brightness helps the CNN models generalize to images acquired under different lighting conditions. Data augmentation retains the features that are essential for predictions. As shown in
Figure 6g,h, the pixels are completely rearranged when flipping the images horizontally and vertically, but the features are retained.
5.4. Evaluation of the Privacy-Preserving
In order to evaluate the performance of the proposed privacy-preservation scheme, we also conducted additional experiments using pre-trained models. Here, four common pre-trained models (namely VGG16, Xception, ResNet50 and DenseNet121) were considered. Each was modified by fine-tuning the last layers, while the earlier layers were kept frozen. We used the same batch size, initial learning rate, number of epochs and input image resolution for all models, as depicted in
Table 8.
A comparison of the CNN models’ classification accuracy, as obtained when trained on encrypted images, is presented in
Table 9. The results show good classification accuracy of all the pre-trained CNN models. As we note, DenseNet121 achieved the best accuracy among the four models, with an accuracy of 93.36% for training, 90.93% for validation and 90.5% for testing.
Figure 9 illustrates the training accuracy of the five CNN models: both the one we propose and the four pre-trained ones. We note that our proposed model provides better training accuracy compared to all four pre-trained CNN models.
Execution Time
All runtimes reported in this section were measured on the Google Colab platform with a CPU running at 2.30 GHz.
Table 10 details the runtime of each CNN model. The training runtime for VGG16, Xception, and ResNet50 is 22.27, 21.26, and 30.8 min, respectively, and the prediction runtime for these models is 0.803, 0.827, and 1.835 s, respectively. The training runtime for DenseNet121 and the proposed model is 18.34 and 16.43 min, and their prediction runtime is 4.134 and 0.976 s, respectively. Accordingly, the computation overhead varies from one model to another. Nevertheless, both training and prediction on PHE-encrypted data are fast, and therefore the computation overhead of the proposed encryption scheme is low.
5.5. Discussion
Recent years have seen increasing concerns about protecting the privacy of confidential information when processing data using models. This leads to the need for cryptographic techniques to solve privacy concerns in data-driven models. Several PPDL techniques have been proposed in the literature to solve these concerns. This research is, to the best of our knowledge, the first work that investigates PPDL for satellite image classification.
In this study, we have proposed a PHE scheme based on the Paillier cryptosystem as a means of preserving data privacy. This PHE scheme enables several operations to be performed directly on the encrypted data (cipher-images) without the need to access the unencrypted data (plain-images). Furthermore, the proposed encryption scheme offers high security, as measured with different security parameters. Because the data are encrypted with a PHE scheme, the images contain no identifiable information and thus reveal nothing sensitive. The capability of the Paillier encryption scheme in DL models was further demonstrated by tackling satellite image classification.
The performance of CNN models is highly dependent on the presence of a large dataset, which often constitutes a limitation in this research area since only comparatively small datasets are available. For example, the satellite dataset used in this research comprises 37,774 images divided into four classes. Therefore, this research has utilized data augmentation techniques to help increase the satellite dataset variety and thus improve the CNN model’s performance. An important observation here is that the PHE technique used in this study has ensured the privacy of data without compromising the classification accuracy of DL models. The results we have achieved show that the classification accuracy of different DL models is relatively close for both encrypted and plain data. Therefore, the proposed encryption scheme ensures good classification of satellite images while preserving the privacy and security of data included within these images.
6. Conclusions and Future Works
DL has become the core technology in many forms of data analysis. Therefore, various security threats and the corresponding defensive PPDL techniques have attracted much attention both in the research community and in areas of global interest such as military operations and business. With the increased interest in processing satellite image data, there also comes a great demand for preserving privacy when using public DL techniques for processing satellite images.
In this study, we proposed a PHE-based technique to protect sensitive information in satellite images when applying public DL models. To the best of our knowledge, this study constitutes the first research work that focuses on PPDL as applied to satellite images. The encryption scheme developed in this research provides both data security and good classification accuracy. To evaluate the encryption scheme’s efficiency, we conducted several experiments on both our custom model and several pre-trained CNN models. The results show a high level of efficiency for both the plain and encrypted data. In general, all models achieved good results in satellite image classification, but ours was the best by a slight margin. Although our proposed PHE encryption scheme is efficient and provides good classification accuracy while preserving sensitive information within satellite images, several possible extensions can be considered in future work. We plan to apply the proposed approach to other datasets and test its performance. In addition, applying other PPDL techniques such as SS, SMPC, and DP, and comparing their efficiency with that of the proposed method, will be considered as a future perspective of this work. Moreover, we plan to explore the possibility of developing a hybrid technique that integrates more than one PPDL technique and then evaluate its performance, particularly in terms of privacy and classification accuracy.