Privacy-Preserving Image Template Sharing Using Contrastive Learning

With the recent development of Machine Learning as a Service (MLaaS), various privacy concerns have been raised. Having access to the user's data, an adversary can design attacks with different objectives, namely, reconstruction or attribute inference attacks. In this paper, we propose two different training frameworks for an image classification task that preserve user data privacy against the two aforementioned attacks. In both frameworks, an encoder is trained with contrastive loss, providing a superior utility-privacy trade-off. In the reconstruction attack scenario, a supervised contrastive loss is employed to provide maximal discrimination for the targeted classification task. The encoded features are further perturbed using the obfuscator module to remove all redundant information. Moreover, the obfuscator module is jointly trained with a classifier to minimize the correlation between the private feature representation and the original data while retaining the model utility for the classification. For the attribute inference attack, we aim to provide a representation of data that is independent of the sensitive attribute. Therefore, the encoder is trained with supervised and private contrastive loss. Furthermore, an obfuscator module is trained in an adversarial manner to preserve the privacy of sensitive attributes while maintaining the classification performance on the target attribute. The reported results on the CelebA dataset validate the effectiveness of the proposed frameworks.


Introduction
Deep learning has been widely applied in many computer vision applications in recent years, with remarkable success. Much progress in deep learning has been made possible thanks to accessible computational power and the widely available datasets needed for training. The need for memory and computational power has incentivized many companies such as Amazon, Google, and IBM to provide their customers with platforms offering Machine Learning as a Service (MLaaS). MLaaS runs in a cloud environment and covers most infrastructure issues such as data pre-processing, model training, and model evaluation. Hence, the users can deploy their machine learning models by simply uploading their data (e.g., images) to the cloud server.
Despite its promises, MLaaS introduces various privacy challenges for both users and the service provider. On the one hand, the service providers are concerned that an adversary could be disguised as a client to steal their model parameters. On the other hand, users are worried that uploading their raw data to the cloud server might reveal sensitive information to unauthorized third parties [1]. Furthermore, in some financial or medical data applications, the user might not be legally allowed to upload and submit raw data to the cloud server. One widely used solution is to share a feature representation of the data instead. However, the adversary can still exploit the privacy leakage in the feature representation and design attacks targeting various objectives.
There are mainly two types of attacks on the privacy of users' data: the attribute inference attack and the reconstruction attack [1,2]. In the reconstruction or model inversion attack, the adversary's goal is to reconstruct the original data given the shared feature representation, whereas in the attribute inference attack, the adversary is interested in identifying certain sensitive attributes of the data such as age, gender, or race.
In this paper, we consider an image classification task in which users send their original data to the cloud service provider. The adversary, a malicious user or the MLaaS provider, wishes to exploit the privacy leakage in the shared feature representation targeting reconstruction or attribute inference attack.
The rest of the paper is organized as follows: Problem formulation and assumptions are introduced in Section 2. Section 3 reviews the related work. Two defense frameworks against the reconstruction attack and attribute inference attack are proposed in Sections 4 and 5, respectively. Finally, Section 6 concludes this work along with suggestions for future work.

Problem Formulation
As shown in Figure 1, given high-dimensional images $x \in \mathbb{R}^n$ in the dataset, users or data owners intend to share a feature representation $h$ for a specific utility task, namely image classification. Let $Y_t$ denote the labels of the target class that the central classifier is trained to predict, and let $Y_p$ denote the labels of the private and sensitive attribute. Concerned about the privacy leakage in the shared representations, the users, as the defenders, apply an obfuscation mechanism to the shared features before releasing them to the public as $h_p$. The defender's ultimate goal is to maintain good classification performance while preserving privacy.
On the other hand, having access to a collection of original images and their corresponding protected representations, the adversary aims to reconstruct the original data or recognize sensitive attributes such as age, gender, etc. Therefore, in this setting, the utility is a classification task and privacy is defined in terms of the attacker's ability to reconstruct the original data or re-identify the sensitive attributes.
Figure 1. Eavesdropping on the private features, the adversary wishes to reconstruct the original data or infer sensitive attributes. The adversary does not have access to the local obfuscation mechanism used by the user, shown in blue dashed lines.

Related Work
Several techniques have been introduced to preserve the users' data privacy, such as image obfuscation, homomorphic encryption, secure multi-party computation, and private feature representation.
Classical image obfuscation: In image obfuscation techniques, the original image is perturbed to hide sensitive information or details and make it visually unidentifiable. Conventional methods include pixelating [3], blurring [3,4], and masking [5]. However, as discussed in [6,7], these protected images can still be identified or reconstructed using deep learning-assisted methods. Recently, more advanced frameworks of deep obfuscation based on deep generative models have been introduced [8][9][10].
Homomorphic encryption: Homomorphic encryption (HE) is another method that allows one to carry out computations on encrypted data without the need for decryption [11]. This means that data can be processed securely even when they have been outsourced to untrusted and public environments. HE can be categorized into three types, namely partially homomorphic (PHE), somewhat homomorphic (SWHE), and fully homomorphic encryption (FHE) [11]. However, the operations in HE are limited to those that can be represented as a polynomial of bounded degree. HE therefore cannot be used with complicated and nonlinear computation functions. Moreover, HE is highly computationally intensive and leads to an extremely slow training process.
Deep and private feature sharing: With the recent advancements of deep models, a new line of work has been introduced to share deep private and obfuscated feature representations of images. Osia et al. [12] considered a client-server setting in which the deep model architecture is separated into two parts: a feature extractor on the client's side and a classifier on the cloud. The extracted features are then protected against attribute inference attacks by adding noise and Siamese fine-tuning. However, their proposed framework is not feasible during training due to its interactive training procedure and high communication throughput between the clients and servers [13].
Later, Li et al. proposed PrivyNet, a private deep learning training framework [13]. PrivyNet splits a neural network into local and cloud counterparts. The feature representations of private data are extracted using the local model while the cloud neural network is trained on publicly released features for the target classification task. The authors considered a reconstruction attack on the shared features and measured privacy through the reconstruction error. In ref. [14], the authors used an adversarial training scheme between an encoder and a classifier to preserve the privacy of intermediate encoded features from attribute inference attacks.
Along the same line of research, Liu et al. [15] introduced an adversarial privacy network called PAN to learn obfuscated features. The learned obfuscated features are designed to be effective against both reconstruction attacks and attribute inference attacks. Similarly, DeepObfuscator was introduced in ref. [16], in which the authors extended PAN to include perceptual quality.
In the context of privacy of published datasets, Huang et al. [17] proposed a framework based on a min-max game between a privatizer and an adversary. By employing generative adversarial networks (GAN) in their framework, users can directly learn to privatize their dataset without having access to the dataset statistics.

Defense Against a Reconstruction Attack
This section introduces a framework to maintain a good classification accuracy while avoiding the invertibility of shared representations. In other words, the proposed framework is designed to keep only relevant information for the specific classification task. The model consists of three modules: encoder, obfuscator, and classifier. The encoder is trained using supervised contrastive loss to provide maximal discrimination for the classification task. The encoded features are obfuscated by minimizing their statistical correlation to the original input images. Finally, a classifier is jointly trained to maintain the classification performance.

Proposed Architecture
The overall private data-sharing framework, shown in Figure 2, consists of three steps:
1. An encoder $f_\phi$ is pre-trained on the public data using supervised contrastive loss. The encoder is later used to extract a discriminative representation for the targeted classification task;
2. An obfuscator $f_\psi$ is learned to remove irrelevant information from the representation $h$ by minimizing its correlation with the original data $x$;
3. A classifier $g_\theta$ is jointly trained with the obfuscator to ensure that the useful information for the intended classification task is preserved in the obfuscated representation.

Encoder
As shown in Figure 3, the encoder $f_\phi$ is initially trained with a contrastive loss to output a well-discriminated feature representation. To this end, we used a ResNet backbone with contrastive loss similar to the SimCLR approach [18].
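The basic idea behind contrastive learning is to pull similar instances, denoted as positive pairs, together and push dissimilar ones, the negative samples, apart. Given a random augmentation transform $T_t(\cdot)$, two different views $x_i$, $x_j$ of the same image $x$ are considered a positive pair, and the rest of the batch samples are treated as negatives. A projection head $g_\theta(\cdot)$ maps the feature representations of the base encoder to the latent embedding $z$ [18]:
$$z_i = g_\theta(f_\phi(x_i)) \tag{1}$$
Using cosine similarity, the similarity between positive pairs is maximized while the similarity between negative pairs is minimized. In the form used by SimCLR [18], the self-supervised contrastive loss is defined as:
$$\mathcal{L}_{self} = -\sum_{i} \log \frac{\exp(\mathrm{sim}(z_i, z_{j(i)})/\tau)}{\sum_{k \neq i} \exp(\mathrm{sim}(z_i, z_k)/\tau)} \tag{2}$$
where $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity, $z_{j(i)}$ is the embedding of the other view of the image behind $z_i$, and $\tau$ is a temperature parameter. This idea was further extended to include target class information in the loss, so that feature representations from the same class are pulled closer together than those from different classes [19]:
$$\mathcal{L}_{sup} = \sum_{i} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(\mathrm{sim}(z_i, z_p)/\tau)}{\sum_{k \neq i} \exp(\mathrm{sim}(z_i, z_k)/\tau)} \tag{3}$$
where $P(i)$ is the set of all positive samples belonging to the same class as $x_i$.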
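As a rough illustration of the supervised contrastive loss of [19], the following sketch computes it for a small batch of embeddings in plain Python. The function names, the temperature value of 0.5, and the choice to average over anchors that have at least one positive are assumptions made for this example; they are not taken from the training code used in this work.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supcon_loss(embeddings, labels, temperature=0.5):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive.

    The temperature and the averaging convention are illustrative assumptions.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples sharing the anchor's class label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Denominator runs over every other sample in the batch.
        denom = sum(
            math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
            for k in range(n) if k != i
        )
        log_probs = [
            cosine(embeddings[i], embeddings[p]) / temperature - math.log(denom)
            for p in positives
        ]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Hypothetical 2-D embeddings forming two tight clusters.
z = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(z, [0, 0, 1, 1])   # labels agree with the clusters
shuffled = supcon_loss(z, [0, 1, 0, 1])  # labels cut across the clusters
```

When the labels agree with the cluster structure, the loss is lower than when they cut across it, which is the behavior the encoder is trained to exploit.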

Obfuscator
The obfuscator $f_\psi$ is trained to prevent the inversion of the shared feature representation. From an information-theoretic point of view, $X \to H \to \hat{X}$ forms a Markov chain, where $\hat{X}$ denotes the reconstruction of $X$. To mitigate the reconstruction attack, $I(X, \hat{X})$ should be minimized. A widely used approach is to jointly train an adversary image decoder to achieve reconstruction disparity by minimizing the Structural Similarity Index Measure (SSIM) [20]. This is done through a min-max optimization game between the obfuscator and the adversary decoder.
Nevertheless, by the data processing inequality applied to the above Markov chain, the mutual information between the original image $X$ and the feature representation $H$ upper bounds $I(X, \hat{X})$, i.e., $I(X, H) \geq I(X, \hat{X})$. Minimizing $I(X, H)$ therefore also limits $I(X, \hat{X})$.
To minimize $I(X, H)$, one should estimate the mutual information, which is a well-known and challenging problem and would involve a more complicated optimization. To avoid this issue and to accelerate and simplify the training, we adopted two statistical correlation measures between random variables, namely, the Hilbert-Schmidt Independence Criterion (HSIC) [21,22] and Distance Correlation (DistCorr) [23]. Consequently, the obfuscator network $f_\psi$ is trained to minimize the correlation between the original images and the protected representation:
$$\min_{\psi} \mathrm{Corr}(x, h_p), \qquad h_p = f_\psi(h) \tag{4}$$
where $\mathrm{Corr}(\cdot)$ can be based on either distance correlation (DistCorr) or the Hilbert-Schmidt Independence Criterion (HSIC). The idea of minimizing the statistical dependencies of features has been around in the literature of federated or distributed learning and physics [24][25][26].
Hilbert-Schmidt Independence Criterion (HSIC): Let $\mathcal{F}$ be a reproducing kernel Hilbert space (RKHS), with the continuous feature mapping $\phi(x)$ and kernel function $k(x, x') = \langle \phi(x), \phi(x') \rangle$. Similarly, let $\mathcal{G}$ be an RKHS, with the continuous feature mapping $\psi(h)$ and kernel function $k(h, h') = \langle \psi(h), \psi(h') \rangle$.
The cross-covariance operator $C_{xh}: \mathcal{G} \to \mathcal{F}$ can be defined as [21,22]:
$$C_{xh} = \mathbb{E}_{xh}\left[(\phi(x) - \mu_x) \otimes (\psi(h) - \mu_h)\right] \tag{5}$$
where $\otimes$ is the tensor product, $\mu_x = \mathbb{E}[\phi(x)]$, and $\mu_h = \mathbb{E}[\psi(h)]$. The largest singular value of the cross-covariance operator $C_{xh}$ is zero if and only if $x$ and $h$ are independent. The Hilbert-Schmidt Independence Criterion is defined as the squared Hilbert-Schmidt norm of the associated cross-covariance operator $C_{xh}$:
$$\mathrm{HSIC}(x, h) = \|C_{xh}\|_{HS}^{2} \tag{6}$$
Distance Correlation (DistCorr): Let $X$ and $H$ be two random vectors with finite second moments. Assume that $(X, H)$, $(X', H')$, $(X'', H'')$ are independent and identically distributed. Then, the distance covariance can be defined as:
$$\mathrm{dCov}^2(X, H) = \mathbb{E}\left[|X - X'|\,|H - H'|\right] + \mathbb{E}\left[|X - X'|\right]\mathbb{E}\left[|H - H'|\right] - 2\,\mathbb{E}\left[|X - X'|\,|H - H''|\right] \tag{7}$$
where $|\cdot|$ is the pairwise distance. Subsequently, the distance correlation is defined as:
$$\mathrm{DistCorr}(X, H) = \frac{\mathrm{dCov}(X, H)}{\sqrt{\mathrm{dCov}(X, X)\,\mathrm{dCov}(H, H)}} \tag{8}$$

Classifier
The classifier $g_\theta$ is a lightweight neural network with two fully connected layers and ReLU activation functions. The classifier is jointly trained with the obfuscator to maintain the classification accuracy for the utility task:
$$\min_{\psi, \theta} \; \mathcal{L}_{CE}(y_t, \hat{y}_t) + \gamma\, \mathcal{L}_{Corr}(x, h_p) \tag{9}$$
where $\gamma$ is the utility-privacy trade-off parameter, $\mathcal{L}_{CE}$ denotes the cross-entropy between the utility attribute $y_t$ and its estimate $\hat{y}_t$, and $\mathcal{L}_{Corr}$ denotes either DistCorr or HSIC according to Equations (6) and (8).
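To make the two correlation measures more concrete, the following sketch estimates distance correlation and HSIC from a small batch of paired samples. It is written in plain Python for readability and assumes a Gaussian kernel with bandwidth `sigma` and the biased HSIC estimator normalized by $n^2$; the function names and these choices are illustrative assumptions rather than the implementation used in the experiments.

```python
import math


def pairwise_dist(rows):
    """Matrix of Euclidean distances between all pairs of rows."""
    return [[math.dist(a, b) for b in rows] for a in rows]


def double_center(m):
    """Subtract row and column means and add back the grand mean."""
    n = len(m)
    row = [sum(r) / n for r in m]
    col = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[m[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def mean_product(a, b):
    """Mean of the element-wise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, h):
    """Sample distance correlation between paired samples x and h."""
    a = double_center(pairwise_dist(x))
    b = double_center(pairwise_dist(h))
    dcov2 = mean_product(a, b)
    dvar2_x, dvar2_h = mean_product(a, a), mean_product(b, b)
    if dvar2_x * dvar2_h == 0:
        return 0.0  # a constant variable carries no dependence
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2_x * dvar2_h))


def gaussian_gram(rows, sigma=1.0):
    """Gram matrix of a Gaussian kernel (bandwidth is an assumption)."""
    return [[math.exp(-math.dist(a, b) ** 2 / (2 * sigma ** 2)) for b in rows]
            for a in rows]


def hsic(x, h, sigma=1.0):
    """Biased empirical HSIC, i.e. trace(K H L H) / n^2."""
    k = double_center(gaussian_gram(x, sigma))
    l = double_center(gaussian_gram(h, sigma))
    return mean_product(k, l)


# Hypothetical 1-D data: h_dep is a linear function of x, h_const is constant.
x = [[0.0], [1.0], [2.0], [3.0], [4.0]]
h_dep = [[2.0 * v[0] + 1.0] for v in x]
h_const = [[5.0] for _ in x]
```

Both measures are largest when the representation is a deterministic function of the input and vanish for a constant representation, which is why either can serve as the $\mathcal{L}_{Corr}$ term that the obfuscator minimizes.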

Experimental Setup
Dataset: We conducted experiments on a celebrity face image dataset, CelebA [27], which consists of over 200,000 celebrity images, where each image is annotated with 40 attributes. Every input image is center-cropped to 178 × 178 and then resized to 128 × 128. We select the "gender" attribute for our intended classification task.
Attacker setup: The adversary has a set of publicly available protected representations $h_p$ with the corresponding original images $x$ and aims to train a decoder to reconstruct the original image for the model inversion attack.

Visualizations of Encoded Features
This section investigates the effect of using supervised contrastive loss on the encoded features. To do so, we visualized 2D t-SNE embeddings of the extracted features for the target class label "gender," as depicted in Figure 4. As expected, the output features of the encoder trained with the supervised contrastive loss are more discriminative than those of the encoder trained in the unsupervised way.

Classification Performance
In this section, the utility-privacy trade-off is investigated in the form of classification accuracy vs. decorrelation. More specifically, we are interested in analyzing the extent to which classification accuracy decreases if we decorrelate the features from the original images. As shown in Table 1, with only a 0.2% loss in accuracy, the correlation between the input images and the features drops for both similarity measures. In the case of HSIC, the reduction in correlation is particularly pronounced. The small loss in accuracy is mainly due to the supervised contrastive loss used in training the encoder, as we obtain discriminative features with respect to the target class. In Section 4.2.4, we demonstrate that an attacker can still reconstruct completely recognizable images using these discriminative features. Consequently, the obfuscator aims at removing all the redundant information about the images and keeping only the information related to the intended classification task.

Reconstruction Attack
According to Figure 5, the adversary model for the reconstruction attack consists of a generator $G_{\theta_x}$ and a discriminator $D_{\theta_{xx}}$, written $G$ and $D$ below for brevity. The generator network maps the protected and obfuscated feature representation $h_p$ to the image space, while the discriminator evaluates the generated images. The discriminator network assigns a probability that the image is from the real data distribution rather than the generator distribution. Thus, the discriminator is trained to classify images as being from the training data or reconstructed by the generator, which in the standard GAN form reads:
$$\max_{D} \; \mathbb{E}_{x}\left[\log D(x)\right] + \mathbb{E}_{h_p}\left[\log\left(1 - D(G(h_p))\right)\right] \tag{10}$$
Therefore, the discriminator and generator are trained in a min-max optimization problem:
$$\min_{G} \max_{D} \; \mathbb{E}_{x}\left[\log D(x)\right] + \mathbb{E}_{h_p}\left[\log\left(1 - D(G(h_p))\right)\right] \tag{11}$$
To improve the performance of the generator, a perceptual loss similar to SRGAN [28] was also employed. The perceptual loss for the generator network consists of an adversarial loss and a content loss, where the content loss is computed on $vgg_{19}(\cdot)$, the output of a pre-trained 19-layer VGG network [29].
We conducted experiments on the reconstruction attack for different correlation losses and different values of $\gamma$ in Equation (9). The performance of the attack model is evaluated using multi-scale structural similarity (MSSIM) [30] and SSIM [20]. To better evaluate the effectiveness of the proposed obfuscation model, the reconstruction quality in three other scenarios has also been considered. The average SSIM and MSSIM for reconstructed images from the protected features and these three other scenarios are reported in Table 2. As the SSIM and MSSIM scores were very close for both correlation measures and different values of $\gamma$, we only report the scores for DistCorr and $\gamma = 2$ in Table 2. The results show that both similarity measures drop by a large margin with only a 0.2% loss in accuracy, thereby validating the effectiveness of the obfuscator. Moreover, the reconstructed images are visualized in Figure 6. The images reconstructed from the raw features are completely recognizable, but not very sharp. This is mainly because the encoder is trained with the supervised contrastive loss, where mostly the information about the target class is left in the last layer. On the other hand, the output images become completely unrecognizable with our proposed obfuscator, and even a powerful decoder can only output an average image. To further investigate the effect of the correlation measure and $\gamma$ in Equation (9), the output images for different cases are presented in Figure 7. Even though the attacker outputs an average image for both correlation measures, it is interesting to note that features learned with HSIC produce different average images for males and females. In other words, the gender information is clearly preserved in the protected representation.

Defense Against an Attribute Inference Attack
Herein, our primary focus is to design a framework for defense against attribute inference attacks. The defender attempts to share a representation with relevant information about the target class label, but keeps the sensitive attribute private.
The model consists of four modules: encoder, obfuscator, target classifier, and adversary classifier. The encoder is trained using supervised and private contrastive loss to provide maximal discrimination for the classification task while protecting the private attribute. Furthermore, the encoded features are obfuscated, and the target classifier is jointly trained to maintain the classification performance. Finally, adversarial training is applied between the target classifier and the adversary classifier.

Proposed Architecture
The overall private data-sharing framework, shown in Figure 8, consists of four steps:
1. An encoder $f_\phi$ is pre-trained on the public data using supervised and private contrastive loss. The encoder is later used to extract a discriminative representation for the targeted classification task;
2. An obfuscator $f_\psi$ is learned to remove relevant information in the representation $h$ about the private attribute;
3. A target classifier $g_{\theta_t}$ is jointly trained with the obfuscator to ensure that the useful information for the intended classification task is preserved in the obfuscated representation;
4. An adversary classifier $g_{\theta_a}$ is adversarially trained to minimize the classification error for the private attribute.

Encoder
As displayed in Figure 3, the encoder $f_\phi$ is initially trained with supervised and private contrastive loss to output a well-discriminated feature representation and protect the private attributes. As mentioned in the previous section, the key idea behind contrastive loss is to push negative pairs apart and pull positive ones close. In a supervised contrastive loss, the positive pairs are those with the same target labels. Maximal discrimination can thus be achieved with respect to the target class.
This concept can be further extended to preserve the privacy of private attributes by allowing minimal discrimination regarding the sensitive label. In other words, for a supervised and private contrastive loss, we will assume:

• Positive pairs: Those with the same target label as the anchor image;
• Negative pairs: Those with a different target label and the same private label as the anchor image.
Therefore, for an augmented dataset $D = \{(x_{1,i}, x_{1,j}, y_{1,t}, y_{1,p}), \ldots, (x_{N,i}, x_{N,j}, y_{N,t}, y_{N,p})\}$, we can define the positive and negative sets for each sample $x_k$ as:
$$P(k) = \{x_l : y_{l,t} = y_{k,t}\}, \qquad N(k) = \{x_l : y_{l,t} \neq y_{k,t},\; y_{l,p} = y_{k,p}\} \tag{14}$$
The supervised and private contrastive loss based on SupCon [19] is then defined over $P(i)$ and $N(i)$, the positive and negative sets with respect to sample $x_i$. Similar to SupCon [19], Dai et al. introduced a supervised contrastive loss based on Momentum Contrast (MoCo) [31], denoted as UniCon [32], which is expressed in terms of the similarity score $s$ and the subsets $\{k^-\}$, $\{k^+\}$ of negative and positive pairs. Likewise, we can extend the UniCon loss to take private and sensitive attributes into account.
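The construction of the positive and negative sets described above can be sketched as follows. The function name and the index-based representation are assumptions made for illustration, and excluding the anchor itself from its positive set is a convention chosen for this example rather than something stated in the text.

```python
def positive_negative_sets(target_labels, private_labels):
    """Build per-anchor index sets for a supervised and private contrastive loss.

    Positives share the anchor's target label; negatives have a different
    target label but the same private label as the anchor.
    """
    n = len(target_labels)
    positives, negatives = [], []
    for k in range(n):
        # Assumption: the anchor is not counted as its own positive.
        positives.append([
            l for l in range(n)
            if l != k and target_labels[l] == target_labels[k]
        ])
        negatives.append([
            l for l in range(n)
            if target_labels[l] != target_labels[k]
            and private_labels[l] == private_labels[k]
        ])
    return positives, negatives


# Hypothetical batch of four samples with binary target and private labels.
target = [0, 0, 1, 1]
private = [0, 1, 0, 1]
pos, neg = positive_negative_sets(target, private)
```

For anchor 0 in this toy batch, sample 1 is the only positive and sample 2 the only negative; sample 3 differs in both labels and is therefore left out of the contrast. Because negatives always share the anchor's private label, separating the anchor from them never rewards the encoder for encoding the private attribute.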

Obfuscator
The obfuscator $f_\psi$ is trained to hide sensitive and private attributes from the shared representation while keeping the relevant information regarding the target class label.

Target Classifier
The classifier $g_{\theta_t}$ is a lightweight neural network with three fully connected layers and ReLU activation functions. The classifier is jointly trained with the obfuscator to maintain the classification accuracy for the target class label:
$$\min_{\psi, \theta_t} \; \mathcal{L}_{CE}(y_t, \hat{y}_t) \tag{18}$$
where $\mathcal{L}_{CE}$ indicates the cross-entropy between the target attribute $y_t$ and its estimate $\hat{y}_t$.

Adversary Classifier
The adversary classifier $g_{\theta_a}$ plays the role of an attacker attempting to infer private attributes using the eavesdropped features. We simulate a game between the adversary and the defender through an adversarial training procedure. The attacker tries to minimize the classification error for the private attributes as:
$$\min_{\theta_a} \; \mathcal{L}_{CE}(y_p, \hat{y}_p) \tag{19}$$
Meanwhile, the defender aims to degrade the performance of the adversary classifier and minimize the private attribute leakage while maintaining good performance on the target classification task. Hence:
$$\min_{\psi, \theta_t} \; \mathcal{L}_{CE}(y_t, \hat{y}_t) - \gamma\, \mathcal{L}_{CE}(y_p, \hat{y}_p) \tag{20}$$
where $\gamma$ is the utility-privacy trade-off parameter. Algorithm 1 delineates the overall steps in our proposed adversarial training procedure.
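The opposing objectives of the attacker and the defender can be illustrated with a minimal sketch. It assumes that the defender's objective is the target cross-entropy minus $\gamma$ times the adversary's cross-entropy, which is a common way to write such a game; the function names and the toy probabilities are hypothetical and do not come from the experiments.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def adversary_loss(private_probs, private_labels):
    """The attacker minimizes its error on the private attribute."""
    return cross_entropy(private_probs, private_labels)


def defender_loss(target_probs, target_labels,
                  private_probs, private_labels, gamma=1.0):
    """Assumed defender objective: keep utility, penalize adversary success."""
    return (cross_entropy(target_probs, target_labels)
            - gamma * cross_entropy(private_probs, private_labels))


# Hypothetical predicted class probabilities for two samples.
target_probs = [[0.9, 0.1], [0.2, 0.8]]
target_labels = [0, 1]
private_labels = [0, 1]
confident_adv = [[0.9, 0.1], [0.1, 0.9]]  # adversary recovers the attribute
confused_adv = [[0.5, 0.5], [0.5, 0.5]]   # adversary is reduced to guessing

loss_leaky = defender_loss(target_probs, target_labels,
                           confident_adv, private_labels)
loss_private = defender_loss(target_probs, target_labels,
                             confused_adv, private_labels)
```

Under this assumed form, the defender's loss is lower when the adversary is reduced to uniform guessing, and a larger `gamma` shifts the balance further toward privacy.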

Experimental Results
This section analyzes the effectiveness of the proposed framework. For the rest of this section, we refer to utility as the classification accuracy on the target class label. Similarly, privacy is defined as the classification performance on the private and sensitive attribute.

Experimental Setup
Dataset: We conducted experiments on a celebrity face image dataset, CelebA [27], which consists of over 200,000 celebrity images, where each image is annotated with 40 attributes. Every input image is center-cropped to 178 × 178 and then resized to 128 × 128. We select the "gender" attribute for our intended classification task and "age," with the two classes young and old, as the sensitive attribute.
Attacker setup: The adversary has a set of publicly available protected representations $h_p$ with the corresponding original images $x$ and their protected labels $y_p$ and aims to train a classifier to re-identify the protected attribute.
Defender setup: The primary goal of the defender is two-fold. The defender aims to preserve a high classification accuracy, expressed by the "target accuracy," with respect to the utility attribute $y_t$. At the same time, the defender wishes to decrease the classification accuracy on the attacker's side, which is represented by the "private accuracy" with respect to the protected attribute $y_p$. The privacy-utility trade-off is controlled by the value of $\gamma$ in Equation (20). This trade-off is best achieved when two conditions hold: first, the publicly available representation $h_p$ is discriminative with respect to the target attribute; second, an obfuscation mechanism removes the information in $h_p$ that is relevant to the private attribute.

Impact of the Obfuscator
In this section, we investigate the impact of the obfuscator. Therefore, keeping the encoder constant, we design an attribute inference attack to classify the private and sensitive attribute with and without the obfuscator. To analyze the privacy trade-off, we experimented with different values of γ in Equation (20), and the results are reported in Table 3.
As shown in Table 3, the classification accuracy drops significantly when the obfuscation is applied, thus validating the effectiveness of the obfuscator module. The obtained results show that the decline in utility is small, with only a 0.3-0.7% decrease in target accuracy. Moreover, increasing $\gamma$ decreases the private classification accuracy. However, in view of privacy protection, random guessing is the ultimate goal in a binary classification setting, as the adversary can flip their guess for any accuracy lower than the random guessing threshold. To account for this, the flipped accuracies are also reported in the last row of Table 3. For the CelebA dataset, the class label "age" is imbalanced and distributed as 75-25%; thereby, the corresponding random guessing threshold is 62.5% (0.75 × 0.75 + 0.25 × 0.25 = 0.625). Therefore, from a privacy protection point of view, the best result is obtained for $\gamma = 1$ with the UniCon loss.
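The random guessing threshold and the flipping rule can be reproduced with a few lines of Python. The function names are hypothetical, and the accuracy values passed in the usage lines are made-up inputs chosen only to exercise both branches of the rule; they are not results from Table 3.

```python
def random_guess_threshold(class_priors):
    """Expected accuracy when guesses are drawn from the class prior."""
    return sum(p * p for p in class_priors)


def reported_private_accuracy(accuracy, class_priors):
    """Flip a binary accuracy that falls below the random guessing threshold.

    This follows the rule stated in the text: an adversary scoring below the
    threshold is assumed to invert every prediction.
    """
    if accuracy < random_guess_threshold(class_priors):
        return 1.0 - accuracy
    return accuracy


age_priors = [0.75, 0.25]  # class balance of the "age" attribute
threshold = random_guess_threshold(age_priors)

# Hypothetical accuracies, one below and one above the threshold.
flipped = reported_private_accuracy(0.30, age_priors)
kept = reported_private_accuracy(0.80, age_priors)
```

With the 75-25% split, `threshold` evaluates to 0.625, matching the value derived above.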

Privacy-Utility Trade-Off Comparison
To better evaluate the effectiveness of the proposed framework, the privacy-utility trade-off for different scenarios has been investigated. The results in Table 3 validate the effectiveness of the obfuscator module. Putting the obfuscator aside, we are interested in analyzing the impact of using the supervised and private loss compared to the conventional contrastive loss in Equation (2). To evaluate this, we considered several scenarios. The privacy-utility trade-off, in the form of target and private accuracy for the various settings, is reported in Table 4. For a fair comparison, the final accuracies were flipped in cases where they were lower than the random guessing threshold.
Impact of supervised and private contrastive loss: As reported in Table 4, the accuracy on the target class is higher for both SupCon and UniCon than for the unsupervised contrastive loss. This is mainly because no label information was used in the conventional contrastive loss (Equation (2)). In addition, the accuracy on the private attribute is 4% lower in $h_{private-supcon}$ and $h_{private-unicon}$ compared to $h$, showing the benefit of using the supervised and private loss.
Impact of adding noise: Adding noise to raw images or extracted features can be considered as a defense mechanism. Injecting Gaussian noise into the data has been widely used in federated learning [33,34]. Indeed, the results in Table 4 demonstrate that the privacy increases as we add noise to the images or the features. Moreover, raising the variance of the noise leads to more privacy gain. However, the private classification accuracies for noisy data are still far from the results we can achieve using the proposed framework. Besides, by adding noise, we also lose utility as the target accuracy drops.
Comparison to DeepObfuscator [16]: We carefully examined other state-of-the-art papers for a fair comparison. Unfortunately, the differences in the problem formulation make this comparison difficult and unfair in some cases. For example, several works have studied the privacy leakage of a face verification system, which differs from the attribute classification problem formulation. In ref. [35], the authors proposed an adversarial framework for reducing gender information in the final embedding vectors used for the verification system. Hence, even though the privacy task of attribute leakage in the embeddings is the same, the utility is defined differently, thereby making the comparison infeasible.
Moreover, several studies have investigated the same utility-privacy formulation as our proposed framework. However, they differ in their overall setting. For example, Boutet et al. [36] proposed a privacy-preserving framework against attribute inference attacks in a federated learning setting. In their experiments, the main target label is "smiling," while the protected label is the "gender" of users.
Nevertheless, a very similar problem formulation and setting are studied in ref. [16]. Li et al. [16] exploit an adversarial game to maintain the classification performance on the public class label while protecting against an attribute inference attack. As they used different attributes as the target and private attributes, we re-ran their obfuscator model for our public and private attributes. The DeepObfuscator model in [16] is further adapted to consider only the attribute inference attack. The results reported in Table 4 demonstrate the superior performance of the proposed method compared to DeepObfuscator.

Conclusions
This paper addressed the problem of template protection against the most commonly used attacks, namely, reconstruction and attribute inference attacks. Two defense frameworks based on contrastive learning were proposed.
For defense against the reconstruction attack, we directly minimize the correlation and dependencies of the encoded features with the original data, avoiding the unnecessary complications of min-max adversarial training. Furthermore, training an encoder with the supervised contrastive loss maximizes discrimination in the feature space with respect to the target class and removes redundant information about the original images. Hence, there is no substantial loss in classification performance, and the proposed framework provides a better utility-privacy trade-off.
In the attribute inference attack, the adversary wishes to access the private attribute given the shared protected templates. Therefore, in the first defense step, we propose an encoder trained with the supervised and private contrastive loss. Furthermore, an obfuscator module is trained in an adversarial manner to preserve the privacy of private attributes while maintaining good classification performance. The reported results on the CelebA dataset validate the effectiveness of the proposed framework. Future work will aim at designing a framework based on contrastive loss that considers both reconstruction and attribute inference attacks. Another interesting avenue of research is to investigate the performance of the proposed framework on other datasets.