Abstract
Federated representation learning (FRL) is a promising technique for learning shared data representations that capture general features across decentralized clients without sharing raw data. However, the learned representations may still leak sensitive information. Conventional differential privacy (DP) mechanisms protect the entire dataset through randomization (adding noise or applying randomized response), at the cost of degraded learning performance. Motivated by the observation that some attributes of the data may be public or non-private and only specific sensitive attributes (e.g., race) need protection, we investigate information-theoretic protection of specific sensitive information in FRL. To characterize the trade-off between utility and sensitive information leakage, we adopt mutual information-based metrics for both quantities and propose a method that maximizes utility while restricting sensitive information leakage to below any prescribed positive threshold via a local DP mechanism. Simulations demonstrate that our scheme achieves the best utility–leakage trade-off among the compared baseline schemes and, more importantly, can adjust the trade-off between leakage and utility by controlling the noise level of the local DP mechanism.
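
As a rough sketch of the trade-off described above, the design goal can be phrased as a constrained mutual-information problem; the notation below is illustrative and is an assumption rather than the paper's exact formulation.

% Minimal sketch of the utility--leakage trade-off, assuming
% X = raw client data, Y = task label, S = sensitive attribute,
% Z = learned representation, and epsilon > 0 a prescribed leakage budget
% (symbols are assumptions, not the paper's notation).
\begin{equation*}
  \max_{p(z \mid x)} \; I(Z; Y)
  \quad \text{s.t.} \quad I(Z; S) \le \epsilon ,
\end{equation*}
% where the noise level of the local DP mechanism controls how tightly the
% leakage term I(Z; S) is suppressed, at the expense of the achievable
% utility I(Z; Y).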