Open Access Article

Averaging Is Probably Not the Optimum Way of Aggregating Parameters in Federated Learning

1 Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
2 School of Electrical and Computer Engineering, University of Oklahoma, Tulsa, OK 73019, USA
3 Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow G1 1XW, UK
4 Faculty of Technical Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
* Author to whom correspondence should be addressed.
Entropy 2020, 22(3), 314; https://doi.org/10.3390/e22030314
Received: 3 March 2020 / Accepted: 9 March 2020 / Published: 11 March 2020
Federated learning is a decentralized approach to deep learning that trains a shared model on data distributed across clients (such as mobile phones and wearable devices), preserving privacy by never exposing raw data to a central server. After each client computes new model parameters by stochastic gradient descent (SGD) on its own local data, these locally computed parameters are aggregated to produce an updated global model. Many current state-of-the-art studies aggregate the client-computed parameters by averaging them, but none explains theoretically why averaging is a good approach. In this paper, we treat each client-computed parameter as a random vector, owing to the stochastic nature of SGD, and estimate the mutual information between two client-computed parameters at different training phases, using two estimation methods in two learning tasks. The results confirm that the parameters of different clients are correlated and show that the mutual information increases with training iterations. However, when we further compute the distance between client-computed parameters, we find that the parameters grow more correlated without growing closer. This suggests that averaging may not be the optimum way of aggregating trained parameters.
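To make the aggregation step concrete, the sketch below illustrates FedAvg-style parameter averaging together with the two quantities the abstract discusses: a mutual information estimate and the Euclidean distance between client-computed parameter vectors. This is a minimal illustration under assumed names (average_parameters, mutual_information, and the synthetic client vectors are all hypothetical), not the authors' implementation; the paper's actual mutual information estimators and training setup differ.

```python
import numpy as np

def average_parameters(client_params):
    """Aggregate locally computed parameter vectors by uniform averaging,
    as in FedAvg-style federated learning."""
    return np.mean(np.stack(client_params), axis=0)

def mutual_information(x, y, bins=16):
    """Crude histogram-based estimate (in nats) of the mutual information
    between two samples; for illustration only, not the paper's estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Hypothetical round: each client would return a parameter vector after
# local SGD; here we fake correlated vectors with a shared random component.
rng = np.random.default_rng(0)
shared = rng.normal(size=1000)
client_params = [shared + 0.5 * rng.normal(size=1000) for _ in range(3)]

global_params = average_parameters(client_params)
mi = mutual_information(client_params[0], client_params[1])
dist = np.linalg.norm(client_params[0] - client_params[1])
print(f"MI estimate: {mi:.3f} nats, distance: {dist:.3f}")
```

With the shared component above, the estimator reports nonzero mutual information even though the two vectors stay a fixed distance apart, mirroring the paper's observation that growing correlation between client parameters need not imply that they are getting closer.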
Keywords: federated learning; decentralized learning; averaging; mutual information; correlation
MDPI and ACS Style

Xiao, P.; Cheng, S.; Stankovic, V.; Vukobratovic, D. Averaging Is Probably Not the Optimum Way of Aggregating Parameters in Federated Learning. Entropy 2020, 22, 314.
