Communication-efficient federated learning

DOI: 10.1073/pnas.2024789118

221 Citations • 2021

Mingzhe Chen, Nir Shlezinger, H. Poor

Proceedings of the National Academy of Sciences

A communication-efficient FL framework is proposed to jointly improve the FL convergence time and the training loss, and a probabilistic device selection scheme is designed such that the devices that can significantly improve the convergence speed and training loss have higher probabilities of being selected for ML model transmission.

Significance

Federated learning (FL) is an emerging paradigm that enables multiple devices to collaborate in training machine learning (ML) models without having to share their possibly private data. FL requires a multitude of devices to frequently exchange their learned model updates, thus introducing significant communication overhead, which imposes a major challenge in FL over realistic networks that are limited in computational and communication resources. In this article, we propose a communication-efficient FL framework that enables edge devices to efficiently train and transmit model parameters, thus significantly improving FL performance and convergence speed. Our proposed FL framework paves the way to collaborative ML in large-scale networking systems such as Internet of Things networks.

Abstract

Federated learning (FL) enables edge devices, such as Internet of Things devices (e.g., sensors), servers, and institutions (e.g., hospitals), to collaboratively train a machine learning (ML) model without sharing their private data. FL requires devices to exchange their ML parameters iteratively, and thus the time it requires to jointly learn a reliable model depends not only on the number of training steps but also on the ML parameter transmission time per step. In practice, FL parameter transmissions are often carried out by a multitude of participating devices over resource-limited communication networks, for example, wireless networks with limited bandwidth and power. Therefore, the repeated FL parameter transmission from edge devices induces a notable delay, which can be larger than the ML model training time by orders of magnitude. Hence, communication delay constitutes a major bottleneck in FL. Here, a communication-efficient FL framework is proposed to jointly improve the FL convergence time and the training loss. In this framework, a probabilistic device selection scheme is designed such that the devices that can significantly improve the convergence speed and training loss have higher probabilities of being selected for ML model transmission. To further reduce the FL convergence time, a quantization method is proposed to reduce the volume of the model parameters exchanged among devices, and an efficient wireless resource allocation scheme is developed. Simulation results show that the proposed FL framework can improve the identification accuracy and convergence time by up to 3.6% and 87%, respectively, compared to standard FL.
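The two communication-saving ideas named in the abstract, probabilistic device selection and parameter quantization, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the selection probabilities or the quantizer. The sketch assumes, purely for illustration, that a device's usefulness is proxied by the norm of its local update, and it swaps in generic uniform stochastic quantization. All function names and the `floor` parameter are hypothetical.

```python
import random


def selection_probabilities(update_norms, floor=0.05):
    """Map per-device update magnitudes to selection probabilities.

    Assumption (not from the paper): usefulness is proxied by the norm of
    the local update. `floor` spreads a small uniform mass over all devices
    so that every device keeps a nonzero chance of being selected.
    """
    n = len(update_norms)
    total = sum(update_norms)
    if total == 0:
        return [1.0 / n] * n
    return [floor / n + (1.0 - floor) * u / total for u in update_norms]


def select_devices(probs, k, rng):
    """Sample up to k distinct device indices, weighted by probs.

    Requires strictly positive weights (guaranteed when floor > 0).
    """
    candidates = list(range(len(probs)))
    weights = list(probs)
    chosen = []
    for _ in range(min(k, len(candidates))):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen


def stochastic_quantize(values, bits, rng):
    """Quantize values onto 2**bits uniform levels spanning [min, max].

    Each value is rounded up or down at random, with the probability of
    rounding up equal to its fractional position between the two nearest
    levels, so the result is unbiased in expectation. Requires bits >= 1.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    quantized = []
    for v in values:
        pos = (v - lo) / step
        lower = int(pos)  # pos is non-negative, so int() acts as floor
        idx = lower + (1 if rng.random() < pos - lower else 0)
        quantized.append(lo + min(idx, levels) * step)
    return quantized


if __name__ == "__main__":
    rng = random.Random(0)
    norms = [1.0, 3.0, 0.0, 4.0]  # hypothetical local update norms
    probs = selection_probabilities(norms)
    print("selection probabilities:", probs)
    print("selected devices:", select_devices(probs, 2, rng))
    print("2-bit quantized update:", stochastic_quantize([0.0, 0.1, 0.5, 1.0], 2, rng))
```

With `bits` bits per parameter instead of a 32-bit float, each transmitted update shrinks by roughly a factor of `32 / bits`, which is the sense in which quantization reduces the volume of exchanged parameters. The wireless resource allocation scheme mentioned in the abstract is not covered by this sketch.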