Article

Proactive Congestion Avoidance for Distributed Deep Learning

Department of Computer Science and Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul 02841, Korea
* Authors to whom correspondence should be addressed.
Sensors 2021, 21(1), 174; https://doi.org/10.3390/s21010174
Received: 9 November 2020 / Revised: 18 December 2020 / Accepted: 24 December 2020 / Published: 29 December 2020
(This article belongs to the Special Issue AI-Based Communications)
This paper presents “Proactive Congestion Notification” (PCN), a congestion-avoidance technique for distributed deep learning (DDL). DDL is widely used to scale out and accelerate deep neural network training. In DDL, each worker trains a copy of the deep learning model with different training inputs and synchronizes the model gradients at the end of each iteration. However, it is well known that the network communication for synchronizing model parameters is the main bottleneck in DDL. Our key observation is that the DDL architecture makes each worker generate burst traffic every iteration, which causes network congestion and in turn degrades the throughput of DDL traffic. Based on this observation, the key idea behind PCN is to prevent potential congestion by proactively regulating the switch queue length before DDL burst traffic arrives at the switch, which prepares the switches for handling incoming DDL bursts. In our evaluation, PCN improves the throughput of DDL traffic by 72% on average.
Keywords: distributed deep learning; P4; congestion avoidance; deep learning; network congestion; proactive congestion notification
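
The intuition in the abstract, that draining the switch queue before a periodic burst arrives leaves more room for the burst, can be illustrated with a toy model. The sketch below is not the PCN implementation: it assumes a single FIFO queue, a fixed service rate, constant background traffic, and one burst at the end of every iteration. The function `simulate` and all of its parameters (`lead`, `throttled`, and the numeric values) are hypothetical and chosen only for illustration.

```python
def simulate(ticks, period, burst, capacity, service, background,
             lead=0, throttled=0):
    """Count packets dropped at a single switch queue (toy model).

    A burst of `burst` packets arrives on the last tick of every `period`.
    If `lead` > 0, background arrivals are reduced to `throttled` during the
    `lead` ticks before each burst. This stands in for proactive queue
    regulation and is an assumption, not the mechanism of the paper.
    """
    queue = 0
    dropped = 0
    for t in range(ticks):
        phase = t % period
        burst_tick = phase == period - 1
        in_lead_window = lead > 0 and period - 1 - lead <= phase < period - 1

        arrivals = throttled if in_lead_window else background
        if burst_tick:
            arrivals += burst

        # Enqueue; anything beyond the buffer capacity is dropped.
        queue += arrivals
        if queue > capacity:
            dropped += queue - capacity
            queue = capacity

        # Serve up to `service` packets per tick.
        queue -= min(queue, service)
    return dropped


if __name__ == "__main__":
    # Arbitrary illustrative numbers, not taken from the evaluation.
    params = dict(ticks=30, period=10, burst=120, capacity=100,
                  service=10, background=8)
    print("drops without regulation:", simulate(**params))
    print("drops with regulation:   ", simulate(**params, lead=4, throttled=0))
```

In this toy setting, the queue left over from one burst drains only slowly while background traffic keeps arriving, so the next burst finds less free buffer. Throttling the background traffic shortly before the burst lets the queue drain further, and fewer burst packets are dropped.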
MDPI and ACS Style

Kang, M.; Yang, G.; Yoo, Y.; Yoo, C. Proactive Congestion Avoidance for Distributed Deep Learning. Sensors 2021, 21, 174. https://doi.org/10.3390/s21010174

Chicago/Turabian Style

Kang, Minkoo, Gyeongsik Yang, Yeonho Yoo, and Chuck Yoo. 2021. "Proactive Congestion Avoidance for Distributed Deep Learning" Sensors 21, no. 1: 174. https://doi.org/10.3390/s21010174
