Pipelined Stochastic Gradient Descent with Taylor Expansion
Bongwon Jang, Inchul Yoo and Dongsuk Yook
Stochastic gradient descent (SGD) is an optimization method typically used in deep learning to train deep neural network (DNN) models. In recent studies on DNN training, pipeline parallelism, a type of model parallelism, has been proposed to accelerate SGD...
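To make the ideas named in the abstract concrete, the sketch below shows a plain SGD update together with a first-order Taylor correction applied to a stale (delayed) gradient, the kind of staleness a pipelined schedule introduces. This is a minimal illustration under stated assumptions: the quadratic loss, the learning rate, and the approximation of the Hessian by the elementwise square of the stale gradient are common delay-compensation choices, not necessarily the authors' exact formulation.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the mean squared error 0.5 * ||Xw - y||^2 / n."""
    return X.T @ (X @ w - y) / len(y)

def sgd_step(w, grad, lr=0.1):
    """Vanilla SGD update: w <- w - lr * grad."""
    return w - lr * grad

def taylor_corrected_grad(stale_grad, w_now, w_stale, lam=1.0):
    """Approximate the current gradient from a stale one via a
    first-order Taylor expansion:
        g(w_now) ~= g(w_stale) + H(w_stale) @ (w_now - w_stale),
    where H is approximated elementwise by the squared stale gradient
    (an illustrative assumption, not necessarily the paper's choice)."""
    return stale_grad + lam * stale_grad * stale_grad * (w_now - w_stale)

# Toy linear-regression problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true

w = np.zeros(4)
w_stale = w.copy()
stale_grad = loss_grad(w_stale, X, y)
for step in range(200):
    # The gradient in hand is one step stale; correct it toward
    # the current weights before applying the SGD update.
    g = taylor_corrected_grad(stale_grad, w, w_stale)
    w_stale, stale_grad = w.copy(), loss_grad(w, X, y)
    w = sgd_step(w, g)

print("final error:", np.linalg.norm(w - w_true))
```

With the correction term, the update applied to the current weights stays close to the true gradient even though each gradient is computed one step late, which is the basic mechanism that lets a pipelined schedule overlap gradient computation across stages without simply ignoring the delay.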