Model- vs. Data-Parallelism for Training of Deep Neural Networks

Internship Description

Training of very large Deep Neural Networks is typically performed on large-scale distributed systems using the so-called data-parallelism approach. However, the scalability of this approach is limited by the convergence properties of the training algorithms. In this project, we will study a less common approach, called model-parallelism, which has the potential to overcome these convergence limitations. We will deploy and experimentally evaluate both approaches in order to understand their trade-offs, and we will then design a hybrid method that attempts to combine the benefits of both.
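To make the distinction concrete, below is a minimal, purely illustrative sketch (assuming PyTorch; a single process simulates the workers, so no actual distribution takes place). Under data-parallelism, every worker holds a full model replica and a shard of the mini-batch, and per-worker gradients are averaged; under model-parallelism, each worker owns only a slice of the model, and activations cross the partition boundary instead.

```python
# Illustrative single-process sketch (PyTorch assumed); "workers" are simulated.
import torch
import torch.nn as nn

torch.manual_seed(0)
loss_fn = nn.MSELoss()
batch, target = torch.randn(8, 32), torch.randn(8, 1)   # one mini-batch

# --- Data-parallelism: each worker holds a FULL replica of the model and a
# shard of the batch; per-worker gradients are averaged (an all-reduce).
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
grads = []
for x, y in zip(batch.chunk(2), target.chunk(2)):        # two simulated workers
    model.zero_grad()
    loss_fn(model(x), y).backward()
    grads.append([p.grad.clone() for p in model.parameters()])
avg_grads = [torch.stack(g).mean(dim=0) for g in zip(*grads)]  # simulated all-reduce

# --- Model-parallelism: the model itself is partitioned; each worker owns only
# its own layers, and activations (not full-model gradients) cross the
# partition boundary during the forward and backward passes.
part1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())      # worker 1's layers
part2 = nn.Linear(64, 1)                                 # worker 2's layers
loss = loss_fn(part2(part1(batch)), target)              # forward crosses the boundary
loss.backward()                                          # backward crosses it in reverse
```

Note the contrast in what is communicated: data-parallelism exchanges gradients for every parameter, while model-parallelism exchanges only the activations at the cut point.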

Deliverables/Expectations

1. Experimental evaluation of model- versus data-parallelism.

2. Design and implementation of a hybrid approach; a purely illustrative sketch of the design space follows below.
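The precise design of the hybrid method is the subject of the internship itself. Purely as an illustration of the design space (same PyTorch, single-process assumptions as above), the sketch below arranges four simulated workers in a 2 x 2 grid, combining a model-parallel axis with a data-parallel axis so that gradient averaging is confined to the smaller replica groups.

```python
# Hypothetical hybrid layout: 4 simulated workers as a 2 x 2 grid -- the model
# is split into 2 partitions (model-parallel axis) and each partitioned model
# is replicated twice (data-parallel axis). Purely illustrative.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
loss_fn = nn.MSELoss()
batch, target = torch.randn(8, 32), torch.randn(8, 1)

base = (nn.Sequential(nn.Linear(32, 64), nn.ReLU()), nn.Linear(64, 1))
replicas = [base, copy.deepcopy(base)]   # identical data-parallel replicas

grads = []
for (p1, p2), x, y in zip(replicas, batch.chunk(2), target.chunk(2)):
    loss_fn(p2(p1(x)), y).backward()     # forward/backward cross the partition
    grads.append([q.grad for part in (p1, p2) for q in part.parameters()])

# Gradients are averaged across the data-parallel replicas only; along the
# model-parallel axis, workers exchange activations instead of gradients,
# which shrinks the all-reduce relative to pure data-parallelism.
avg_grads = [torch.stack(g).mean(dim=0) for g in zip(*grads)]
```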

Faculty Name

Panos Kalnis

Field of Study

Computer Science / Machine Learning