
Description: Larger Deep Neural Networks (DNNs) are typically more powerful, but training models across multiple GPUs or multiple nodes is not trivial and requires an understanding of both AI and high-performance computing (HPC). In this workshop we will give an overview of techniques for overcoming the memory-footprint challenges of large models, including activation checkpointing, gradient accumulation, and various forms of data and model parallelism, and walk through some examples.
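
As a small taste of the material, below is a minimal sketch of one of the techniques covered, gradient accumulation, in PyTorch. The model, data, and hyperparameters are placeholders chosen for illustration, not workshop code.

```python
# Minimal gradient-accumulation sketch in PyTorch.
# The model, batch sizes, and data here are illustrative placeholders.
import torch
import torch.nn as nn

model = nn.Linear(512, 10).cuda()                      # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

accumulation_steps = 4                                 # micro-batches per optimizer step

optimizer.zero_grad()
for step in range(accumulation_steps):
    x = torch.randn(8, 512, device="cuda")             # dummy micro-batch of inputs
    y = torch.randint(0, 10, (8,), device="cuda")      # dummy labels
    loss = loss_fn(model(x), y) / accumulation_steps   # scale so gradients average over the effective batch
    loss.backward()                                     # gradients accumulate in parameter .grad buffers
optimizer.step()                                        # single update for the whole effective batch
```

The effect is the same gradient as one large batch of 32, while only ever holding activations for a micro-batch of 8 in GPU memory.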
Teacher: Jonathan Dursi (NVIDIA)
Level: Intermediate/Advanced
Format: Lecture + Demo
Certificate: Attendance
Prerequisites:
- Familiarity with training models in PyTorch on a single GPU will be assumed.