Section outline

  • Date: January 28th, 12:00pm-1:00pm ET
     
    Presenter: Collin Wilson (SHARCNET)
     

    In our last talk on Fully Sharded Data Parallel (FSDP), we covered how to train large models with FSDP and strategies for customizing FSDP training to improve performance.

    PyTorch has an updated interface for Fully Sharded Data Parallel called FSDP2. In this talk, we will show how to implement FSDP2 in your training code, compare FSDP2 with FSDP, and examine training performance using FSDP2 on the new systems. Intermediate experience with Python, PyTorch, and deep learning is expected.

    Video link:  Zoom link.