Date: January 28th, 12:00pm-1:00pm ET
 
Presenters: Collin Wilson (SHARCNET)
 

In our last talk on Fully Sharded Data Parallel (FSDP), we offered insight into training large models using FSDP and strategies for customizing model training with FSDP for performance benefits.

PyTorch has an updated interface for Fully Sharded Data Parallel called FSDP2. Here we will present how to implement FSDP2 in your training code, compare FSDP2 with FSDP, and examine training performance using FSDP2 on the new systems. Intermediate experience with Python, PyTorch, and deep learning is expected.

Date: January 14th, 12:00pm-1:00pm ET
 
Presenters: Kayhan Momeni (Dept. of Physics, U. of Toronto)
 
This session will describe our early experiences using the Trillium supercomputer to develop a next-generation Massachusetts Institute of Technology general circulation model (MITgcm) global ocean simulation. Our project will culminate in a run of the MITgcm with horizontal grid spacing of 1/96° (~1 km). It will be the highest-resolution realistic ocean model produced to date.
This simulation will contain several advances relative to the widely used 1/48° MITgcm simulation (also known as LLC4320), including increased vertical and horizontal resolution, an updated global bathymetry, the use of a more accurate surface pressure solver, the addition of ice-shelf cavities around Greenland and Antarctica, hourly atmospheric forcing, realistic river discharge, and more accurate astronomical tides. These improvements directly address long-standing issues in earlier high-resolution MITgcm simulations, for example, a misplaced Gulf Stream, a crude representation of Antarctic shelf currents, and anemic tropical instability waves.
The resulting model output will offer an unprecedented benchmark for studies of internal tides and internal waves, turbulence parameterization, and sea-surface height variability. All configurations, tools, and outputs will be openly released, positioning this Canada-led effort as a major global resource for oceanography and climate modelling.

Date: December 17th, 12:00pm-1:00pm ET
 
Presenters: Weiguang Guan (SHARCNET, Alliance)
 
Modern AI models, especially deep neural networks, have achieved remarkable success across vision, language, and decision-making tasks, but their inner workings often remain opaque, earning them the label of “black boxes”. This lack of interpretability raises challenges in trust, accountability, and model debugging. In this talk, we explore Integrated Gradients, a principled method for attributing a model’s prediction to its input features. By integrating gradients of the model’s output with respect to its inputs along a path from a baseline to the actual input, this technique provides a mathematically grounded way to identify which features most influence the outcome. We will discuss the theoretical foundations of Integrated Gradients, its advantages over simpler attribution methods, and practical examples that illustrate how it reveals meaningful insights about model behavior.
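As a rough illustration of the path integral described in this abstract, the following is a minimal sketch in plain Python, not the presenter's material. It approximates the integral with a midpoint Riemann sum along the straight line from the baseline to the input. The names `integrated_gradients`, `toy_model`, `toy_model_grad`, and the `steps` parameter are hypothetical and chosen for illustration; in practice the gradient would normally come from an automatic differentiation framework rather than a hand-written function.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], Vector],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> Vector:
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn returns the gradient of the model output with respect to
    its inputs, evaluated at the given point (hypothetical interface).
    """
    if len(x) != len(baseline):
        raise ValueError("x and baseline must have the same length")

    total = [0.0] * len(x)
    for k in range(steps):
        # Midpoint of the k-th sub-interval of [0, 1].
        alpha = (k + 0.5) / steps
        # Point on the straight-line path from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        total = [t + g for t, g in zip(total, grad)]

    # Scale the averaged gradient by the input's displacement from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Hypothetical toy model: F(x) = x0^2 + 3*x1; the third feature is ignored.
def toy_model(x: Sequence[float]) -> float:
    return x[0] ** 2 + 3.0 * x[1]


def toy_model_grad(x: Sequence[float]) -> Vector:
    return [2.0 * x[0], 3.0, 0.0]


x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model_grad, x, baseline)

print("attributions:", attributions)
print("sum of attributions:", sum(attributions))
print("F(x) - F(baseline):", toy_model(x) - toy_model(baseline))
```

In this toy case the unused third feature receives an attribution of zero, and the attributions sum (up to floating-point error) to the difference between the model output at the input and at the baseline, which is the completeness property of the method.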

Date: October 29th, 12:00pm-1:00pm
 
Presenters: Irfaan Cader (Senior Information Risk and Security Professional, The Hospital for Sick Children)
 
As research continues to generate and depend on vast volumes of data, the boundaries between healthcare, computation, and innovation continue to blur. High-performance computing, cloud environments, and federated data platforms are transforming how research is conducted, yet the frameworks that govern cybersecurity, privacy, and data management often struggle to keep pace with this digital acceleration.
 
In this session, Irfaan Cader, Senior Information Risk Analyst at the SickKids Research Institute, introduces how cybersecurity and information risk management principles can strengthen trust, integrity, and collaboration across the research lifecycle. Drawing on experience at the intersection of clinical research, IT compliance, and information risk, he will outline how standards such as those from NIST and policy instruments like Canada’s Tri-Agency Research Data Management Policy influence how institutions think about protecting data, enabling collaboration, and supporting responsible research.
 
Attendees will learn that effective security in research is not only about firewalls and encryption, but also about building a culture of responsible innovation that safeguards data, protects patients, and sustains discovery across Ontario’s research ecosystem.

Date: October 22nd, 12:00pm-1:00pm

Contributors: Sahar Naseer (Privacy Specialist, The Hospital for Sick Children), Roohie Sharma (Legal Privacy Counsel, The Hospital for Sick Children), Melissa Lanuza (Privacy and FOI, The Hospital for Sick Children)

This presentation provides a foundational overview of health privacy principles and legislation relevant to Ontario hospitals. It introduces the ten privacy principles that underpin Canadian privacy laws, with a focus on the Personal Health Information Protection Act (PHIPA), the Personal Information Protection and Electronic Documents Act (PIPEDA), and the Freedom of Information and Protection of Privacy Act (FIPPA). The session covers the rights and responsibilities associated with personal health information (PHI) and personal information (PI), the process for handling freedom of information requests, and the importance of cybersecurity awareness.

Date: October 15th, 12:00pm-1:00pm

Presenters: Kristi Thompson (Research Data Management Librarian, Western University) and Alexandra Cooper (Data Services Coordinator, Queen’s University)

In early 2025, data and documents began to disappear from U.S. government websites in response to a series of executive orders. This led to a scramble as various individuals and groups mobilized to save as much disappearing data as they could. These events served as a wake-up call and led to the founding of the Canadian Public Data Rescue Initiative (CPDRI).

Initially formed, in part, to support rescue efforts in the U.S., the CPDRI also set out to build infrastructure to support public data in Canada. Drawing on projects including the Canadian Government Information Digital Preservation Network and the OCUL Ontario Data Rescue Group, the CPDRI is working to establish a sustainable strategy for preservation of vital Canadian public datasets.