Date: March 25th, 12:00pm-1:00pm ET
 
Presenters: Jinhui Qin (SHARCNET)
 

Reproducibility and experiment tracking are essential in machine learning workflows. MLflow is an open-source platform for experiment tracking and model management in machine learning and AI development. This webinar introduces MLflow with quickstart examples running on the clusters, focusing on a lightweight setup with local storage. The examples will be demonstrated in Jupyter notebooks and in batch jobs. 

Date: March 11th, 12:00pm-1:00pm ET
 
Presenters: Dr. Rakesh Raghavaraju (CAC)
 

Like disk space, inodes (which cap the number of files on a filesystem) are a limited resource. Each user and group is therefore allocated a fixed number of inodes by default. This webinar presents filesystem quotas on the Alliance clusters and best practices for managing file quotas, including the use of archival storage. Inconsistencies in file ownership that lead to “disk quota exceeded” errors will be discussed. Finally, file formats such as NetCDF and SQL BLOBs for efficiently storing large sets of small files will be presented.
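
As an illustration of the SQL BLOB idea mentioned in this abstract, here is a minimal sketch (not taken from the webinar) that packs many small files into a single SQLite file, so the filesystem holds one file instead of thousands. The function names `pack_directory` and `read_file`, the table name `files`, and its two-column layout are assumptions made for this example only.

```python
import sqlite3
from pathlib import Path


def pack_directory(src_dir: str, db_path: str) -> int:
    """Store every regular file under src_dir as one BLOB row in a SQLite file.

    Assumes db_path lies outside src_dir and that each file fits in memory.
    Returns the number of files stored.
    """
    src = Path(src_dir)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS files "
                "(path TEXT PRIMARY KEY, data BLOB NOT NULL)"
            )
            rows = [
                (p.relative_to(src).as_posix(), p.read_bytes())
                for p in sorted(src.rglob("*"))
                if p.is_file()
            ]
            conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?)", rows)
        return len(rows)
    finally:
        conn.close()


def read_file(db_path: str, rel_path: str) -> bytes:
    """Return the stored bytes for one file, addressed by its relative path."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT data FROM files WHERE path = ?", (rel_path,)
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        raise KeyError(rel_path)
    return row[0]
```

After packing, the original small files can be archived or deleted, and individual items are fetched with `read_file(db_path, "sub/name.ext")`. Whether this layout suits a given workflow depends on access patterns that the abstract does not describe.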

Date: February 25th, 12:00pm-1:00pm ET
 
Presenters: Ed Armstrong (SHARCNET)
 

Learn how to create and manage virtual machines on SHARCNET's cloud infrastructure using OpenStack. This session covers the essentials of working with the OpenStack dashboard to launch VMs, configure security groups, manage storage volumes, and control your cloud resources. Whether you're setting up a web server, running custom software environments, or building virtual clusters, you'll discover how OpenStack gives you complete control over your computing environment.

Date: February 11th, 12:00pm-1:00pm ET
 
Presenters: Robin Haw, Ph.D., Senior Program Manager - Computational Biology and Genome Informatics, OICR and Neha Ratti, Ontario Regional Coordinator, Canadian Bioinformatics Hub
 

The Canadian Bioinformatics Hub (CBH) supports the growth of bioinformatics and computational biology in Canada through training, mentorship, and community-building initiatives. This presentation introduces CBH and the resources it offers to students, early-career researchers, and industry professionals. We will highlight key programs within the training and community pillars and provide an overview of bioinformatics user groups across Canada, with a focus on Ontario. Join us to learn how CBH connects people, builds skills, and strengthens bioinformatics communities across the province and beyond.

Date: January 28th, 12:00pm-1:00pm ET
 
Presenters: Collin Wilson (SHARCNET)
 

In our last talk on Fully Sharded Data Parallel (FSDP), we offered insight into training large models using FSDP and strategies for customizing model training with FSDP for performance benefits.

PyTorch has an updated interface for Fully Sharded Data Parallel called FSDP2. Here we will present how to implement FSDP2 in your training code, compare FSDP2 with FSDP, and examine training performance using FSDP2 on the new systems. Intermediate experience with Python, PyTorch, and deep learning is expected.

Date: January 14th, 12:00pm-1:00pm ET
 
Presenters: Kayhan Momeni (Dept. of Physics, U. of Toronto)
 
This session will describe our early experiences using the Trillium supercomputer to develop a next-generation Massachusetts Institute of Technology general circulation model (MITgcm) global ocean simulation. Our project will culminate in a run of the MITgcm with horizontal grid spacing of 1/96° (~1 km). It will be the highest-resolution realistic ocean model produced to date.
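
As a rough check on the "1/96° (~1 km)" figure, the sketch below converts a nominal angular grid spacing into an east-west distance on a sphere. The mean Earth radius of 6371 km and the helper name `grid_spacing_km` are assumptions of this example. The abstract does not describe the model's actual grid geometry, so this is only back-of-the-envelope arithmetic.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def grid_spacing_km(degrees: float, latitude_deg: float = 0.0) -> float:
    """East-west distance spanned by `degrees` of longitude at a given latitude."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    return degrees * km_per_degree * math.cos(math.radians(latitude_deg))


for label, deg in [("1/48", 1 / 48), ("1/96", 1 / 96)]:
    print(
        f"{label} deg: {grid_spacing_km(deg):.2f} km at the equator, "
        f"{grid_spacing_km(deg, 60.0):.2f} km at 60N"
    )
```

Under these assumptions, 1/96° corresponds to a little over 1 km at the equator and about half that at 60° latitude, since the spacing shrinks with the cosine of latitude.
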
This simulation will contain several advances relative to the widely used 1/48° MITgcm simulation (also known as LLC4320), including increased vertical and horizontal resolution, an updated global bathymetry, the use of a more accurate surface pressure solver, the addition of ice-shelf cavities around Greenland and Antarctica, hourly atmospheric forcing, realistic river discharge, and more accurate astronomical tides. These improvements directly address long-standing issues in earlier high-resolution MITgcm simulations, for example, a misplaced Gulf Stream, a crude representation of Antarctic shelf currents, and anemic tropical instability waves.
The resulting model output will offer an unprecedented benchmark for studies of internal tides and internal waves, turbulence parameterization, and sea-surface height variability. All configurations, tools, and outputs will be openly released, positioning this Canada-led effort as a major global resource for oceanography and climate modelling.

Date: December 17th, 12:00pm-1:00pm ET
 
Presenters: Weiguang Guan (SHARCNET, Alliance)
 
Modern AI models, especially deep neural networks, have achieved remarkable success across vision, language, and decision-making tasks — but their inner workings often remain opaque, earning them the label of “black boxes”. This lack of interpretability raises challenges in trust, accountability, and model debugging. In this talk, we explore Integrated Gradients, a principled method for attributing a model’s prediction to its input features. By integrating gradients of the model’s output with respect to its inputs along a path from a baseline to the actual input, this technique provides a mathematically grounded way to identify which features most influence the outcome. We will discuss the theoretical foundations of integrated gradients, their advantages over simpler attribution methods, and practical examples that illustrate how they reveal meaningful insights about model behavior. 
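
The path integral described in this abstract can be sketched in a few lines of plain Python. This is an illustrative approximation, not the presenter's code. It replaces automatic differentiation with central finite differences and uses a midpoint Riemann sum along the straight line from the baseline to the input. The logistic `toy_model`, its weights, and the all-zeros baseline are hypothetical choices made for the example.

```python
import math
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def numerical_gradient(f: Model, x: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Central-difference estimate of df/dx_i at x (a stand-in for autograd)."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad


def integrated_gradients(
    f: Model, x: Sequence[float], baseline: Sequence[float], steps: int = 200
) -> List[float]:
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # position along the baseline-to-input path
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numerical_gradient(f, point)):
            total[i] += g
    # Scale the averaged gradient by the input's displacement from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical model: a logistic unit that ignores its third input."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))


x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model, x, baseline)
print("attributions:", attributions)
print("sum of attributions:", sum(attributions))
print("f(x) - f(baseline): ", toy_model(x) - toy_model(baseline))
```

The last two printed values should agree closely. This is the completeness property of Integrated Gradients: the attributions sum to the difference between the model output at the input and at the baseline. The third feature, which the toy model ignores, receives zero attribution.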

Date: October 29th, 12:00pm-1:00pm
 
Presenters: Irfaan Cader (Senior Information Risk and Security Professional, The Hospital for Sick Children)
 
As research continues to generate and depend on vast volumes of data, the boundaries between healthcare, computation, and innovation continue to blur. High-performance computing, cloud environments, and federated data platforms are transforming how research is conducted, yet the frameworks that govern cybersecurity, privacy, and data management often struggle to keep pace with this digital acceleration.
 
In this session, Irfaan Cader, Senior Information Risk Analyst at the SickKids Research Institute, introduces how cybersecurity and information risk management principles can strengthen trust, integrity, and collaboration across the research lifecycle. Drawing on experience at the intersection of clinical research, IT compliance, and information risk, he will outline how standards such as those from NIST and policy instruments like Canada’s Tri-Agency Research Data Management Policy influence how institutions think about protecting data, enabling collaboration, and supporting responsible research.
 
Attendees will learn that effective security in research is not only about firewalls and encryption, but also about building a culture of responsible innovation that safeguards data, protects patients, and sustains discovery across Ontario’s research ecosystem.

Date: October 22nd, 12:00pm-1:00pm

Contributors: Sahar Naseer (Privacy Specialist, The Hospital for Sick Children), Roohie Sharma (Legal Privacy Counsel, The Hospital for Sick Children), Melissa Lanuza (Privacy and FOI, The Hospital for Sick Children)

This presentation provides a foundational overview of health privacy principles and legislation relevant to Ontario hospitals. It introduces the ten privacy principles that underpin Canadian privacy laws, with a focus on the Personal Health Information Protection Act (PHIPA), the Personal Information Protection and Electronic Documents Act (PIPEDA), and the Freedom of Information and Protection of Privacy Act (FIPPA). The session covers the rights and responsibilities associated with personal health information (PHI) and personal information (PI), the process for handling freedom of information requests, and the importance of cybersecurity awareness.

Date: October 15th, 12:00pm-1:00pm

Presenters: Kristi Thompson (Research Data Management Librarian, Western University) and Alexandra Cooper (Data Services Coordinator, Queen’s University)

In early 2025, data and documents began to disappear from U.S. government websites in response to a series of executive orders. This led to a scramble as various individuals and groups mobilized to save as much disappearing data as they could. These events served as a wake-up call and led to the founding of the Canadian Public Data Rescue Initiative (CPDRI).

Initially formed, in part, to support rescue efforts in the U.S., the CPDRI also set out to build infrastructure to support public data in Canada. Drawing on projects including the Canadian Government Information Digital Preservation Network and the OCUL Ontario Data Rescue Group, the CPDRI is working to establish a sustainable strategy for preservation of vital Canadian public datasets.