Level up your computational research!

Discover powerful digital research techniques at the 2026 Compute Ontario Summer School


NOTE: Registration does NOT start until May 11, 2026 at 1 PM EDT.


Event Description

Taking place from June 1 to June 19, the Compute Ontario Summer School offers a comprehensive curriculum packed with over 40 courses. Delivered by experts in the field, these sessions cover a wide range of topics including Advanced Research Computing (ARC), High Performance Computing (HPC), Research Data Management (RDM), and Research Software (RS). With presentations and workshops available at introductory to intermediate levels, there is something for everyone.

Presented by ACENET, CAC, SciNet, and SHARCNET, in partnership with Bioinformatics.ca, Digital Research Alliance of Canada, HPC4Health, Ontario Brain Institute, OICR, RDM Network of Experts, and Scholars Portal.

Enrolment

In order to enrol in COSS 2026 courses:

  1. If you don't already have a Compute Ontario Training account, create one.
  2. Log in with your Compute Ontario Training account (this link opens in a new window).
  3. Enrol in a course below by clicking its Link to Course and then clicking the Enrol button.

Participants need to enrol in each course they wish to attend. Be aware that some courses overlap, so check the schedule below carefully.

We also have a frequently asked questions (FAQ) page for this event.

You can find more details about our summer school in this video: What is Compute Ontario Summer School and how do I attend?

Registration Form

Once you have enrolled in one or more courses, you can fill out the Registration Form. Please be aware that filling out the registration form is a requirement for obtaining attendance certificates for our courses.

Week 1
When (EDT) Course
:: Mon., June 1 ::
9:00 AM to 12:00 PM EDT

Bioinformatics: Analysis of RNA-sequencing Data

:: Link to Course :: n/a ::

Description: RNA-Seq refers to high-throughput sequencing methods that probe the entire transcriptomic landscape of a given tissue or sample of interest. The data acquired from such experiments can be used to explore the overall RNA profile of a sample as well as to compare samples under various conditions. While extremely powerful, RNA-Seq is susceptible to numerous experimental pitfalls and requires intimate knowledge of the experimental procedures and data analysis methods. When conducted properly, RNA-Seq can reveal information about gene/transcript expression, splicing, and the effects of mutations. In this session we will take a thorough look at a comprehensive RNA-Seq pipeline, from sample processing methods to final differential expression analysis. Relevant R / BioConductor packages will be introduced. We will have the opportunity to investigate numerous quality control metrics and to perform genomic alignment, differential expression, and pathway enrichment analysis. We will cover several "gotchas" and common mistakes in experimental design and data analysis. Basic familiarity with R and the Linux command line will be beneficial but not required. All necessary commands and parameters will be explained during the class. Participants will be offered hands-on practice in which they will use RStudio to run R/BioConductor scripts for data analysis as well as the Integrative Genomics Viewer (IGV) software to visualize genomic data on their laptops.

Teachers: Alper Celik (HPC4Health, Centre for Computational Medicine SickKids) and Lauren Liang (HPC4Health, Centre for Computational Medicine SickKids)

Level: Intermediate

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites: Basic R and Linux beneficial but not required

:: Mon., June 1 ::
9:00 AM to 10:25 AM EDT

Overview of training opportunities in the School and beyond

:: Link to Course :: n/a ::

Description: Are you not sure which workshops to sign up for in this Summer School? In this session, we will give an overview of the program of the Compute Ontario Summer School to help you decide. We'll also show you what other training opportunities in Advanced Research Computing and Research Data Management are available to you in Canada after the summer school.

Teacher: Ramses van Zon (SciNet, University of Toronto)

Level: Introductory

Format: Webinar

Certificate: Attendance

Prerequisites: None

:: Mon., June 1 ::
10:35 AM to 12:00 PM EDT

Interactive Computing with Open OnDemand

:: Link to Course :: n/a ::

Description: Accessing High Performance Computing (HPC) resources via terminal-based interfaces can be quite challenging for new users with limited experience, resulting in a steep learning curve. Open OnDemand aims to make HPC more accessible by offering an intuitive graphical interface that simplifies the process of submitting, monitoring, and managing jobs. In this course we will explore the key features of Trillium & Nibi's Open OnDemand portal, including web-based access, job management, file management and support for interactive applications like Jupyter Notebooks, RStudio, and VS Code.

Teachers: James Willis (SciNet, University of Toronto), Tyson Whitehead (SHARCNET, University of Western Ontario)

Level: Introductory

Format: Lecture+Hands-on

Certificate: Attendance

Prerequisites: None

:: Mon., June 1 ::
1:30 PM to 4:30 PM EDT

AI Showcase

:: Link to Course :: n/a ::

Description: This course introduces Artificial Intelligence (AI), a science focusing on developing intelligent systems capable of autonomous behavior. In this course, we explore the exciting world of AI, introducing its definition and history. We discuss the advantages and challenges of AI at present, along with various applications and projects that demonstrate its capabilities. Throughout the session, participants will gain insights into different types of AI, learn about running predefined projects, and discover AI showcases on various platforms. By the end of the course, participants will have the knowledge and resources to start their own AI projects with their data and explore the latest AI advancements.

Teacher: Nastaran Shahparian (SHARCNET, York University)

Level: Introductory

Format: Lecture

Certificate: Attendance and Completion

Prerequisite: Basic Python knowledge is beneficial but not required.

:: Mon., June 1 ::
1:30 PM to 4:30 PM EDT

Bioinformatics: Long-read Sequencing Applications

:: Link to Course :: n/a ::

Description: Long-read sequencing technologies enable the sequencing of DNA fragments 10 kb and longer. This read length greatly improves sequence mappability and assembly, providing an advantage over short-read sequences that are difficult to map uniquely to repetitive and GC-rich regions. Long-read sequencing has applications in a number of fields, including genome assembly, diagnosis of genetic diseases, and metagenomics. In this workshop, we will focus on PacBio HiFi sequences and introduce you to tools for haplotyping, calling and visualizing structural variants and repeat expansions, visualizing read methylation, and detecting novel isoforms from PacBio Iso-Seq.

Teachers: Madeline Couse (HPC4Health, The Centre for Computational Medicine at SickKids) and Lauren Liang (HPC4Health, Hospital for Sick Children)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisite: Basic knowledge about DNA/RNA sequencing.

:: Tue., June 2 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Introduction to Single cell RNA sequencing and analysis

:: Link to Course :: n/a ::

Description: TBA

Teachers: Niu Huilin (Bioinformatics.ca, Western University)

Level: TBA

Format: TBA

Certificate: TBA

Prerequisites:

  • TBA
:: Tue., June 2 ::
9:00 AM to 12:00 PM EDT

Introduction to Advanced Research Computing

:: Link to Course :: n/a ::

Description: This workshop is a primer for those largely new to supercomputing, i.e., to computing on shared, remote resources. It is intended to demystify the somewhat intimidating term "High-Performance Computing" (HPC), and to serve as a foundation upon which to build over the coming days. Topics will include motivation for HPC, available resources, essential issues, and a high-level overview of parallel programming models commonly used on these systems.

Teacher: Ramses van Zon (SciNet, University of Toronto)

Level: Introductory

Format: Lecture + Hands-on

Certificates: Attendance and Completion

Prerequisite: Basic Linux

:: Tue., June 2 ::
1:30 PM to 4:30 PM EDT

Profiling AI Software with PyTorch Profiler + Nsight Systems

:: Link to Course :: n/a ::

Description: This lab discusses profiling with NVIDIA Nsight Systems to understand and optimize the performance of deep learning training. We will use a very simple PyTorch example based on the handwritten digits of the Modified National Institute of Standards and Technology (MNIST) dataset. The techniques and strategies apply to optimizing any application that uses NVIDIA GPUs.

NOTE: This course will not be recorded and is limited to 30 persons.

Teachers: Jonathan Dursi (NVIDIA)

Level: Intermediate

Format: Lecture + Hands-on

Certificate: Completion and Attendance

Prerequisites:

  • TBA
:: Wed., June 3 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Introduction to Python

:: Link to Course :: n/a ::

Description: This course is designed to provide you with a solid foundation in the Python programming language. Through a comprehensive curriculum and hands-on coding exercises, participants will learn the fundamentals of Python syntax, data types, functions, and file handling. By the end of the course, you will have gained the essential skills to write Python programs, solve problems, and build the foundation for more advanced Python development. Whether you are a beginner or have some programming experience, this course will equip you with the necessary tools to start your journey in Python programming.

Teacher: Fernando Hernandez Leiva (CAC: Queen's University)

Level: Introductory

Format: Workshop

Certificate: Attendance

Prerequisite: An account (free) on https://replit.com/. The course is delivered using a free online tool to let us focus on coding.

:: Wed., June 3 ::
9:00 AM to 10:25 AM EDT

Introduction to R Shiny

:: Link to Course :: n/a ::

Description: R Shiny has become one of the most widely used tools in bioinformatics for building interactive data analysis applications without requiring web development expertise. For researchers, analysts, and students working with high-dimensional biological data, Shiny provides a way to turn static analyses into dynamic, user-friendly dashboards that make exploration simpler, faster, and more intuitive.

This course is intended to help learners bridge the gap between data analysis and data communication. Instead of generating dozens of plots manually, you can build a Shiny app where users adjust parameters, filter data, and visualize results in real time. This is especially valuable in genomics, where datasets are large, multidimensional, and frequently explored by multidisciplinary teams.

Teachers: Prajkta Kallurkar (HPC4Health, SickKids)

Level: Introductory

Format: Lecture + Hands-On

Certificate: Completion

Prerequisites: Basic R knowledge.

:: Wed., June 3 ::
10:35 AM to 12:00 PM EDT

High Performance Rapid Prototyping in Biological Sequence Analysis

:: Link to Course :: n/a ::

Description: Rapid prototyping for biological sequence analysis requires tools that let researchers move quickly from idea to working workflow without sacrificing performance. Today, this is typically done in one of two ways. The first is by chaining specialized command-line tools, either directly or through workflow systems such as Nextflow or Snakemake. The second is by scripting interactively in high-level languages such as Python, using libraries like BioPython or cogent3 in notebooks. Each approach carries significant performance trade-offs. Pipeline-based tools incur costly memory round-trips - data must be repeatedly transferred off and back onto accelerator memory (e.g., GPU) between chained commands. Python libraries, meanwhile, largely lack built-in support for CPU parallelism (e.g., BioPython) or GPU acceleration (e.g., cogent3), limiting scalability for performance-critical workloads. This course introduces GNX, a modern C++20 library and toolkit purpose-built for high-performance rapid prototyping of biological sequence analysis. GNX addresses the limitations of both paradigms through three core design principles: zero-copy operations that enable toolkit commands to chain directly through device memory without redundant data transfers; SIMD and multi-level parallelism for hardware-aware acceleration on modern CPUs; and seamless backend portability across OpenMP, CUDA, and ROCm targets. Crucially, GNX also exposes an interactive scripting interface through the xeus-cling Jupyter kernel, bringing C++-native performance to the familiar notebook environment. By the end of this 90-minute course, participants will understand GNX's architecture, be able to construct accelerated bioinformatics pipelines using its toolkit and interactively prototype analyses in a Jupyter notebook - all without sacrificing performance.

Teachers: Armin Sobhani (SHARCNET)

Level: Intermediate

Format: Lecture

Certificate: Attendance

Prerequisites: None

:: Wed., June 3 ::
1:30 PM to 4:30 PM EDT

Bioinformatics for Pathway Enrichment Analysis

:: Link to Course :: n/a ::

Description: Pathway enrichment analysis is a powerful computational approach used to identify biological pathways that are significantly overrepresented in a given set of differentially expressed genes, or any gene list derived from -omics data. This method helps to contextualize large gene lists by linking them to known biological processes, functional modules, and disease mechanisms. While highly informative, pathway enrichment analysis requires careful interpretation and an understanding of statistical methodologies, reference databases, and potential biases in gene-set analysis. In this session, we will explore key concepts and methods for pathway enrichment analysis, and we will discuss different enrichment approaches, including over-representation analysis of a defined gene list and gene set enrichment analysis (GSEA). Participants will be offered hands-on practice in which they will use RStudio to run R/BioConductor scripts for pathway enrichment analysis as well as the Cytoscape software to visualize the results of enrichment analysis on their personal computers. Basic familiarity with R will be beneficial.
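As a rough illustration of the over-representation analysis mentioned above, the sketch below computes an upper-tail hypergeometric p-value in plain Python. It is not taken from the course materials: the function name `ora_p_value` and all gene counts are hypothetical, and a real analysis would normally also correct for multiple testing across pathways.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, list_size, overlap):
    """Probability of seeing at least `overlap` pathway genes in a random
    gene list of `list_size` drawn from `universe_size` genes
    (upper tail of the hypergeometric distribution)."""
    max_overlap = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes measured, 100 in the pathway,
# 200 in the gene list of interest, 8 shared between the two.
p = ora_p_value(20000, 100, 200, 8)
print(f"p = {p:.3g}")
```

A small p-value suggests the pathway is over-represented in the gene list relative to chance. Gene set enrichment analysis (GSEA) works on a ranked list instead and is not covered by this sketch.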

Teachers: Ruth Isserlin (Bioinformatics.ca, UHN, Toronto) and Veronique Voisin (Bioinformatics.ca, UHN)

Level: Intermediate

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites:

  • Knowing how to open R or RStudio and install packages.
  • Basic knowledge of R (recommended).
  • General knowledge of differential expression of RNA-seq or scRNA-seq data.
:: Thu., June 4 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Fortran as a Second Language

:: Link to Course :: n/a ::

Description: The original high-level programming language, Fortran continues to be used today for high-performance computing in many fields. It has evolved over the years, and modern Fortran provides implicit parallelism (array expressions), explicit parallelism (coarrays), and object-oriented features, among other things. It supports the MPI, OpenMP, and OpenACC parallel programming standards. The primary aim of this course is to help you understand and modify existing Fortran code, but it will also be useful if you wish to start a new project in Fortran. You should have prior experience with some other programming language, but this is otherwise a beginner-level course.

Teachers: Ross Dickson (ACENET, Dalhousie University)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance and Completion

Prerequisite: Prior experience with some other programming language

:: Thu., June 4 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Introduction to Linux shell

:: Link to Course :: n/a ::

Description: This course is an introduction to the Unix Bash shell. It starts with the very basics but works up to reasonably advanced topics like regular expressions and shell scripting. Knowing how to use the shell is an essential part of using the supercomputers and is extremely useful for automating your processing.

Teacher: Tyson Whitehead (SHARCNET, Western University)

Level: Introductory

Format: Lecture + Exercises with Questions

Certificate: Attendance and Completion

Prerequisites: None

:: Fri., June 5 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Multicore Parallel Programming with OpenMP

:: Link to Course :: n/a ::

Description: This is an introductory to intermediate-level hands-on course on OpenMP. OpenMP is a standard parallel programming API that supports multi-platform shared-memory multiprocessing in C, C++, and Fortran.

This one-day course covers the fundamentals of OpenMP compiler directives, library routines, and environment variables through step-by-step hands-on examples. Case studies will explore different approaches to loop parallelism. We will also discuss task constructs for irregular programs and target constructs for accelerators such as GPUs.

Participants will gain practical programming experience with OpenMP, including how to compile and run multithreaded OpenMP code on various Alliance clusters.

Teacher: Jemmy Hu (SHARCNET, University of Waterloo)

Level: Introductory

Format: Lecture + Hands-on

Certificates: Attendance

Prerequisites: Basic knowledge of C, C++, or Fortran.

:: Fri., June 5 ::
9:00 AM to 10:25 AM EDT

Population genomics in complex agricultural crops: comparing SNP panels and GBS workflows

:: Link to Course :: n/a ::

Description: TBA

Teachers: Tayab Soomro (Bioinformatics.ca, Agriculture and Agri-Food Canada)

Level: TBA

Format: TBA

Certificate: TBA

Prerequisites:

  • TBA
:: Fri., June 5 ::
10:35 AM to 12:00 PM EDT

Extracting Information from Health Data using AI

:: Link to Course :: n/a ::

Description: This workshop explores the use of artificial intelligence, especially large language models, for processing healthcare data such as electronic health records. Learners will examine how LLMs can be used to extract structured information from unstructured text, automate documentation, and uncover patterns in complex datasets. The presentation highlights benefits such as improved efficiency and richer insights, while addressing risks including hallucinations, model biases and privacy concerns. Students will study pitfalls unique to healthcare data and gain an understanding of key technical challenges: data standardization, model validation, safety monitoring, and ensuring transparency and privacy in clinically sensitive and resource-constrained environments. We will talk about open source packages and open weight models and how they can be used effectively for clinical data.

Teachers: Alper Celik (HPC4Health, Centre for Computational Medicine SickKids)

Level: Introductory, Intermediate

Format: Lecture + Hands-on (first half lecture, second half hands-on)

Certificate: Attendance

Prerequisites: Basic R and Bash

:: Fri., June 5 ::
1:30 PM to 4:30 PM EDT

Introduction to R

:: Link to Course :: n/a ::

Description: This half-day session offers a brief introduction to R, with a focus on data analysis and statistics. We will discuss the following topics: the R interface, primitive data types, lists, vectors, matrices, and data frames, a crucial data type in data analysis and the trademark of the R language. Advanced topics to be covered include basic statistics, function creation, and the basics of scripting.

Teacher: Alexey Fedoseev (SciNet, University of Toronto)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites: Some programming experience in another programming language

Week 2
When (EDT) Course
:: Mon., June 8 ::
9:00 AM to 12:00 PM EDT

An Introduction to OpenFOAM

:: Link to Course :: n/a ::

Description: TBA

Teachers: Joey Bernard (Digital Research Alliance of Canada)

Level: TBA

Format: TBA

Certificate: TBA

Prerequisites:

  • TBA
:: Mon., June 8 ::
9:00 AM to 10:25 AM EDT

System Security - Defensive Techniques

:: Link to Course :: n/a ::

Description: The System Security course explains operating system security best practices, including topics such as minimal installations, least privilege, access control and system firewalls. We will explore typical settings and software on Windows and Ubuntu workstations that are essential for ensuring a strong security posture. The course will provide an overview of common attack vectors and corresponding defensive techniques to protect organizational information systems.

NOTE: This course will not be recorded and will be closed after summer school.

Teachers: Lucas Lapczyk (CAC, Queen's University)

Level: Intermediate

Format: Lecture with demonstrations (i.e., no hands-on component for students)

Certificate: Attendance

Prerequisites: Familiarity with Linux and/or Windows OS

:: Mon., June 8 ::
1:30 PM to 4:30 PM EDT

Reproducible Research: Practices and Tools

:: Link to Course :: n/a ::

Description: Have you ever tried to run someone else's code and it just didn't work? Have you ever been lost interpreting your colleague's data? This hands-on session will provide researchers with tools and techniques to make their research process more transparent and reusable in remote computing environments. We'll be using platforms like JupyterHub and scripting languages like Bash to demonstrate the material. In this workshop, you'll learn about:

  • Organizing your file directories
  • Writing readable metadata with README files
  • Automating your workflow with scripts
  • Capturing and sharing your computational environment
  • Using large language models (GenAI) to assist with the above
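One way to picture the "computational environment" item in the list above is the minimal Python sketch below, which records the interpreter version, platform, and installed packages as JSON. It is an illustration only, not the workshop's own tooling: the function name `capture_environment` is hypothetical, and the workshop may use entirely different tools.

```python
import json
import platform
from importlib import metadata


def capture_environment():
    """Collect a small, shareable description of the current Python environment."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": packages,
    }


env = capture_environment()
# Serialize so the record can be saved next to the project's README.
env_json = json.dumps(env, indent=2)
print(f"Python {env['python']} with {len(env['packages'])} installed packages")
```

Saving `env_json` to a file alongside the analysis scripts gives collaborators a starting point for rebuilding a comparable environment.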

Teachers: Sarah Huber (University of Victoria), Nick Rochlin (University of Victoria), and Drew Leske (University of Victoria)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisite: Familiarity with command-line tools in a Unix environment is not a requirement for the workshop but may be helpful for some of the hands-on activities.

:: Mon., June 8 ::
1:30 PM to 2:55 PM EDT

Data security

:: Link to Course :: n/a ::

Description: Be aware. Stay secure. Join us to learn more about the tools you can use to prevent the theft of your data and possibly of your identity. Other topics of discussion will include common hacking attempts, how to recognize them, and how to avoid having your data compromised, stolen, or destroyed. We will also talk about data encryption and provide tips for when travelling with electronic devices.

Teacher: Jarno van der Kolk (University of Ottawa)

Level: Introductory

Format: Lecture

Certificate: Completion and Attendance

Prerequisites: None

:: Mon., June 8 ::
3:05 PM to 4:30 PM EDT

Leveraging Large Language Models for Academic Research: Opportunities, Workflows, and Responsible Use

:: Link to Course :: n/a ::

Description: Large Language Models (LLMs) are rapidly transforming how academic research is conducted, enabling new approaches to literature discovery, data analysis, coding, writing, and knowledge synthesis. This webinar introduces researchers to the fundamentals of LLMs, their uses in the research lifecycle as well as the limitations and ethical considerations of LLM use in academic research.

Teachers: Collin Wilson (SHARCNET, University of Guelph)

Level: TBA

Format: TBA

Certificate: Attendance

Prerequisites:

  • TBA
:: Tue., June 9 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

C as a Second Language

:: Link to Course :: n/a ::

Description: This course introduces the fundamental concepts of programming, such as conditional statements, loops (while and for), arrays, pointers, functions, and dynamic memory allocation. No programming experience will be assumed or required.

Teacher: Ross Dickson (ACENET, Dalhousie University)

Level: Introductory

Format: Lecture

Certificate: Completion and Attendance

Prerequisites: None

:: Tue., June 9 ::
9:00 AM to 10:25 AM EDT

From Policy to Practice: Preparing for the Data Deposit Requirement of the Tri-Agency RDM Policy

:: Link to Course :: n/a ::

Description: This workshop brings together perspectives from funders, libraries, researchers, and the broader research data community to examine the forthcoming updates to the data deposit requirement of the Tri-Agency Research Data Management Policy and what they mean in practice.

  • Dominique Roche, PhD (SSHRC) will provide an overview of the updated data deposit requirement within the Tri-Agency Research Data Management Policy, expected to be released this fall.
  • Meghan Goodchild, PhD (Queen's University) will share how the library has been preparing to support researchers with dataset deposit, including results from the Queen's RDM survey and insights from recent outreach and pilot initiatives.
  • Dr. Susan Bartels (Professor in Emergency Medicine, Queen's University) will offer a researcher's perspective on data deposit, reflecting on the benefits, challenges, and practical realities of meeting deposit requirements.

Teachers: Meghan Goodchild (Queen's University), Dominique Roche (Social Sciences and Humanities Research Council of Canada), Dr. Susan Bartels (Queen's University)

Level: Introductory

Format: Workshop

Certificate: Attendance

Prerequisites: None

:: Tue., June 9 ::
10:35 AM to 12:00 PM EDT

REBs & DRI

:: Link to Course :: n/a ::

Description: As the use of artificial intelligence (AI) becomes more common in research workflows, Research Ethics Boards (REBs) face a new challenge in fulfilling their mandate for research ethics review from the lens of the TCPS2 without being technical validators of the tools and infrastructures proposed by the Researcher.

This session will explore research ethics review principles and practices regarding how AI is being used across qualitative and quantitative research workflows, along with intersections with research data management (RDM) practices for working with sensitive data. Topics will include potential privacy concerns, including both individual and group-level risks, as well as key considerations for informed consent when human participant data is collected, processed, or analyzed using AI tools.

Finally, the session will review what guidance is currently available to support researchers in using AI responsibly, as well as the role of REBs in assessing AI-related technologies and the associated research ethics risks. This includes what falls within the scope of REB review and where additional support or expertise may be available, while highlighting how researchers can better anticipate, communicate, and mitigate ethical risks associated with their work.

Teachers: Lucy Shen (Digital Research Alliance of Canada), Victoria Smith (Digital Research Alliance of Canada)

Level: Introductory

Format: Lecture, Lecture+Hands-on

Certificate: Attendance

Prerequisites: None

:: Tue., June 9 ::
1:30 PM to 2:55 PM EDT

Indigenous Data Sovereignty in practice: The Supporting Indigenous Language Revitalization (SILR) Caretaking Directives

:: Link to Course :: n/a ::

Description:

Indigenous Data Sovereignty refers to the rights of Indigenous peoples to control the collection, ownership, and use of data about themselves, including their communities, lands, and knowledge systems. This panel style presentation provides an overview of how Indigenous Data Sovereignty has been collaboratively supported from principles to practice through the development and application of the Supporting Indigenous Language Revitalization (SILR) Caretaking Directives.

James Doiron:
This first presentation provides foundational information with respect to Indigenous Data Sovereignty, including guiding principles and resources, along with notable ongoing efforts at the University of Alberta to support Indigenous Data Sovereignty in practice.

Leah Vanderjagt:
A critical component of this work was the Data Sovereignty Agreement, an innovation of the SILR team that is notably different from other data agreements. This second presentation provides a brief overview of the SILR project, reviews the Data Sovereignty Agreement, and explains its unique characteristics, application in practice, and future implications for the signatories.

Sean Luyk & Kevin Glick:
This third presentation describes the development of a resource-level restriction request management feature designed in partnership with the University of Alberta's Supporting Indigenous Language Revitalization (SILR) initiative to allow data caretakers to directly approve or deny access to Indigenous knowledge recordings archived in Aviary, an audiovisual digital curation platform.

Teachers: James Doiron (University of Alberta), Sean Luyk (University of Alberta), Leah Vanderjagt (University of Alberta), Kevin Glick (AVP)

Level: Introductory

Format: Lecture+Hands-on

Certificate: Attendance

Prerequisites: None

:: Tue., June 9 ::
3:05 PM to 4:30 PM EDT

Introduction to Alliance RDM Services

:: Link to Course :: n/a ::

Description: The Introduction to Alliance RDM Services webinar will feature experts who will discuss good research data management practices. The training will begin with an overview of research data management and then provide helpful guidance and tips for the different stages of the research data lifecycle, focusing on data management planning, data preservation, active data management, and sharing and discovery of data. We will highlight different tools and services that are available to researchers as they embark on their research journey.

Teachers: Tristan Kuehn (Digital Research Alliance of Canada), Marcus Closen (Digital Research Alliance of Canada), Daniel Manrique-Castano (Digital Research Alliance of Canada), Maaike Bos (Digital Research Alliance of Canada)

Level: Introductory

Format: Lecture

Certificate: Attendance

Prerequisites: None

:: Wed., June 10 ::
9:00 AM to 12:00 PM EDT

Introduction to Version control (Git)

:: Link to Course :: n/a ::

Description: Git can be challenging to master with all its commands and options. A bottom-up understanding of what is going on cuts through this confusion. This course aims to give you that understanding.

Teacher: Tyson Whitehead (SHARCNET, University of Western Ontario)

Level: Introductory

Format: Online Lecture + Hands-on

Certificate: Attendance

Prerequisite: Basic understanding of Linux shell commands.

:: Wed., June 10 ::
9:00 AM to 10:25 AM EDT

Introduction to Data Curation and Deposit

:: Link to Course :: n/a ::

Description: TBA

Teachers: Elizabeth Philips (McMaster University), Isaac Pratt (McMaster University), Dylanne Dearborn (University of Toronto)

Level: TBA

Format: TBA

Certificate: Attendance

Prerequisites:

  • TBA
:: Wed., June 10 ::
10:35 AM to 12:00 PM EDT

Using Odesi & Scholars GeoPortal for research data discovery, exploration, and reuse

:: Link to Course :: n/a ::

Description:

Using Odesi for Open Canadian Social Science and Health Survey Data Analysis & Reuse (45 min)
Odesi (https://odesi.ca) is a social science data repository and survey data exploration platform offering variable-level search and access to over 5,700 curated dataset collections, including Canadian public opinion polls, Statistics Canada census microdata and aggregate data, and other social, economic, and health surveys. Using Odesi, researchers can search across metadata, data collections, and variables; perform online cross-tabulation; extract subsets; and download files for statistical analysis. Over the past year, Odesi's Data Explorer tool, which is shared with Borealis, was updated for data exploration and cross-tabulation analysis. This course will guide participants through using Odesi to search and explore the most popular collections for reuse, including the Canadian Census data collections and the Canadian Community Health Survey (PUMF) series.

Using Scholars GeoPortal for finding and accessing Ontario geospatial data (45 min)
Scholars GeoPortal (https://geo.scholarsportal.info) is a geospatial data repository and discovery and analysis platform providing access to over 5,000 GIS datasets and more than 3 million aerial images covering Ontario. Researchers can search for data and then add and explore data using a map-based interface with spatial identification, querying, and custom data extraction for vector and raster data. A major redevelopment project (2024-2026) is underway to modernize the GeoPortal's aging technical infrastructure. The new platform will offer improved and expanded support for geospatial data discovery and access to historical map collections. This course introduces participants to geospatial data discovery and extraction using the GeoPortal, while previewing key improvements expected in the coming year.

Teacher: Amber Leahey (OCUL, Scholars Portal, University of Toronto)

Level: Introductory

Format: Workshop

Certificate: Attendance

Prerequisites: None

:: Wed., June 10 ::
1:30 PM to 4:30 PM EDT

DASK

:: Link to Course :: n/a ::

Description: Python is a popular language because it is easy to create programs quickly with simple syntax and a "batteries included" philosophy. However, there are some drawbacks to the language. It is notoriously difficult to parallelize because of a component called the global interpreter lock, and Python programs typically take many times longer to run than programs written in compiled languages such as Fortran, C, and C++, making Python less popular for creating performance-critical programs. Dask was developed to address the first problem, parallelism. The second problem, performance, can be addressed either by using modules already compiled into fast C/C++ code, such as NumPy, or by converting performance-critical parts into a compiled language such as C/C++ nearly automatically using Cython. Together, Cython and Dask can be used to gain greater performance and parallelism in Python programs.

Other than requiring some prior experience with a programming language, preferably Python, this is a beginner-level course. During the course we will program together to build out a script used to demonstrate course concepts. This will take slightly longer than half the time, while hands-on exercises will use the remaining time. No Alliance account is required.

Teacher: Chris Geroux (ACENET, Dalhousie University)

Level: Introductory

Format: Lecture + follow-along coding + hands-on exercises

Certificate: Attendance

Prerequisites: Experience programming in at least one language, ideally Python.

:: Wed., June 10 ::
1:30 PM to 4:30 PM EDT

Text Mining in the Context of LLMs

:: Link to Course :: n/a ::

Description: This workshop introduces the topic of text mining and its applications. It covers different encoding mechanisms to convert text into numbers that algorithms can handle. It gives an overview of different text mining tasks, including de-identification, sentiment analysis and document clustering, and how they work with examples and live demos. There will also be references to state-of-the-art tools and libraries to conduct various text mining tasks.

NOTE: This course will not be recorded.

Teacher: Amal Khalil (CAC, Queen's University)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites: Basic Python

:: Thu., June 11 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

High Performance Computing in Python

:: Link to Course :: n/a ::

Description: Learn how to improve performance and use parallel programming in Python. We will cover profiling, subprocess, numexpr, multiprocessing, MPI, and other performance-enhancing techniques.

Teacher: Ramses van Zon (SciNet, University of Toronto)

Level: Intermediate

Format: Lecture + Hands-on

Certificates: Attendance and Completion

Prerequisites:

  • Basic Linux command line skills.
  • Programming experience in Python.
:: Thu., June 11 ::
9:00 AM to 12:00 PM EDT

From Raw Data to Results: No Terminal Required with UseGalaxy.ca!

:: Link to Course :: n/a ::

Description: TBA

Teachers: Charles Coulombe (Calcul Quebec) and Dr. Pierre-Etienne Jacques (Univ. of Sherbrooke)

Level: TBA

Format: TBA

Certificate: TBA

Prerequisites: TBA

:: Thu., June 11 ::
3:05 PM to 4:30 PM EDT

Introduction to data deposit and sharing in Borealis, the Canadian Dataverse Repository

:: Link to Course :: n/a ::

Description: Borealis is a bilingual, multidisciplinary research data repository supporting the secure deposit, preservation, sharing, and discovery of Canadian research data. It is built on the open-source Dataverse platform and operated nationally by Scholars Portal and University of Toronto Libraries in partnership with institutions and academic libraries across Canada. Affiliated researchers can sign up to deposit and share data using repository features such as file upload and ingest, support for disciplinary metadata standards, DataCite DOIs, open and controlled file access, and connected pathways for long-term preservation at institutions.
In the past year, Borealis rolled out new features from Dataverse versions 6.4 to 6.8 and released a new, fully redesigned Data Explorer tool. The new Data Explorer offers improved data file and variable-level exploration, cross-tabulation and analysis tools, and Codebook creation and variable documentation in Data Documentation Initiative (DDI) metadata format for import/export and data curation workflows. Additional new features include expanded metadata standards such as Croissant, a new Relation Type field supporting the DataCite schema for publication linking, and a sneak preview of the new geospatial metadata block that will support description of geospatial datasets in Borealis (coming soon).
This course will provide hands‑on training and examples of how to work with the Borealis repository's data deposit, curation, metadata, and discovery tools, with a focus on incorporating the newest features into research data management workflows.

Teacher: Amber Leahey (Scholars Portal, University of Toronto)

Level: Beginner

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites: None

:: Fri., June 12 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Scaling Up HPC Workflows

:: Link to Course :: n/a ::

Description: This hands-on course is designed for researchers who want to take their high-performance computing (HPC) workflows to the next level. Whether you're new to large-scale computing or looking to optimize your current practices, this course will guide you through the key steps to efficiently scale your applications on HPC systems.

Participants will begin by identifying their specific applications and learning how to properly compile their code for HPC environments. The course covers essential topics including running interactive sessions, performance tuning, and efficient batch job submission, complete with practical script examples. You'll also explore strategies for checkpointing, monitoring job progress, and effective debugging techniques.

Through a combination of lectures and hands-on exercises, this course offers real-world insights into improving performance, reducing run times, and making the most of shared computing resources. By the end, you'll be equipped with the tools and knowledge needed to run scalable, reliable, and efficient HPC workflows.

Teachers: Jaime Pinto (SciNet, University of Toronto) and Sergey Mashchenko (SHARCNET, McMaster University)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance and Completion

Prerequisites: None

:: Fri., June 12 ::
9:00 AM to 12:00 PM EDT

Incorporating Other Languages into Python

:: Link to Course :: n/a ::

Description: We will cover how to write optimized code in C and how to include it in your Python code. We will look at Cython, as well as pure C. If time permits, we will also look at including Fortran.

Teacher: Joey Bernard (Digital Research Alliance of Canada)

Level: Intermediate

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites:

  • Basic Python programming experience.
  • Knowledge of how to use a C compiler.
:: Fri., June 12 ::
1:30 PM to 4:30 PM EDT

Data Preparation for Machine Learning

:: Link to Course :: n/a ::

Description: This course provides you with essential knowledge and skills to effectively prepare data for analysis. Starting with an overview of the Data Analytics pipeline and processes, the course explores various statistical and visualization techniques used in Exploratory and Descriptive Analytics to understand historical data. You will then delve into the art of Data Preparation, gaining expertise in data cleaning, handling missing values, detecting and handling outliers, as well as transforming and engineering features. By the end of the course, you will be equipped with the necessary tools to ensure data quality and integrity, enabling you to make informed decisions and derive valuable insights from your data.

Teacher: Shadi Khalifa (CAC, Queen's University)

Level: Intermediate

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites:

  • Some experience and knowledge of statistics.
  • Some experience and knowledge of Python.
Week 3
When (EDT) Course
:: Mon., June 15 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

:: Tue., June 16 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

C++26 and HPC

:: Link to Course :: n/a ::

Description: TBA

Teachers: TBA

Level: TBA

Format: TBA

Certificate: Completion and Attendance

Prerequisites:

  • TBA
:: Mon., June 15 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Machine Learning

:: Link to Course :: n/a ::

Description: This course provides an introduction to machine learning, which enables computers to learn AI models from data without being explicitly programmed. It comprises two parts:

  • Part I covers the fundamentals of machine learning, and
  • Part II demonstrates the application of various machine learning methods in solving a real-world problem.

Rather than presenting the key concepts and components of machine learning in an abstract way, this course introduces them with a small number of examples. Plots and animations give insight into some of the mechanics of machine learning. Furthermore, the student will gain practical skills in a case study, in which each step of developing a machine learning project is presented. By the end of this course, the student will have a solid understanding of, and experience with, some of the fundamentals of machine learning, enabling subsequent exploration.

Teacher: Weiguang Guan (SHARCNET, McMaster University)

Level: Introductory to Intermediate

Format: Lecture

Certificate: Completion and Attendance

Prerequisites:

  • Data preparation or equivalent knowledge.
  • Basic Python knowledge and experience.
  • Knowledge of and experience with TensorFlow and Scikit-learn would also be helpful.
:: Tue., June 16 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

:: Wed., June 17 ::
9:00 AM to 12:00 PM EDT

Introduction to Neural Network Programming

:: Link to Course :: n/a ::

Description (Parts 1 and 2): Introduction to neural network programming concepts, theory and techniques. The class material will begin at an introductory level, intended for those with no experience with neural networks, eventually covering intermediate concepts.

Description (Part 3): This part will continue the development of the neural network programming approaches from Parts 1 & 2. This part will focus on methods used to generate sequences: LSTM networks, sequence-to-sequence networks, and transformers.

Teacher: Erik Spence (SciNet, University of Toronto)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Attendance

Prerequisites:

  • Experience with Python is assumed, and the course is taught on that basis.
  • No prior experience with the Keras neural network framework is expected. (Keras will be used for neural network programming.)
:: Wed., June 17 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

:: Thu., June 18 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

:: Fri., June 19 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Programming GPUs with CUDA

:: Link to Course :: n/a ::

Description: This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. The structuring of data and computations that makes full use of the GPU will be discussed in detail. Students should be able to leave the course with the knowledge necessary to begin developing their own GPU applications.

This course will have completion activities (Certification Quiz and Home Assignment). Upon successful completion of the activities, a Certificate of course Completion will be issued. No Certificates of Attendance will be issued for this course.

IMPORTANT: this course will use an Alliance cluster for hands-on activities, and so it is restricted to Alliance users only. If you are a Canadian researcher or grad student, you can get an Alliance account for free, typically within a few days.

Teachers: Sergey Mashchenko (SHARCNET, McMaster University) and Pawel Pomorski (SHARCNET, University of Waterloo)

Level: Introductory

Format: Lecture + Hands-on

Certificates: Completion

Prerequisites:

  • Alliance account
  • Basic C and/or C++ programming experience.
:: Wed., June 17 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

:: Thu., June 18 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Introduction to Julia

:: Link to Course :: n/a ::

Description: Julia is becoming increasingly popular for scientific computing. One may use it for prototyping, as one would Matlab, R, or Python, for productivity, while gaining performance comparable to that of compiled languages such as C/C++ and Fortran. The language is designed for both prototyping and performance, as well as simplicity. This is an introductory course on Julia. Students will be able to get started quickly with the basics, in comparison with other similar languages such as Matlab, R, Python and Fortran, and move on to learn, through examples, how to write code that can run in parallel on multi-core and cluster systems.

There are five homework assignments, each worth 2 points. Participants who complete the assignments and receive a passing grade of 80% (8 points) will receive a certificate of completion. No certificates of attendance will be issued for this course.

Teacher: Ed Armstrong (SHARCNET, University of Guelph)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Completion and Attendance

Prerequisites: None

:: Wed., June 17 ::
1:30 PM to 2:55 PM EDT

TDM at the University of Toronto Libraries: Services and Platforms

:: Link to Course :: n/a ::

Description: This 90-minute workshop will provide attendees with an overview of text and data mining (TDM) at the University of Toronto Libraries. We'll cover licensing (including emergent AI licensing clauses and standard license restrictions); APIs and access; content types and availability; and platforms that support TDM, especially ProQuest's TDM Studio. You'll leave knowing what is available at academic institutions, how to begin building a corpus, and basic modes of textual analysis.

Teachers: Neil Aitken (University of Toronto), Leslie Barnes (University of Toronto), Leanne Trimble (University of Toronto)

Level: Beginner

Format: Lecture / Demo

Certificate: Attendance

Prerequisites: None

:: Wed., June 17 ::
3:05 PM to 4:30 PM EDT

Exploring the intersection of AI, Copyright, and RDM in the Canadian Context

:: Link to Course :: n/a ::

Description: This presentation examines the evolving intersection of AI, copyright, and research data management (RDM) in the Canadian context. It explores how generative AI challenges traditional concepts of authorship, ownership, and fair dealing, drawing on emerging legal cases, government consultations, and international developments. The session highlights risks to creators from unlicensed data scraping and style mimicry, and reviews emerging technical and policy responses. Finally, the presentation positions RDM as a critical foundation for accountability, data provenance, metadata integrity, and trusted research practices in an AI‑driven information ecosystem.

Teachers: Jeff Moon (Compute Ontario), Lucia Costanzo (University of Guelph)

Level: TBA

Format: TBA

Certificate: Attendance

Prerequisites:

  • TBA
:: Thu., June 18 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Scientific Visualization with ParaView

:: Link to Course :: n/a ::

Description: During this workshop, we will learn about ParaView, a free and open-source visualization tool for creating 3D visualizations of your datasets. In this interactive workshop, you will become familiar with how ParaView works, and by the end you should be able to generate basic visualizations of the demo data.

Teacher: Jarno van der Kolk (University of Ottawa)

Level: TBA

Format: TBA

Certificate: Completion and Attendance

Prerequisites:

  • TBA
:: Fri., June 19 ::
9:00 AM to 12:00 PM EDT
1:30 PM to 4:30 PM EDT

Using Apptainer Containers

:: Link to Course :: n/a ::

Description: Apptainer is a secure container technology designed for use on high-performance compute clusters. This workshop will focus on how to use Apptainer as well as how to make use of tools such as Conda and Spack within Apptainer. By the end of these sessions, one will have learnt about Apptainer and how it is installed and used on our computer clusters, how to build an Apptainer container image, how to install tools such as Conda/Spack from inside an Apptainer container shell, and how to use Apptainer containers within job submission scripts.

Teacher: Paul Preney (SHARCNET, University of Windsor)

Level: Introductory

Format: Lecture + Hands-on

Certificate: Completion and Attendance

Prerequisite: Basic knowledge of Linux shell and how to run programs from the shell.