
Description: Python is a popular language because its simple syntax and "batteries included" philosophy make it easy to create programs quickly. However, the language has some drawbacks. It is notoriously difficult to parallelize because of a component called the global interpreter lock, and Python programs typically take many times longer to run than equivalent programs written in compiled languages such as Fortran, C, and C++, making Python less popular for performance-critical programs. Dask was developed to address the first problem, parallelism. The second problem, performance, can be addressed either by using modules already compiled into fast C/C++ code, such as NumPy, or by converting performance-critical parts into a compiled language such as C/C++ nearly automatically using Cython. Together, Cython and Dask can be used to improve the performance and parallelism of Python programs.
This is a beginner-level course that requires only some prior experience with a programming language, preferably Python. During the course we will program together to build out a script that demonstrates course concepts. This will take slightly more than half the time; hands-on exercises will use the remaining time. No Alliance account is required.
Teacher: Chris Geroux (ACENET, Dalhousie University)
Level: Intermediate
Format: Lecture + follow-along coding + hands-on exercises
Certificate: Attendance
Prerequisites: Should have experience programming in at least one language, ideally Python.

Description:
UseGalaxy.ca is Canada’s public instance of the international Galaxy platform, hosted on the Alliance's infrastructure. It offers an accessible, web‑based environment that empowers researchers to run complex analyses without needing programming or command‑line expertise. Originally built for bioinformatics, Galaxy now supports a wide range of domains, including image analysis, machine learning, chemistry, astronomy, social sciences, and linguistics. Galaxy is designed to make reproducible and transparent research available to all.
This workshop will first briefly describe the UseGalaxy.ca project and infrastructure, then provide a guided, hands‑on introduction to the platform: uploading data, running tools, building workflows, visualizing data, and interpreting results directly in the browser. Participants will experience how the platform eases analysis while maintaining scientific rigour and simplifying reproducibility.
We will also present advanced features, such as the interactive tools Jupyter and RStudio and data collections that enable large-scale analyses. We will then conclude with an open Q&A to address project‑specific needs and discuss how UseGalaxy.ca can support diverse research communities.
Teachers: Charles Coulombe (Calcul Quebec) and Dr. Pierre-Etienne Jacques (Univ. of Sherbrooke)
Level: Introductory
Format: Lecture + Hands-on
Certificate: Attendance
Prerequisites: None

Description: This workshop brings together perspectives from funders, libraries, researchers, and the broader research data community to examine the forthcoming updates to the data deposit requirement of the Tri-Agency Research Data Management Policy and what they mean in practice.
• Dominique Roche, PhD (SSHRC) will provide an overview of the updated data deposit requirement within the Tri-Agency Research Data Management Policy, expected to be released this fall.
• Meghan Goodchild, PhD (Queen’s University) will share how the library has been preparing to support researchers with dataset deposit, including results from the Queen’s RDM survey and insights from recent outreach and pilot initiatives.
• Dr. Susan Bartels (Professor in Emergency Medicine, Queen’s University) will offer a researcher’s perspective on data deposit, reflecting on the benefits, challenges, and practical realities of meeting deposit requirements.
Teachers: Meghan Goodchild (Queen's University), Dominique Roche (Social Sciences and Humanities Research Council of Canada), Dr. Susan Bartels (Queen's University)
Level: Introductory
Format: Workshop
Certificate: Attendance
Prerequisites: None

Description:
Indigenous Data Sovereignty refers to the rights of Indigenous peoples to control the collection, ownership, and use of data about themselves, including their communities, lands, and knowledge systems. This panel-style presentation provides an overview of how Indigenous Data Sovereignty has been collaboratively supported, from principles to practice, through the development and application of the Supporting Indigenous Language Revitalization (SILR) Caretaking Directives.
James Doiron:
This first presentation provides foundational information with respect to Indigenous Data Sovereignty, including guiding principles and resources, along with notable ongoing efforts at the University of Alberta to support Indigenous Data Sovereignty in practice.
Leah Vanderjagt:
A critical component of this work was the Data Sovereignty Agreement, an innovation of the SILR team that differs notably from other data agreements. This second presentation provides a brief overview of the SILR project, reviews the Data Sovereignty Agreement, and explains its unique characteristics, application in practice, and future implications for the signatories.
Sean Luyk & Kevin Glick:
This third presentation describes the development of a resource-level restriction request management feature designed in partnership with the University of Alberta’s Supporting Indigenous Language Revitalization (SILR) initiative to allow data caretakers to directly approve or deny access to Indigenous knowledge recordings archived in Aviary, an audiovisual digital curation platform.
Teachers: James Doiron (University of Alberta), Sean Luyk (University of Alberta), Leah Vanderjagt (University of Alberta), Kevin Glick (AVP)
Level: Introductory
Format: Lecture + Hands-on
Certificate: Attendance
Prerequisites: None

Description: The Introduction to Alliance RDM Services webinar will feature experts who will discuss good research data management practices. The training will begin with an overview of research data management and then provide helpful guidance and tips for the different stages of the research data lifecycle, focusing on data management planning, data preservation, active data management, and data sharing and discovery. We will highlight different tools and services that are available to researchers as they embark on their research journey.
Teachers: Tristan Kuehn (Digital Research Alliance of Canada), Marcus Closen (Digital Research Alliance of Canada), Daniel Manrique-Castano (Digital Research Alliance of Canada), Maaike Bos (Digital Research Alliance of Canada)
Level: Introductory
Format: Lecture
Certificate: Attendance
Prerequisites: None

Description:
This course provides an introduction to research data curation and deposit, based on an upcoming Compute Ontario data deposit and curation training course for researchers and research staff. This session will introduce researchers to curation and deposit through a hands-on evaluation of a dummy dataset and will prepare researchers to:
• Understand the principles behind FAIR and responsible research data sharing
• Make informed decisions about which data to share, where, and under what conditions
• Develop and deposit high-quality datasets with appropriate metadata, documentation, and structure
• Apply disciplinary standards and best practices where relevant
• Navigate institutional, funder, ethics, and journal requirements surrounding data deposits with confidence
Teachers: Elizabeth Philips (McMaster University), Isaac Pratt (McMaster University), Dylanne Dearborn (University of Toronto)
Level: Introductory
Format: Lecture + Hands-on
Certificate: Attendance
Prerequisites: None

Description: Borealis is a bilingual, multidisciplinary research data repository supporting the secure deposit, preservation, sharing, and discovery of Canadian research data. It is built on the open-source Dataverse platform and operated nationally by Scholars Portal and University of Toronto Libraries in partnership with institutions and academic libraries across Canada. Affiliated researchers can sign up to deposit and share data using repository features such as file upload and ingest, support for disciplinary metadata standards, DataCite DOIs, open and controlled file access, and connected pathways for long-term preservation at institutions.
In the past year, Borealis was upgraded with new features from Dataverse versions 6.4-6.8 and released a new, fully redesigned Data Explorer tool. The new Data Explorer offers improved data file and variable‑level exploration, cross‑tabulation and analysis tools, and codebook creation and variable documentation in Data Documentation Initiative (DDI) metadata format for import/export and data curation workflows. Additional new features include expanded metadata standards such as Croissant, a new Relation Type field supporting the DataCite schema for publication linking, and a sneak preview of the new geospatial metadata block that will support description of geospatial datasets in Borealis (coming soon).
This course will provide hands‑on training and examples of how to work with the Borealis repository’s data deposit, curation, metadata, and discovery tools, with a focus on incorporating the newest features into research data management workflows.
Teachers: Amber Leahey (Scholars Portal, University of Toronto)
Level: Beginner
Format: Lecture + Hands-on
Certificate: Attendance
Prerequisites: None

Description: As the use of artificial intelligence (AI) becomes more common in research workflows, Research Ethics Boards (REBs) face a new challenge: fulfilling their mandate for research ethics review through the lens of the TCPS2 without becoming technical validators of the tools and infrastructures proposed by researchers.
This session will explore research ethics review principles and practices regarding how AI is being used across qualitative and quantitative research workflows, along with intersections with research data management (RDM) practices for working with sensitive data. Topics will include potential privacy concerns, including both individual and group-level risks, as well as key considerations for informed consent when human participant data is collected, processed, or analyzed using AI tools.
Finally, the session will review what guidance is currently available to support researchers in using AI responsibly, as well as the role of REBs in assessing AI-related technologies and the associated research ethics risks. This includes what falls within the scope of REB review and where additional support or expertise may be available, while highlighting how researchers can better anticipate, communicate, and mitigate ethical risks associated with their work.
Teachers: Lucy Shen (Digital Research Alliance of Canada), Victoria Smith (Digital Research Alliance of Canada)
Level: Introductory
Format: Lecture, Lecture+Hands-on
Certificate: Attendance
Prerequisites: None

Description: Have you ever tried to run someone else’s code and it just didn’t work? Have you ever been lost interpreting your colleague’s data? This hands-on session will provide researchers with tools and techniques to make their research process more transparent and reusable in remote computing environments. We’ll be using platforms like JupyterHub and scripting languages like Bash to demonstrate the material. In this workshop, you’ll learn about:
- Organizing your file directories
- Writing readable metadata with README files
- Automating your workflow with scripts
- Capturing and sharing your computational environment
- Using large language models (GenAI) to assist with the above
Teachers: Sarah Huber (University of Victoria), Nick Rochlin (University of Victoria), and Drew Leske (University of Victoria)
Level: Introductory
Format: Lecture + Hands-on
Certificate: Attendance
Prerequisites: Familiarity with command-line tools in a Unix environment is not required but may be helpful for some of the hands-on activities.

Description: This hands-on course is designed for researchers who want to take their high-performance computing (HPC) workflows to the next level. Whether you're new to large-scale computing or looking to optimize your current practices, this course will guide you through the key steps to efficiently scale your applications on HPC systems.
Participants will begin by identifying their specific applications and learning how to properly compile their code for HPC environments. The course covers essential topics including running interactive sessions, performance tuning, and efficient batch job submission, complete with practical script examples. You'll also explore strategies for checkpointing, monitoring job progress, and effective debugging techniques.
Through a combination of lectures and hands-on exercises, this course offers real-world insights into improving performance, reducing run times, and making the most of shared computing resources. By the end, you'll be equipped with the tools and knowledge needed to run scalable, reliable, and efficient HPC workflows.
Teachers: Jaime Pinto (SciNet, University of Toronto) and Sergey Maschenko (SHARCNET, McMaster University)
Level: Introductory
Format: Lecture + Hands-on
Certificate: Completion
Prerequisites: None

Description: The System Security course explains operating system security best practices, including topics such as minimal installations, least privilege, access control, and system firewalls. We will explore typical settings and software on Windows and Ubuntu workstations that are essential for ensuring a strong security posture. The course will provide an overview of common attack vectors and corresponding defensive techniques to protect organizational information systems.
NOTE: This course will not be recorded and will be closed after summer school.
Teachers: Lucas Lapczyk (CAC, Queen's University)
Level: Intermediate
Format: Lecture with demonstrations (i.e., no real hands-on for students)
Certificate: Attendance
Prerequisites: Familiarity with Linux and/or Windows OS

Description:
Using Odesi for Open Canadian Social Science and Health Survey Data Analysis & Reuse (45 min)
Odesi (https://odesi.ca) is a social science data repository and survey data exploration platform offering variable-level search and access to over 5,700 curated dataset collections, including Canadian public opinion polls, Statistics Canada census microdata and aggregate data, and other social, economic, and health surveys. Using Odesi, researchers can search across metadata, data collections, and variables; perform online cross-tabulation; extract subsets; and download files for statistical analysis. Over the past year, Odesi's Data Explorer tool, which is shared with Borealis, was updated for data exploration and cross-tabulation analysis. This course will guide participants through using Odesi to search and explore the most popular collections for reuse, including the Canadian Census data collections and the Canadian Community Health Survey (PUMF) series.
Using Scholars GeoPortal for finding and accessing Ontario geospatial data (45 min)
Scholars GeoPortal (https://geo.scholarsportal.info) is a geospatial data repository and discovery and analysis platform providing access to over 5,000 GIS datasets and more than 3 million aerial images covering Ontario. Researchers can search for data and then add and explore data using a map‑based interface with spatial identification, querying, and custom data extraction for vector and raster data. A major redevelopment project (2024–2026) is underway to modernize the GeoPortal’s aging technical infrastructure. The new platform will offer improved and expanded support for geospatial data discovery and access to historical map collections. This course introduces participants to geospatial data discovery and extraction using the GeoPortal, while previewing key improvements expected in the coming year.
Teachers: Amber Leahey (OCUL, Scholars Portal, University of Toronto)
Level: Introductory
Format: Workshop
Certificate: Attendance
Prerequisites: None