All 2022 Compute Ontario Summer School-wide notices, e.g., reminders about course registrations, will be posted in this forum. There are two ways to make use of this forum:

  1. You can browse all posts without enrolling in this course.
  2. If you would like to receive posts via email, then enrol in this course. (If you unenrol, you will no longer receive emails. If you have turned off email delivery in this site's Profile settings, turn it back on in order to receive emails.)
To enrol in this course, go to this page.

This workshop is intended to give a comprehensive introduction to the high-performance computing (HPC) resources available throughout the Canadian ecosystem and how to use them effectively. By the end of this workshop, participants will know how to use the UNIX command line to operate a computer, connect to a cluster, write simple shell scripts, submit and manage jobs on a cluster using a scheduler, transfer files, and use software through environment modules. No prerequisites are required for this course. This course is intended for a beginner audience.

This workshop will be delivered in 1 session:

  • Session 1: Monday, May 30, 2022 from 9:00 AM-12:00 PM EDT
Registration opens: May 16, 2022 at 12:00 PM EDT.

This workshop is intended to provide participants with a hands-on introduction to the popular programming language Python. This workshop focuses on the basics of programming and how to apply them in the context of Python. No prerequisites are required for this course. This course is intended for a beginner audience. 

This workshop will be delivered in 2 sessions:

  • Session 1: Monday, May 30, 2022 from 1:00-4:00 PM EDT 
  • Session 2: Wednesday, June 1, 2022 from 1:00-4:00 PM EDT
Registration opens: May 16, 2022 at 12:00 PM EDT.

Jupyter is a web-based interactive environment for notebook and code development commonly used for data analytics using Python, R, Julia, etc. This presentation will demonstrate creating and using notebooks; loading and unloading software modules; interactive computing with supported languages, e.g., Python, R, Julia, and C++.

This presentation will take place on Thursday, June 2 from 12:00 PM to 1:00 PM EDT.

Registration opens: May 16, 2022 at 12:00 PM EDT

Bioinformatics is a multidisciplinary field that develops methods for turning the so-called biological big data into knowledge. Although most problems in biology are not as embarrassingly parallelizable as the physics codes that HPC systems are usually designed for, this has been starting to change in recent years. Metagenomics is an emerging field in bioinformatics that investigates microbes inhabiting oceans, soils, the human body, etc. with sequencing technologies. In this course, we will first tune your HPC knowledge/skills towards bioinformatics computing and then a typical metagenomics pipeline will be explored to introduce common tools used in bioinformatic analysis and to show how they can be run in an HPC environment.

This workshop will be delivered in 3 sessions:

  • Session 1: Monday, June 6, 2022 from 10:00 AM - 12:00 PM EDT 
  • Session 2: Wednesday, June 8, 2022 from 10:00 AM - 12:00 PM EDT  
  • Session 3: Friday, June 10, 2022 from 10:00 AM - 12:00 PM EDT
Registration opens: May 23, 2022 at 12:00 PM EDT.

This is a short course for beginners with no previous knowledge or experience of R. We will begin by loading a dataset from a CSV file into R and doing some basic statistics with it. We then look at how to group those R commands into an R function, to avoid typing commands repeatedly when we need to redo the calculations. We move on to show how to automate the process of loading a set of CSV files and performing the same statistical tasks with R built-in commands and for loops. From there we look into the fundamental concepts of R programming, with a focus on the manipulation of lists, vectors, arrays and data frames. Because it is an important strength of R in statistical computing, we give a very brief introduction to the random number generators and statistical functions. We touch upon linear regression with simple, easy-to-understand examples. While walking through examples with datasets, we will emphasize the most commonly used data structure, the data frame, and show by example how to manipulate cells, rows and columns in data frames. At the end of the course, we will dive into the R package ggplot2 from the R tidyverse collection of data science packages. We will demonstrate how to construct graphics using Hadley Wickham's layered grammar of graphics. No previous programming skill is required.

This workshop will be delivered in 3 sessions:

  • Session 1: Monday, June 6, 2022 from 1:30 PM to 3:30 PM EDT 
  • Session 2: Wednesday, June 8, 2022 from 1:30 PM to 3:30 PM EDT  
  • Session 3: Friday, June 10, 2022 from 1:30 PM to 3:30 PM EDT

Registration opens: May 23, 2022 at 12:00 PM EDT.

This interactive online workshop is an introduction to the world of machine learning (ML), covering some supervised learning algorithms and when and how to use them. 

It begins by introducing the data pipeline and its processes, before moving on to statistical and visualization approaches to conduct exploratory and descriptive analytics on data in answering the question “What happened in the past?”. From there, participants will explore the art of data preparation, including data cleaning, missing values, outlier detection, and feature transformation and engineering.

Next, we will introduce predictive analytics to answer the question “What will happen?”. We will cover techniques for classifying and predicting data with supervised learning algorithms such as k-NN, Naïve Bayes, Decision Tree and Random Forest, and provide guidance in deciding which ones to use. Finally, participants will learn about statistical evaluation methods used in comparing the performance of predictive modelling techniques.
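
As an illustration of the kind of supervised learning algorithm named above, here is a minimal sketch of k-NN classification in plain Python. It is not taken from the workshop material: the function name `knn_predict`, the toy data, and the choice of Euclidean distance with a simple majority vote are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; uses Euclidean distance.
    """
    # Pair every training point with its label and sort by distance to the query.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Keep the labels of the k closest points and take the most common one.
    nearest_labels = [label for _, label in neighbours[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy two-class dataset (made up for this sketch).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    print(knn_predict(points, labels, (1.1, 1.0)))
    print(knn_predict(points, labels, (5.0, 5.0)))
```

In practice a library implementation would normally be used, and the value of k and the distance measure would be chosen by evaluating the model on held-out data.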

This workshop balances theory and practice. Participants will use practical concepts of machine learning applications to understand real-world situations. 

Topics 

  • Data preparation
  • Machine learning theory
  • Machine learning process
  • Machine learning algorithms 
  • Model evaluation

Course Prerequisites: Participants are expected to be familiar with the Python programming language and to have a basic understanding of data preparation.

This workshop will be delivered in 3 sessions:

  • Session 1: Monday, June 13, 2022 from 9:00 AM - 12:00 PM EDT
    • Data Preparation Theory & Demo
  • Session 2: Wednesday, June 15, 2022 from 9:00 AM - 12:00 PM EDT     
    • Modelling Theory
  • Session 3: Friday, June 17, 2022 from 9:00 AM - 12:00 PM EDT
    • Modelling Practice 

Registration opens: May 30, 2022 at 12:00 PM EDT.

Version control is an important tool for tracking and safe-keeping articles and source code. Nothing is ever lost, and it is always possible to go back in time to an earlier version. It is an ideal tool for collaboration since multiple people can work on the same project and it keeps a record of which change was made by which person. In this course, we will look at the most popular version control software, Git, which is the basis for well-known sites such as GitHub and GitLab. We will use GitHub during the course. Students will need to be familiar with the basic command line to participate. (This course will be taught in English.)

This workshop will be delivered in 3 sessions:

  • Session 1: Monday, June 13, 2022 from 1:30 PM - 3:00 PM EDT 
  • Session 2: Wednesday, June 15, 2022 from 1:30 PM - 3:00 PM EDT 
  • Session 3: Friday, June 17, 2022 from 1:30 PM - 3:00 PM EDT
Registration opens: May 30, 2022 at 12:00 PM EDT.

Version control is an important tool for tracking and backing up articles and source code. Nothing is ever lost, and it is always possible to return to an earlier version. It is an ideal tool for collaboration because several people can work on the same project while it keeps a record of every change made by each person. In this course, we will examine the most popular version control software, Git, which underlies well-known sites such as GitHub and GitLab. We will use GitHub during the course. Students must understand the command line for this course. (This course will be presented in English only.)

This course will be presented in 3 sessions:

  • Session 1: Monday, June 13, 2022 from 1:30 PM - 3:00 PM EDT
  • Session 2: Wednesday, June 15, 2022 from 1:30 PM - 3:00 PM EDT
  • Session 3: Friday, June 17, 2022 from 1:30 PM - 3:00 PM EDT
Registration opens: May 30, 2022 at 12:00 PM EDT.

Python has become one of the most popular programming languages in scientific computing. It is high-level enough that learning it is easy, and coding with it is significantly faster than with many other programming languages. However, the performance of pure Python programs is often sub-optimal and might hinder your research. In this course, we will show you some ways to identify performance bottlenecks, improve slow blocks of code, and extend Python with compiled code. You’ll learn various ways to optimise and parallelise Python programs, particularly in the context of scientific and high-performance computing. (Prerequisite knowledge: knowing what classes and functions are; familiarity with Jupyter Notebook; basic console use; and being comfortable with the Python Software Carpentry material.)
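
To make the idea of identifying a performance bottleneck concrete, here is a minimal sketch that times two equivalent pure-Python implementations with the standard-library `timeit` module. It is not taken from the course material: the function names, the sum-of-squares workload and the problem size are assumptions chosen for illustration. Which variant is faster depends on the interpreter and machine, which is the reason to measure before changing code.

```python
import timeit

def sum_squares_loop(n):
    """Sum of squares of 0..n-1 using an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_builtin(n):
    """Same result, with the accumulation done by the built-in sum()."""
    return sum(i * i for i in range(n))

def time_call(func, n, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` single runs."""
    return min(timeit.repeat(lambda: func(n), number=1, repeat=repeats))

if __name__ == "__main__":
    n = 200_000  # hypothetical problem size for the comparison
    for func in (sum_squares_loop, sum_squares_builtin):
        print(f"{func.__name__}: {time_call(func, n):.4f} s")
```

Taking the minimum over several repeats reduces the influence of other activity on the machine; for whole programs, a profiler such as the standard-library `cProfile` gives a per-function breakdown instead.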

This workshop will be delivered in 3 sessions:

  • Session 1: Monday, June 27, 2022 from 10:30 AM to 12:00 PM EDT 
  • Session 2: Wednesday, June 29, 2022 from 10:30 AM to 12:00 PM EDT  
  • Session 3: Thursday, June 30, 2022 from 10:30 AM to 12:00 PM EDT (Friday, July 1 is a holiday.)

Registration opens: June 13, 2022 at 12:00 PM EDT.

This course will cover modern C++ programming aspects including parallel programming (e.g., parallel algorithms, synchronization, and threads), constructs that will help improve code run-time performance (e.g., constexpr), as well as some useful aspects of newer C++ standards. (Prerequisite knowledge: Experience writing C++ programs.)

This workshop will be delivered in 6 sessions:

    • Session 1: Monday, June 27, 2022 from 1:30 PM to 3:00 PM EDT
    • Session 2: Wednesday, June 29, 2022 from 1:30 PM to 3:00 PM EDT
    • Session 3: Thursday, June 30, 2022 from 1:30 PM to 3:00 PM EDT (NOTE: Friday, July 1 is a holiday.)
    • Session 4: Monday, July 4, 2022 from 1:30 PM to 3:00 PM EDT
    • Session 5: Wednesday, July 6, 2022 from 1:30 PM to 3:00 PM EDT
    • Session 6: Friday, July 8, 2022 from 1:30 PM to 3:00 PM EDT

    Registration opens: June 13, 2022 at 12:00 PM EDT.

    This session will provide a high-level introduction to Research Data Management (RDM) and its key role in the Canadian Digital Research Infrastructure (DRI) ecosystem, alongside Advanced Research Computing (ARC) and Research Software. Funder requirements and the growing acceptance of RDM as a research best practice will be discussed, as will the benefits of good data management to researchers. Resources that help researchers manage their data will be highlighted, including those provided locally, regionally, and nationally. In particular, this session will introduce concepts including the research life cycle, the data storage continuum, Data Management Plans (DMPs), data deposit for discovery and access, sensitive data, and long-term preservation of research data. Key questions we’ll answer include: “Why should I manage my data?”, “What are my options?”, and “Who do I turn to for more help?”.

    This workshop will be delivered in 1 session:

    • Session 1: Tuesday, June 28, 2022 from 1:30 PM - 3:00 PM EDT
    Registration opens: June 13, 2022 at 12:00 PM EDT.

    This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. How to structure data and computations to make full use of the GPU will be discussed in detail. The course covers some new features available on the GPUs installed on Graham and Cedar. Students should leave the course with the knowledge necessary to begin developing their own GPU applications. (Prerequisite knowledge: C/C++ scientific programming and experience editing and compiling code in a Linux environment. Some experience with CUDA and/or OpenMP is a plus.)

    This workshop will be delivered in 9 sessions:

    • Session 1: Monday, July 4, 2022 from 10:30 AM to 12:00 PM EDT 
    • Session 2: Wednesday, July 6, 2022 from 10:30 AM to 12:00 PM EDT  
    • Session 3: Friday, July 8, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 4: Monday, July 11, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 5: Wednesday, July 13, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 6: Friday, July 15, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 7: Monday, July 18, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 8: Wednesday, July 20, 2022 from 10:30 AM to 12:00 PM EDT
    • Session 9: Friday, July 22, 2022 from 10:30 AM to 12:00 PM EDT

    Registration opens: June 20, 2022 at 12:00 PM EDT.

    During this workshop, we will learn about Plotly, a popular Python library that is well suited to 2D visualizations, and ParaView, a free and open-source visualization tool for creating 3D visualizations of your datasets. In this interactive workshop you will become familiar with how ParaView works, and by the end you should be able to generate basic visualizations of the demo data. (This course will be taught in English.)

    This workshop will be delivered in 6 sessions:

    • Session 1: Monday, July 11, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 2: Wednesday, July 13, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 3: Friday, July 15, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 4: Monday, July 18, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 5: Wednesday, July 20, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 6: Friday, July 22, 2022 from 1:30 PM - 3:00 PM EDT

    Course Prerequisites: Familiarity with Python is a benefit, but not required.

    Registration opens: June 27, 2022 at 12:00 PM EDT.

    During this workshop, we will look at Plotly, a popular Python library well suited to 2D visualizations, and ParaView, a free and open-source visualization tool for creating 3D visualizations of your datasets. In this interactive workshop you will become familiar with how ParaView works, and by the end you should be able to generate basic visualizations of the demo data. (This course will be presented in English only, but questions in French are welcome.)

    This course will be presented in 6 sessions:

    • Session 1: Monday, July 11, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 2: Wednesday, July 13, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 3: Friday, July 15, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 4: Monday, July 18, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 5: Wednesday, July 20, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 6: Friday, July 22, 2022 from 1:30 PM - 3:00 PM EDT

    Course Prerequisites: Knowledge of the Python language is a benefit, but not required.

    Registration opens: June 27, 2022 at 12:00 PM EDT.

    As the brilliant Val said in Tremors, “we plan ahead, that way we don’t do anything right now.” Planning ahead is never a bad thing! Taking the time to plan out the details of managing your project’s research data is also never a bad thing, and is definitely never a waste of time.

    Data Management Plans, or DMPs, are documents that lay out the who, what, where, when, why, and how regarding your project’s research data. DMPs help researchers think through all aspects of research data management before the data is actually collected. DMPs are intended to be living documents that can be updated or modified at any point within your research project’s lifecycle. While creating a DMP may seem like a tedious, administrative task, it’s not! 

    This workshop will walk attendees through the creation of a DMP using the DMP Assistant, which is a free, Canadian-based online tool that is simple and easy to use. We’ll also talk briefly about the Tri-Agency RDM Policy and how it’s related to DMPs.

    This workshop will be delivered in 1 session:

    • Session 1: Tuesday, July 12, 2022 from 1:30 PM to 3:00 PM EDT 

    Registration opens: June 27, 2022 at 12:00 PM EDT.

    This course will cover how to use a variety of SLURM commands on the Digital Research Alliance of Canada's clusters. This single-session course will be delivered on Thursday, July 21, 2022 from 10:30 AM to 11:30 AM EDT.

    Registration opens: July 4, 2022 at 12:00 PM EDT.

    Learn the basics of Message Passing Interface (MPI) programming. Examples and exercises will be based on parallelization of common scientific computing problems. (Prerequisites: C/C++ or Fortran programming)

    This workshop will be delivered in 3 sessions:

    • Session 1: Monday, July 25, 2022 from 10:30 AM - 12:00 PM EDT 
    • Session 2: Wednesday, July 27, 2022 from 10:30 AM - 12:00 PM EDT  
    • Session 3: Friday, July 29, 2022 from 10:30 AM - 12:00 PM EDT

    Registration opens: July 11, 2022 at 12:00 PM EDT.

    Be aware. Stay secure. Join us to learn more about the tools you can use to prevent the theft of your data and possibly of your identity. Other topics of discussion will include common hacking attempts, how to recognize them, and how to avoid having your data compromised, stolen, or destroyed. We will also talk about data encryption and provide tips for travelling with electronic devices. (This course will be taught in English.)

    This workshop will be delivered in 2 sessions:

    • Session 1: Monday, July 25, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 2: Wednesday, July 27, 2022 from 1:30 PM - 3:00 PM EDT
    Registration opens: July 11, 2022 at 12:00 PM EDT.

    Be smart. Think security. Join us to learn more about the tools you can use to prevent the theft of your data and possibly of your identity. We will cover hacking attempts, how to recognize them, and how to avoid having your data compromised or stolen. We will also discuss data encryption and tips to follow when travelling with electronic devices. (This course will be presented in English only.)

    This course will be presented in 2 sessions:

    • Session 1: Monday, July 25, 2022 from 1:30 PM - 3:00 PM EDT
    • Session 2: Wednesday, July 27, 2022 from 1:30 PM - 3:00 PM EDT

    Registration opens: July 11, 2022 at 12:00 PM EDT.

    SSH keys are a great way to secure your connection to a remote server. In this talk, we'll cover the nitty-gritty of setting up and using SSH keys on the Alliance's clusters. The main focus will be on Windows 10/11.

    This presentation will take place on July 26 from 10:30 AM to 11:30 AM EDT.

    Registration opens: July 11 at 12:00 PM EDT.


    Research data repositories are publishing platforms for your data sets. BOREALIS (formerly Dataverse) is an excellent repository option for researchers working in Canada who wish to publish their data for open science workflows or to meet funder requirements. This session will help you understand your data workflow, the importance of documenting it, and the FAIR principles for curating your data with a view towards sharing it with others. We will also address when it is not OK to share your data and what you should do with it instead. We will demo a case study of a bilingual historian who transcribes 19th-century general store notebooks into Excel sheets, and how she has published these tabular data and textual hybrids in Dataverse. This will be meaningful to mixed-methods researchers in the Humanities, Social Sciences, and related disciplines. This session includes:
    • A 1.5-hour Dataverse instruction module, with a worksheet for breakout discussion.
    • Time for group Q&A.

    This workshop will be delivered in 1 session:

    • Session 1: Thursday, July 28, 2022 from 1:30 PM to 3:00 PM EDT 

    Registration opens: July 11, 2022 at 12:00 PM EDT.

    This course will introduce neural network programming concepts, theory and techniques. The class material will begin at an introductory level, intended for those with no experience with neural networks. The programming language will be Python 3.9; experience with Python programming will be assumed.

    This workshop will be delivered in 3 sessions:

    • Session 1: Tuesday, August 2, 2022 from 10:30 AM - 12:00 PM EDT (Monday, August 1 is a holiday.)
    • Session 2: Wednesday, August 3, 2022 from 10:30 AM - 12:00 PM EDT  
    • Session 3: Friday, August 5, 2022 from 10:30 AM - 12:00 PM EDT

    Registration opens: July 18, 2022 at 12:00 PM EDT.

    Learn the basics of shared memory programming with OpenMP. In particular, we will discuss the OpenMP execution and memory model, performance, reductions and load balancing.

    This workshop will be delivered in 3 sessions:

    • Session 1: Tuesday, August 2, 2022 from 1:30 PM - 3:00 PM EDT (Monday, August 1 is a holiday.)
    • Session 2: Wednesday, August 3, 2022 from 1:30 PM - 3:00 PM EDT  
    • Session 3: Friday, August 5, 2022 from 1:30 PM - 3:00 PM EDT

    Registration opens: July 18, 2022 at 12:00 PM EDT.

    NOTE: This presentation has been cancelled in favour of doing a (better) presentation-course involving Apptainer in the 2022-2023 academic year.

    Would you like to use containers on our clusters? Docker cannot be used there, but Apptainer (formerly called Singularity) can. This presentation will demonstrate using Apptainer containers on our clusters.

    This is a common area for all 2022 Compute Ontario Summer School course registrants. Everyone who is registered in a summer school course is automatically able to access this common area.