Schedule
10 June 2024 (09:00-13:00)
18 June 2024 (09:00-13:00)
24 June 2024 (09:00-13:00)
Course plan
Lesson 1: Introduction (4 hours)
- High-performance computing (HPC): what is it?
  - Brief introduction
- Available resources at Unibs
  - NVIDIA DGX equipped with 8 A100 80 GB GPUs
  - Dell PowerScale: 270 TB of storage that feeds the DGX
- NVIDIA GPUs and MIG instances
- PaaS (platform as a service): Docker and Singularity containers
  - What they are
  - Reasons to use containers: reproducibility, portability, sandboxing
  - Pros and cons, and why you should use them
Lesson 2: Containers (4 hours)
- How-to: from zero to hero
  - Basic Docker commands
  - How to build a Docker image
  - Advanced Docker commands: network, compute resources, and volumes
- Popular container registries: Docker Hub, NVIDIA GPU Cloud (NGC), Singularity cloud hub
- Portainer: a Docker manager
Lesson 3: DL HPC researcher toolbox (4 hours)
- Experiment tracking: Weights & Biases (W&B), MLflow, TensorBoard
- Experiment reporting: W&B reports, Notion
- Hyperparameter selection
  - Grid vs. random vs. Bayesian search
  - Hyperparameter optimization framework: W&B sweeps
- Synchronous vs. asynchronous training and parallelization
- Bottleneck analysis
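
As a small preview of the hyperparameter selection topic in Lesson 3, the sketch below contrasts grid search with random search using only the Python standard library. It is not course material: the search space (`lr`, `batch_size`) and the helper names `grid_search` and `random_search` are hypothetical, chosen for illustration. Bayesian search is left out because it needs a surrogate model and is usually delegated to a framework.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
SPACE = {
    "lr": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
}


def grid_search(space):
    """Enumerate every combination of the listed values."""
    keys = list(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def random_search(space, n_trials, seed=0):
    """Draw n_trials configurations, each value picked uniformly at random."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_trials)]


grid = grid_search(SPACE)
sampled = random_search(SPACE, n_trials=4)
print(len(grid), "grid configurations;", len(sampled), "random configurations")
```

Grid search cost grows multiplicatively with every added hyperparameter, while random search lets you fix the trial budget up front; that trade-off is what the comparison in Lesson 3 is about.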