GPU Resources and Access

NVIDIA supports the Covid CXR Hackathon by providing access to a POC system equipped with 8 GPUs. The system has the following specifications:

  • CPU: Dual-processor AMD EPYC 7742 (64 cores per processor)
  • Memory: 256 GB
  • OS: Ubuntu 20.04.3 LTS
  • NVIDIA Driver Version: 460.73.01
  • CUDA Version: 11.2

Available GPUs are listed in the table below.

GPU #   GPU           GPU memory
0       Ampere A40    48 GB GDDR6
1       Ampere A40    48 GB GDDR6
2       Ampere A30    24 GB HBM2
3       Volta V100    32 GB HBM2
4       Ampere A100   40 GB HBM2
5       Ampere A100   40 GB HBM2
6       Ampere A10    24 GB GDDR6
7       Turing T4     16 GB GDDR6

To get access to the system, please contact us on Slack.

Once you have your credentials, you can connect to the login node with ssh your_login@nvpoc.ddnsfree.com. The login node is connected to the 8-GPU server, which you use through Slurm. Slurm is installed with enroot and pyxis, so you can run containers inside Slurm jobs. The server is shared, but access is managed by Slurm, so you are welcome to use it whenever you can queue a job.
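For non-interactive work, the same resources can be requested from a batch script. The following is a minimal sketch, not a setup confirmed by the organizers: the file name gpu_job.sbatch, the job name, and the ubuntu:20.04 image are placeholders chosen for illustration, and it assumes that /scratch may be mounted into containers. The --container-image and --container-mounts options are provided by pyxis. The block only writes the script; it does not submit anything.

```shell
# Write a minimal batch script to the current directory.
# All names below (file, job name, image) are illustrative placeholders.
cat > gpu_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=cxr-test
#SBATCH --gpus=1
#SBATCH --output=cxr-test-%j.out

# %j in the output file name expands to the Slurm job ID.
# Run one command inside a container through pyxis; replace the image
# and the command with your own.
srun --container-image=ubuntu:20.04 \
     --container-mounts=/scratch:/scratch \
     grep PRETTY_NAME /etc/os-release
EOF

echo "Wrote gpu_job.sbatch; submit it from the login node with: sbatch gpu_job.sbatch"
```

After submission, squeue -u "$USER" shows whether the job is pending or running, and the job's output ends up in the cxr-test-<jobid>.out file.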

The NFS shared folders are /users, /sw, /data, and /scratch. All are persistent, so please do not leave any sensitive data on the server when you are done.

To launch an interactive session on the server from the login node, run srun --gpus N --pty bash -i, where N is the number of GPUs requested.
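Once inside an interactive session, it can be useful to confirm which GPUs were actually allocated. The sketch below is one way to do that. The function name show_allocated_gpus is hypothetical, and the sketch assumes that nvidia-smi is on the PATH of the GPU server and that Slurm exports CUDA_VISIBLE_DEVICES for the allocation, neither of which is stated on this page.

```shell
# Hypothetical helper (not part of the system): list the GPUs visible
# to the current shell, for example inside an "srun --gpus N" session.
show_allocated_gpus() {
  # Print the device list if Slurm exported one, otherwise a marker.
  echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"

  if command -v nvidia-smi >/dev/null 2>&1; then
    # Index, model name, and total memory of each visible GPU, as CSV.
    nvidia-smi --query-gpu=index,name,memory.total --format=csv \
      || echo "nvidia-smi failed: no usable GPU or driver in this shell"
  else
    # Expected on a machine without the NVIDIA driver tools.
    echo "nvidia-smi not found: probably not on a GPU node"
  fi
}

show_allocated_gpus
```

The reported model names and memory sizes can be compared with the GPU table above to see which devices the job received.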

Please refer to the Slurm documentation for additional options.

Additional resources

Medical Imaging augmented with AI | NVIDIA
MONAI: a freely available, community-supported, PyTorch-based framework for deep learning in healthcare imaging. It provides domain-optimized foundational capabilities for developing healthcare imaging training workflows in a native PyTorch paradigm.

NVIDIA self-paced online courses: