QBiG GPU Cluster

The QBiG GPU cluster is funded by the DFG in the framework of the CRC 110. It consists of three parts. The most recent one, QBiG-III, consists of 2 nodes with 8 NVIDIA A100 GPUs. QBiG-II consists of 5 nodes with 8 NVIDIA P100 cards each and has a peak performance of about 180 TFlops in double and about 373 TFlops in single precision. QBiG-I has a peak performance of 56 TFlops in double and 168 TFlops in single precision on 48 K20m GPUs.

The fast InfiniBand network allows users to run multi-GPU and multi-node programs. QBiG is connected to 190 TByte of RAID disk storage using a Lustre filesystem.

Configuration QBiG-III (lnode18-lnode19)

  • 2 nodes with 8 NVIDIA A100 GPUs

Configuration QBiG-II (lnode13-lnode17)

  • 5 nodes with 8 NVIDIA P100 GPUs
  • 2 Intel Xeon CPUs per node, each with 14 cores plus hyperthreading
  • 768 GB of main memory per node
  • Switched Infiniband Network

Configuration QBiG-I (lnode01-lnode12)

  • 12 nodes with 4 NVIDIA K20m GPUs
  • 2 Intel Xeon CPUs per node, each with 4 cores
  • 64 GByte main memory per GPU node
  • 1 node with 32 CPU cores (Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz) with 128 GByte main memory
  • 2 TByte scratch disk space per node
  • Switched Infiniband Network

CPU-only nodes (lcpunode01 and lcpunode02)

In addition, we provide some CPU-only nodes.

Access and Environment

The cluster can be accessed via the frontend node ‘qbig.cluster.hiskp’ from within the HISKP VPN network only. Connect using ssh to ‘qbig.itkp.uni-bonn.de’. Every user has a directory on the frontend node in ‘/hiskp4/username’. This directory resides on a Lustre filesystem that is available via InfiniBand on all compute nodes.
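
For example, to log in and change to your directory (replace ‘username’ with your own account name):

ssh username@qbig.itkp.uni-bonn.de
cd /hiskp4/username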

Please note that the frontend node is for compiling and development only; please do not run production jobs interactively on qbig. A few CPU slots are available on the frontend node for this purpose.

There are two MPI libraries installed, openMPI and MVAPICH2. Both can handle InfiniBand, but only the latter is compiled with GPUDirect support. However, only with openMPI have I managed to get hybrid MPI+OpenMP jobs running. MVAPICH2 is the default you will get when invoking mpicc and mpirun. If you want to use openMPI, you need to use mpicc.openmpi and mpirun.openmpi instead. Unfortunately, the man pages currently refer to MVAPICH2 only.

This means in particular that your application must be compiled against the MPI library it will run with; a binary built for MVAPICH2 will not run under openMPI and vice versa. Consequently, you have to compile with mpicc.openmpi if you want to run a hybrid MPI+OpenMP application.
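
A minimal sketch of compiling and launching with either MPI stack; the source file name, process count, and compiler flags are placeholders and may need adjusting for your application:

# MVAPICH2 (default)
mpicc -o myprog myprog.c
mpirun -np 4 ./myprog

# openMPI, needed for hybrid MPI+OpenMP
mpicc.openmpi -fopenmp -o myprog myprog.c
mpirun.openmpi -np 4 ./myprog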

Batch Queuing

Batch queuing is done using SLURM. The most important commands are sbatch for submitting a job, squeue for listing the jobs in the queue, and scancel for cancelling a job.
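
A minimal usage sketch (the script name and job ID are placeholders):

sbatch my-job.slurm        # submit a job script, prints the job ID
squeue -u $USER            # list your own jobs in the queue
scancel 12345              # cancel the job with ID 12345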

The maximal walltime currently allowed is 36 hours; the default is one hour.

The default memory requirement is set to one GB. Please specify the memory requirement as precisely as possible! The limit is enforced strictly, and jobs exceeding it will be aborted.
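
Both limits are set with the corresponding SBATCH directives, as in the example scripts below (the values here are placeholders):

#SBATCH --time=12:00:00
#SBATCH --mem=4G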

Example Job Script

Single Node Job

#!/bin/bash -x
#SBATCH --job-name=my-job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --output=%x.%J.out
#SBATCH --error=%x.%J.out
#SBATCH --time=36:00:00
#SBATCH --mail-user=me@hiskp.uni-bonn.de
#SBATCH --mail-type=ALL
#SBATCH --mem=1500M

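# set the number of OpenMP threads to the number of CPUs allocated per task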
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export KMP_AFFINITY=balanced,granularity=fine,verbose

cd /hiskp4/username/run-dir/
srun path-to-exec/executable

cd - 

Single Node job with GPUs

#!/bin/bash -x
#SBATCH --job-name=gpujob
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --output=%x.%J.out
#SBATCH --error=%x.%J.out
#SBATCH --time=01:00:00
#SBATCH --mail-user=me@hiskp.uni-bonn.de
#SBATCH --mail-type=ALL
#SBATCH --gres=gpu:4
#SBATCH --mem=1G

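# set the number of OpenMP threads to the number of CPUs allocated per task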
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export KMP_AFFINITY=balanced,granularity=fine,verbose

cd /hiskp4/username/run-dir/
srun path-to-exec/executable

cd -
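
Multi-Node Job with MPI+OpenMP (sketch)

Multi-node and hybrid MPI+OpenMP runs follow the same pattern. The script below is only a sketch under the assumptions stated in the MPI section above: the executable must have been built with mpicc.openmpi, and depending on the MPI installation you may have to launch it with mpirun.openmpi instead of srun. Node, task, thread, and memory counts are placeholders.

#!/bin/bash -x
#SBATCH --job-name=hybridjob
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4
#SBATCH --output=%x.%J.out
#SBATCH --error=%x.%J.out
#SBATCH --time=01:00:00
#SBATCH --mail-user=me@hiskp.uni-bonn.de
#SBATCH --mail-type=ALL
#SBATCH --mem=1G

# set the number of OpenMP threads to the number of CPUs allocated per task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export KMP_AFFINITY=balanced,granularity=fine,verbose

cd /hiskp4/username/run-dir/
srun path-to-exec/executable

cd -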