Slurm Batch on ETP


Slurm Batch on ETP

We have set up the Slurm batch system on the etp/lsschaile desktop cluster, which consists of about 40 nodes and 80 job slots.


  • Setup (on etp desktop): module load slurm
  • Submitting jobs:
    • sbatch -p lsschaile <job-script> # etp cluster
    • sbatch -p cip <job-script> # cip cluster
    • sbatch -a 1-10 -p lsschaile Job.sh # submits a job array, i.e. the same job script 10 times; the array index is available in each task via the environment variable SLURM_ARRAY_TASK_ID
  • Job scripts are basically shell scripts with a few extra lines of Slurm-specific setup; here is a simple example script using ROOT
  • squeue gives information on your running jobs
  • sview is a GUI for Slurm monitoring (jobs, partitions, etc.)
  • scancel <job-id> kills a job
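
As an illustration of the job-array bullet above, here is a minimal sketch of a job script. The job name, log file name and the input_<N>.root naming scheme are assumptions made for this example, not part of the ETP setup. The #SBATCH lines are ordinary shell comments that sbatch reads as options, so the script also runs by hand outside Slurm.

```shell
#!/bin/bash
# Hypothetical job name, only used for display in squeue.
#SBATCH --job-name=array-demo
# etp cluster partition, as in the sbatch commands above.
#SBATCH --partition=lsschaile
# Same effect as passing -a 1-10 on the command line.
#SBATCH --array=1-10
# One log file per task: %A is the array job id, %a the array task id.
#SBATCH --output=array-demo_%A_%a.log

# Slurm sets SLURM_ARRAY_TASK_ID for every array task. Default to 1 so the
# script can also be run directly for a quick test outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Assumed naming scheme: one input file per array task.
input_file="input_${task_id}.root"

echo "task ${task_id} would process ${input_file}"
```

With the --array line in the script, a plain sbatch Job.sh is enough; options given on the command line take precedence over the #SBATCH lines.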

More information can be found in the LRZ Slurm documentation or on the Slurm home page.

Each job has an environment variable TEMPDIR set that points to a directory on the local disk of the node. Slurm creates this directory and cleans it up after the job. I/O-intensive jobs will probably run better if they copy their inputs there rather than reading from the shared NFS disk. The local disk is typically small (e.g. an 80 GB SSD), so this is not appropriate for all payloads.
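
The copy-in, copy-out pattern described above could be sketched roughly as follows. The helper names stage_in and stage_out are hypothetical, and the fallback to mktemp is only there so that the functions can be tried outside a Slurm job, where TEMPDIR is not set.

```shell
#!/bin/bash

# Hypothetical helper: stage_in FILE...
# Copies the given files to the job-local directory and prints its path.
# TEMPDIR is set inside a Slurm job; outside Slurm a fresh temporary
# directory is used instead (an assumption made for testing only).
stage_in() {
    local workdir="${TEMPDIR:-$(mktemp -d)}"
    cp -- "$@" "$workdir"/
    echo "$workdir"
}

# Hypothetical helper: stage_out WORKDIR DEST PATTERN
# Copies result files matching PATTERN from the local disk to DEST,
# which would typically be a directory on the shared NFS disk.
stage_out() {
    local workdir="$1" dest="$2" pattern="$3"
    mkdir -p -- "$dest"
    # $pattern is left unquoted on purpose so that the shell expands the glob.
    cp -- "$workdir"/$pattern "$dest"/
}
```

In a job script one might call workdir=$(stage_in <input files>), change into "$workdir", run the payload there, and finish with stage_out "$workdir" <result dir> 'out_*.root'. Results have to be copied back before the job ends, because Slurm removes the directory afterwards.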

Slurm Batch on CIP (Obsolete)

WARNING: Because there were problems with our jobs running in the CIP pool, using the cip partition is not recommended. Please contact the sysadmins if you want to give it a try. The instructions below are kept for reference only.
The CIP cluster at Schellingstrasse also has Slurm installed and can be used for job submission. CIP has the same environment (Ubuntu, /home and /project dirs, module system).

  • To use Slurm at CIP execute
    • slurm_CIP
  • To switch back to ETP execute
    • slurm_LSSCHAILE

Miscellaneous tips

  • Use $TEMPDIR as the working directory. It points to /scratch-local/<automatic_name>; Slurm creates it and cleans it up after the job. I/O-intensive jobs will probably run better if they copy their inputs there rather than reading from the shared NFS disk. The local disk is typically small (e.g. an 80 GB SSD), so this is not appropriate for all payloads.
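
Since the local disk is small, it can be worth checking that the inputs fit before copying them. The following is a minimal sketch of such a check; the function name fits_on_local_disk is hypothetical, and the check is only a rough estimate, as it ignores space needed for outputs and space used by other jobs on the same node.

```shell
#!/bin/bash

# Hypothetical helper: fits_on_local_disk DIR FILE...
# Succeeds if the total size of the given files is smaller than the free
# space on the filesystem that holds DIR.
fits_on_local_disk() {
    local dir="$1"
    shift
    local need_kb free_kb
    # du -c prints a grand total on the last line; -k reports sizes in KiB.
    need_kb="$(du -ck -- "$@" | tail -n 1 | cut -f1)"
    # df -P prints one line per filesystem; column 4 is the available KiB.
    free_kb="$(df -Pk "$dir" | awk 'NR==2 {print $4}')"
    [ "$need_kb" -lt "$free_kb" ]
}
```

A job script could then copy the inputs to "$TEMPDIR" only if fits_on_local_disk "$TEMPDIR" <input files> succeeds, and otherwise read them directly from the shared disk.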