Slurm Usage
Intro
SLURM (Simple Linux Utility for Resource Management) is a widely used open-source job scheduler that we use on GRIT HPC systems to allocate resources efficiently. This guide provides some basic information on how to use Slurm and how to create scripts for submitting jobs to the Slurm queue.
Typical Workflow
1. Develop your program (e.g. on your computer and a subset of data)
2. Update your program for use on HPC (e.g. change data paths if needed, etc.)
3. Create a Slurm job file (see below)
4. Submit your job to the queue
5. Monitor the job status, wait for completion
Steps 3-5 are detailed below.
Example Slurm job files
Slurm job files are written in bash, a Linux shell scripting language. Here's an example that uses one CPU on one computer to run a simple job, writing any errors and other output to log files in the same directory. Note that on most GRIT HPC systems the main queue (called a partition in Slurm) is named 'basic'.
#!/bin/bash
## SLURM REQUIRED SETTINGS <--- two hashtags are a comment in Slurm
#SBATCH --partition=basic
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
# Job to run
./my_example_code.bash
Another Example:
#!/bin/bash
#
#SBATCH -p basic # partition name (aka queue)
#SBATCH -c 1 # number of cores
#SBATCH --mem 100 # memory pool for all cores
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
# code or script to run
for i in {1..100000}; do
    echo $RANDOM >> SomeRandomNumbers.txt
done
sort SomeRandomNumbers.txt
Python Example with Conda
The output goes to a file called hello-python-*.out in the directory the job was submitted from (your home directory, if you submit from there). It should contain a message from Python.
#!/bin/bash
## SLURM REQUIRED SETTINGS
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --job-name=hello-python # create a short name for your job
#SBATCH --time=00:01:00 # total run time limit (HH:MM:SS)
## Example use of Conda:
# first source bashrc (with conda.sh), then conda can be used
source ~/.bashrc
# make sure conda base is activated
conda activate
# Other conda commands go here
## run python
python hello.py
hello.py should be something like this:
print('Hello from python!')
Adding details to Slurm job files
These examples are all very simple. To add more complexity, such as more memory or more CPUs, you first need to find out some facts about the computer the job will run on:
Find the number of CPU cores on a computer from the command line:
[user@computer ~]$ grep 'cpu cores' /proc/cpuinfo | uniq
cpu cores : 48 <---- an example output
Find out how much memory a computer has:
[user@computer ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:          1.5Ti       780Gi       721Gi       1.5Gi       8.6Gi       721Gi
Swap:          31Gi          0B        31Gi
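The total reported by free can be turned into a number for --mem, which Slurm reads as megabytes when no unit suffix is given. The following is a minimal sketch, assuming a Linux machine where /proc/meminfo is readable; the helper name mem_total_mb is made up for this example.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and
# print the MemTotal value converted from kB to whole megabytes.
mem_total_mb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }'
}

# Print this machine's total memory in MB (skipped if /proc/meminfo is missing).
if [ -r /proc/meminfo ]; then
    mem_total_mb < /proc/meminfo
fi
```

A job cannot be given more memory than the node has, so treat the printed number as an upper bound for --mem rather than a value to request.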
For most of our use cases, one node and one task is all that is needed. More than this requires special code such as mpi4py (MPI = Message Passing Interface) or MATLAB's Parallel Computing Toolbox (which uses --cpus-per-task). To request N cores for a job, replace N with the number of cores you need in the Slurm job file:
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=N
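Inside a running job, Slurm sets the environment variable SLURM_CPUS_PER_TASK when --cpus-per-task is given, so the job script does not need to repeat N. Here is a minimal sketch, assuming your program is multithreaded and reads OMP_NUM_THREADS (an assumption about your program, not something Slurm enforces):

```shell
# Use the core count Slurm granted; fall back to 1 when the variable is
# unset (for example when testing the script outside of Slurm).
threads="${SLURM_CPUS_PER_TASK:-1}"

# Many multithreaded programs read OMP_NUM_THREADS to size their thread pool.
export OMP_NUM_THREADS="$threads"
echo "Running with $OMP_NUM_THREADS thread(s)"
```

Placing these lines before the command that starts your program keeps the thread count in step with the #SBATCH request.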
To get the max value for N for a computer:
[user@computer ~]$ scontrol show node | grep CPU
CPUAlloc=20 CPUTot=95 CPULoad=1.00
'CPUTot' is the max value for N.
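If the cluster has several nodes, scontrol prints one CPUTot value per node. As a rough sketch, the largest value can be picked out with standard text tools; the helper name max_cpus_per_node is made up for this example, and the demo input is the sample line shown above rather than live scontrol output.

```shell
# Hypothetical helper: read "scontrol show node" text on stdin and print
# the largest CPUTot value found across all nodes.
max_cpus_per_node() {
    grep -o 'CPUTot=[0-9]*' | cut -d= -f2 | sort -n | tail -n 1
}

# Demo on the sample line shown above; on the cluster use:
#   scontrol show node | max_cpus_per_node
echo 'CPUAlloc=20 CPUTot=95 CPULoad=1.00' | max_cpus_per_node   # prints 95
```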
Find the queue names:
[user@computer ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
basic* up infinite 1 idle <--- in this case the queue name is 'basic' and it's the default, as indicated by the *
Submitting your job to the queue
Assuming you have a Slurm job file named slurm_test.sh:
# test a job submission (don't run)
[user@computer ~]$ sbatch --test-only slurm_test.sh
# run a job
[user@computer ~]$ sbatch slurm_test.sh
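When a job is accepted, sbatch prints a confirmation line of the form "Submitted batch job <id>". Saving that ID makes the later monitoring and cancel steps easier. This is a minimal sketch; the helper name job_id_from_sbatch is made up for this example, and the demo parses a sample confirmation line instead of calling sbatch.

```shell
# Hypothetical helper: pull the job ID out of sbatch's confirmation line,
# which has the form "Submitted batch job <id>".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Demo on a sample confirmation line; on the cluster use:
#   jobid=$(sbatch slurm_test.sh | job_id_from_sbatch)
jobid=$(echo 'Submitted batch job 166627' | job_id_from_sbatch)
echo "Job ID: $jobid"
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler if you do not need the confirmation message.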
Monitoring your job
Use squeue to list the jobs in the queue and their status:
[user@computer ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
166626 basic my_code username PD 0:00 1 (Resources)
166627 basic my_code username R 3:04 1 anvil
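In the above, 'R' denotes that the job is running and 'PD' denotes that it is pending. For a pending job, the last column gives the reason in parentheses; here, (Resources) means Slurm is waiting for resources to become free.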
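On a busy queue it can help to summarize how many jobs are in each state. The sketch below counts the ST column of squeue's default table. The helper name count_job_states is made up for this example, and it assumes job names contain no spaces (a space would shift the columns).

```shell
# Hypothetical helper: read squeue's default table on stdin, skip the
# header row, and count jobs per state code (5th column, ST).
count_job_states() {
    awk 'NR > 1 { count[$5]++ } END { for (s in count) print s, count[s] }' | sort
}

# Demo on the sample output above; on the cluster use:
#   squeue | count_job_states
count_job_states <<'EOF'
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
166626 basic my_code username PD 0:00 1 (Resources)
166627 basic my_code username R 3:04 1 anvil
EOF
```

For the sample table this prints one line per state: "PD 1" and "R 1".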
You can also monitor the output by watching the log files from the command line. This will show the last few lines of the log file and update as the log file changes:
[user@computer ~]$ tail -f log-file-name.txt
Cancel the job if needed:
[user@computer ~]$ scancel 22 # cancel job 22
You can get the job number from the JOBID column of the squeue output.
Useful Commands
sinfo # general info about Slurm partitions and nodes
sinfo -lNe # more detailed info reporting with long format and nodes listed individually
scontrol show job 2 # show control info on job 2
To find the number of cores per socket:
lscpu | grep "Core(s) per socket" | awk '{print $4}'
More Example Job scripts
An example with R
#!/bin/bash -l
# How long the job should run for
#SBATCH --time=01:00:00
# Number of CPU cores, in this case 1 core
#SBATCH --ntasks=1
# Number of compute nodes to use
#SBATCH --nodes=1
# Name of the output files to be created. If not specified the outputs will be joined
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
# The code you want to run in your job
Rscript test_forge_r.R
Here's what was used in the test script test_forge_r.R:
# A simple R script to print hello world!
aString = "Hello World!"
print (aString)
References
https://www.carc.usc.edu/user-information/user-guides/hpc-basics/slurm-templates
https://docs.rc.fas.harvard.edu/kb/convenient-slurm-commands/
https://csc.cnsi.ucsb.edu/docs/slurm-job-scheduler
Python: https://rcpedia.stanford.edu/topicGuides/jobArrayPythonExample.html
https://login.scg.stanford.edu/faqs/cores/
https://stackoverflow.com/questions/65603381/slurm-nodes-tasks-cores-and-cpus
Regarding nodes vs tasks vs cpus vs cores: Here's a very good writeup: https://researchcomputing.princeton.edu/support/knowledge-base/scaling-analysis.