Slurm Usage

**Quick Introduction**

A queue in Slurm is called a partition. User commands are prefixed with `s`.

**Useful Commands**

- `sacct`, `sbatch`, `sinfo`, `sprio`, `squeue`, `srun`, `sshare`, `sstat`, etc.
- `sbatch`: submits a batch script to the Slurm queue
- `sinfo`: shows the state of partitions and nodes
- `squeue`: lists the jobs in the queue
- `sinfo -lNe`: more detailed info reporting with long format and nodes listed individually
- `scancel 22`: cancel job 22
- `scontrol show job 2`: show control info on job 2

**Examples:**

```bash
# find the queue names:
[user@computer ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
basic* up infinite 1 idle

# test a job submission (don't run)
sbatch --test-only slurm_test.sh

# run a job
sbatch slurm_test.sh
```
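
When scripting around `sbatch`, it often helps to capture the job ID so it can be passed to `squeue`, `scontrol show job`, or `scancel`. A minimal sketch, assuming `sbatch` prints its usual `Submitted batch job <id>` message; the helper name `job_id_from_sbatch` is made up for illustration:

```bash
# Hypothetical helper: read sbatch's message on stdin and print the job ID.
# Assumes the message has the form "Submitted batch job <id>".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ {print $4}'
}

# Try it with a canned message (no Slurm needed):
echo "Submitted batch job 22" | job_id_from_sbatch
```

On a cluster this could be used as `jobid=$(sbatch slurm_test.sh | job_id_from_sbatch)`, followed by `scancel "$jobid"` if the job needs to be cancelled. `sbatch --parsable` is an alternative that prints the job ID (plus the cluster name, if any) without the surrounding text.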

**Example Slurm job file:**

```bash
#!/bin/bash
## SLURM REQUIRED SETTINGS
#SBATCH --partition=basic
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

# Output some basic info with job
pwd; hostname; date;

# requires ED2_HOME env var to be set; stop if the directory is missing
cd "$ED2_HOME/run" || exit 1

# Job to run
./ed2
```
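
To see which file names the `%x-%j` patterns above produce, the substitution can be imitated in plain shell. This is only an illustration: Slurm does the expansion itself, `expand_slurm_pattern` is a hypothetical helper, and the sketch assumes the job name contains no `/` or `&` characters.

```bash
# Hypothetical helper that mimics Slurm's %x (job name) and %j (job ID)
# substitution. $1 = filename pattern, $2 = job name, $3 = job ID.
expand_slurm_pattern() {
    printf '%s\n' "$1" | sed -e "s/%x/$2/g" -e "s/%j/$3/g"
}

# A job named hello-python with ID 22:
expand_slurm_pattern '%x-%j.out' hello-python 22   # prints hello-python-22.out
expand_slurm_pattern '%x-%j.err' hello-python 22   # prints hello-python-22.err
```

If no `--job-name` is given, as in the job file above, Slurm uses the name of the batch script as the job name, so expect the script name in place of `%x`.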

**Another Example:**

```bash
#!/bin/bash
#
#SBATCH -p basic # partition name (aka queue)
#SBATCH -c 1 # number of cores
#SBATCH --mem 100 # memory pool for all cores (in MB unless a unit suffix is given)
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR

# code or script to run
for i in {1..100000}; do
    echo $RANDOM >> SomeRandomNumbers.txt
done

sort SomeRandomNumbers.txt
```
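
The `-t 0-2:00` line above uses the `D-HH:MM` form, here 0 days and 2 hours. To check what such a value adds up to, the small sketch below converts it to minutes. `dhm_to_minutes` is a hypothetical helper, not a Slurm command, and it handles only the `D-HH:MM` form.

```bash
# Hypothetical helper: convert a D-HH:MM time limit to total minutes.
# awk splits on "-" and ":" and treats values like "08" as decimal.
dhm_to_minutes() {
    printf '%s\n' "$1" | awk -F'[-:]' '{print $1 * 1440 + $2 * 60 + $3}'
}

dhm_to_minutes 0-2:00    # prints 120
dhm_to_minutes 1-00:30   # prints 1470
```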

**Python Example:**

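The output goes to a file called `hello-python-*.out` (the `*` is the job ID) in the directory the job was submitted from, which is your home directory if you submit from there. The file should contain a message from Python.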

```bash
#!/bin/bash

## SLURM REQUIRED SETTINGS
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

#SBATCH --job-name=hello-python # create a short name for your job
#SBATCH --time=00:01:00 # total run time limit (HH:MM:SS)

## Example use of Conda:

# first source bashrc (with conda.sh), then conda can be used
source ~/.bashrc

# make sure conda base is activated
conda activate

# Other conda commands go here

## run python
python hello.py
```

`hello.py` should be something like this:

```python
print('Hello from Python!')
```

**Computer Facts:**

Use these commands to find the core count and memory of the computer, so that the resource requests in the job file match what is available.

```bash
# number of cores?
grep 'cpu cores' /proc/cpuinfo | uniq

# memory
[emery@bellows ~]$ free -h