Slurm Usage
**Quick Introduction**
A queue in Slurm is called a partition. User commands are prefixed with `s`.
**Useful Commands**
- `sacct`, `sbatch`, `sinfo`, `sprio`, `squeue`, `srun`, `sshare`, `sstat`, etc.
- `sbatch`: sends jobs to the Slurm queue
- `sinfo`: general info about Slurm
- `squeue`: inspect the queue
- `sinfo -lNe`: more detailed info reporting, with long format and nodes listed individually
- `scancel 22`: cancel job 22
- `scontrol show job 2`: show control info on job 2
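For example, to list only your own jobs and then cancel one by its ID (the job ID `1234` below is just a placeholder), something like this should work:

```bash
# list only your own jobs
squeue -u $USER
# cancel a specific job by ID (placeholder ID)
scancel 1234
```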
**Examples:**
```bash
# find the queue names:
[user@computer ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
basic* up infinite 1 idle
# test a job submission (don't run)
sbatch --test-only slurm_test.sh
# run a job
sbatch slurm_test.sh
```
**Example Slurm job file:**
```bash
#!/bin/bash
## SLURM REQUIRED SETTINGS
#SBATCH --partition=basic
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
# Output some basic info with job
pwd; hostname; date;
# requires ED2_HOME env var to be set
cd $ED2_HOME/run
# Job to run
./ed2
```
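Assuming the job file above is saved as, say, `ed2_job.sh` (a hypothetical name), submitting and checking it might look like:

```bash
# submit the job file to Slurm
sbatch ed2_job.sh
# watch it in the queue
squeue -u $USER
```

With `--output=%x-%j.out`, the job's stdout lands in a file named after the job name and job ID.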
**Another Example:**
```bash
#!/bin/bash
#
#SBATCH -p basic # partition name (aka queue)
#SBATCH -c 1 # number of cores
#SBATCH --mem 100 # memory pool for all cores
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
# code or script to run
for i in {1..100000}; do
echo $RANDOM >> SomeRandomNumbers.txt
done
sort SomeRandomNumbers.txt
```
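Once submitted, accounting info for the job can be checked with `sacct` (the job ID `1234` is a placeholder):

```bash
sacct -j 1234 --format=JobID,JobName,State,Elapsed
```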
**Python Example:**
The output goes to a file in your home directory called `hello-python-*.out`, which should contain a message from Python.
```bash
#!/bin/bash
## SLURM REQUIRED SETTINGS
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
## SLURM reads %x as the job name and %j as the job ID
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --job-name=hello-python # create a short name for your job
#SBATCH --time=00:01:00 # total run time limit (HH:MM:SS)
## Example use of Conda:
# first source bashrc (with conda.sh), then conda can be used
source ~/.bashrc
# make sure conda base is activated
conda activate
# Other conda commands go here
## run python
python hello.py
```
`hello.py` should be something like this:
```python
print('Hello from Python!')
```
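Assuming the batch script above is saved as `hello_python.sh` (a hypothetical name) in the same directory as `hello.py`, it could be submitted and checked like this:

```bash
sbatch hello_python.sh
# after the job finishes, view the output
cat hello-python-*.out
```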
**Computer Facts:**
Find out facts about the computer to help fill in the job file:
```bash
# number of cores?
grep 'cpu cores' /proc/cpuinfo | uniq
# memory
[emery@bellows ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:          1.5Ti       780Gi       721Gi       1.5Gi       8.6Gi       721Gi
Swap:          31Gi          0B        31Gi
```
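As a quick alternative (a sketch; the exact output depends on the machine), `nproc` and `lscpu` also report CPU counts:

```bash
# number of logical CPUs available to the current process
nproc
# fuller CPU summary (sockets, cores per socket, threads per core)
lscpu
```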
**nodes vs tasks vs cpus vs cores**
Here's a very good writeup: https://researchcomputing.princeton.edu/support/knowledge-base/scaling-analysis. For most of our use cases, one node and one task is all that is needed. More than this requires special code such as mpi4py (MPI = Message Passing Interface); a minimal sketch follows at the end of this section.

```bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=N
```

This is the correct way to request N cores for a job. Just replace N in that config with the number of cores you need.
To get the max value for N for a computer:

```bash
scontrol show node | grep CPU
```

This produces a `CPUTot` value, the total number of CPUs on each node.
Quoting directly from: https://login.scg.stanford.edu/faqs/cores/
Also useful: https://stackoverflow.com/questions/65603381/slurm-nodes-tasks-cores-and-cpus
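For jobs that genuinely need more than one task, here is a minimal MPI sketch, assuming mpi4py is installed in the active environment (the script name `mpi_hello.py` is hypothetical):

```bash
#!/bin/bash
#SBATCH --partition=basic
#SBATCH --nodes=1
#SBATCH --ntasks=4           # four MPI ranks
#SBATCH --cpus-per-task=1
#SBATCH --output=%x-%j.out

# srun launches one copy of the program per task
srun python mpi_hello.py
```

where `mpi_hello.py` (hypothetical) could be:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"Hello from rank {comm.Get_rank()} of {comm.Get_size()}")
```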
**See Also**
https://www.carc.usc.edu/user-information/user-guides/hpc-basics/slurm-templates
https://docs.rc.fas.harvard.edu/kb/convenient-slurm-commands/
https://csc.cnsi.ucsb.edu/docs/slurm-job-scheduler
Python: https://rcpedia.stanford.edu/topicGuides/jobArrayPythonExample.html