CLI Usage

Getting on the cluster

SSH: ssh hpc.grit.ucsb.edu
You’ll land on a compute node inside a Slurm-backed interactive session (the login host forwards you automatically).
File transfers: scp, rsync, sftp, and similar tools still work as usual against hpc.grit.ucsb.edu.
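
As a minimal sketch of a typical transfer, the snippet below pushes a local directory to the cluster with rsync. The local path ./data/ and the remote path project/data/ are hypothetical placeholders. The small run wrapper is an illustrative convenience, not a cluster tool: it only prints the command unless RUN=1 is set, so you can check it before anything is copied.

```shell
CLUSTER=hpc.grit.ucsb.edu      # host named on this page
SRC=./data/                    # hypothetical local directory
DEST=project/data/             # hypothetical path, relative to your remote home

# Print the command by default; execute it only when RUN=1 is set.
run() {
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# -a keeps permissions and timestamps, -v is verbose, --partial keeps
# partially transferred files so an interrupted copy can resume.
run rsync -av --partial "$SRC" "$CLUSTER:$DEST"
```

Run it once as-is to inspect the command, then again with RUN=1 to perform the copy.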

Partitions you can use

grit_nodes (default) – general use; includes nodes hpc-01/02/03.
Other partitions exist but are restricted to specific groups.

Resource basics (what Slurm expects)

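CPUs: -c <cores> sets CPUs per task; -n <ntasks> sets the total number of tasks.
Memory (use one or the other, not both):
- --mem=<size> = memory per node, e.g. --mem=8G, or
- --mem-per-cpu=<size> = memory per allocated CPU, e.g. --mem-per-cpu=4G.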
Time: -t D-HH:MM:SS (set this realistically; backfill favors shorter jobs).
Partition: -p grit_nodes (default).
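
The -t value is easy to get wrong by a factor of 60, since 30:00 means 30 minutes while 30:00:00 means 30 hours. As a small sketch, the helper below converts a number of seconds into the unambiguous D-HH:MM:SS form. The name to_slurm_time is hypothetical; it is not a Slurm command.

```shell
# Convert a duration in seconds to Slurm's D-HH:MM:SS time format.
to_slurm_time() {
  local total=$1
  local d=$(( total / 86400 ))
  local h=$(( (total % 86400) / 3600 ))
  local m=$(( (total % 3600) / 60 ))
  local s=$(( total % 60 ))
  printf '%d-%02d:%02d:%02d\n' "$d" "$h" "$m" "$s"
}

to_slurm_time 5400    # 0-01:30:00  (90 minutes)
to_slurm_time 90000   # 1-01:00:00  (25 hours)
```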

On this cluster the default memory per CPU is 4 GB if you set neither --mem nor --mem-per-cpu.
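
To make the arithmetic concrete: with -c and --mem-per-cpu, the job's memory is the CPU count times the per-CPU amount. The sketch below uses a hypothetical helper, total_mem_gb, which falls back to the 4 GB per-CPU default stated above when no second argument is given. It assumes a single task; with several tasks (-n) the total grows with the task count as well.

```shell
# Total memory in GB for a single task: CPUs x GB per CPU.
# The second argument defaults to 4, this cluster's per-CPU default.
total_mem_gb() {
  local cpus=$1
  local per_cpu_gb=${2:-4}
  echo $(( cpus * per_cpu_gb ))
}

total_mem_gb 8 6   # 48  (-c 8 --mem-per-cpu=6G)
total_mem_gb 8     # 32  (-c 8 with the default)
```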

See what’s available / what you’re running

sinfo -p grit_nodes -Nel          # nodes, CPUs, memory, state
squeue -u $USER                   # your jobs
squeue --start                    # scheduler’s predicted start times

One-off noninteractive command

srun -p grit_nodes -c 2 --mem=8G -t 30:00 myprog --arg foo

Batch jobs (recommended for longer runs)

Create a script job.sh:

#!/bin/bash
#SBATCH -p grit_nodes
#SBATCH -c 8
#SBATCH --mem=64G
#SBATCH -t 12:00:00
#SBATCH -J myjob
#SBATCH -o slurm-%j.out

module load mytool   # if you use environment modules
python train.py --epochs 10

Submit + check:

sbatch job.sh
squeue -u $USER
tail -f slurm-<jobid>.out
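
If you script the submit-and-check steps, sbatch's --parsable option prints just the job ID (optionally followed by ";" and a cluster name) instead of the usual sentence. A minimal sketch, assuming the job.sh shown above; job_id_from_parsable is a hypothetical helper name:

```shell
# Strip the optional ";cluster" suffix that --parsable may append.
job_id_from_parsable() {
  printf '%s\n' "${1%%;*}"
}

# Submit only if sbatch exists and the submission succeeds.
if command -v sbatch >/dev/null 2>&1 && out=$(sbatch --parsable job.sh); then
  jobid=$(job_id_from_parsable "$out")
  echo "Submitted job $jobid"
  squeue -j "$jobid"
  echo "Follow the log with: tail -f slurm-${jobid}.out"
else
  echo "sbatch is unavailable or the submission failed" >&2
fi
```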

Job arrays (many similar runs)

sbatch --array=0-99 -p grit_nodes -c 2 --mem=8G -t 1:00:00 job.sh

Inside job.sh use $SLURM_ARRAY_TASK_ID to index your inputs.
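
One common way to do that indexing, sketched under the assumption that your inputs are listed one per line in a text file (inputs.txt is a hypothetical name): task ID N reads line N+1, because the array above starts at 0. The helper name input_for_task and the --input flag are likewise illustrative.

```shell
# Print the line of FILE that belongs to a 0-based task ID.
input_for_task() {
  local file=$1 task_id=$2
  sed -n "$(( task_id + 1 ))p" "$file"
}

# Inside job.sh. The :-0 default lets you try the script outside Slurm.
task_id=${SLURM_ARRAY_TASK_ID:-0}
if [ -f inputs.txt ]; then
  input=$(input_for_task inputs.txt "$task_id")
  echo "Task $task_id would process: $input"
  # python train.py --input "$input"   # hypothetical flag
else
  echo "inputs.txt not found" >&2
fi
```

If your array starts at 1 (for example --array=1-50), drop the + 1 so that task 1 reads line 1.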

Cancel / modify

scancel <jobid>                     # cancel one
scancel -u $USER                    # cancel all yours
scontrol update JobId=<jobid> TimeLimit=02:00:00   # shorten time limit (only admins can raise it)

Accounting & live stats

sacct -j <jobid> --format=JobID,JobName,State,Elapsed,Timelimit,MaxRSS,ReqMem,AllocCPUS
sstat -j <jobid>.batch --format=AveCPU,AveRSS,MaxRSS,MaxVMSize,MinCPU   # running jobs only
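
To judge whether --mem was sized sensibly, it helps to compare MaxRSS with ReqMem in the same unit. A minimal sketch, assuming MaxRSS is printed with a K, M, or G suffix (as sacct commonly does) and treating a bare number as bytes; rss_to_gib is a hypothetical helper name:

```shell
# Convert a MaxRSS value such as "8388608K" to GiB with two decimals.
rss_to_gib() {
  awk -v v="$1" 'BEGIN {
    unit = substr(v, length(v))   # last character: K, M, G, or a digit
    num  = v + 0                  # awk keeps the leading numeric part
    if      (unit == "K") gib = num / 1024 / 1024
    else if (unit == "M") gib = num / 1024
    else if (unit == "G") gib = num
    else                  gib = num / 1024 / 1024 / 1024   # assumed bytes
    printf "%.2f\n", gib
  }'
}

rss_to_gib 8388608K   # 8.00
rss_to_gib 512M       # 0.50

# On the cluster, feed it a real value, for example:
#   sacct -j <jobid>.batch --format=MaxRSS --noheader --parsable2
```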

Common “why is my job pending?” reasons

(Resources): not enough free CPUs or memory right now. Try shorter -t, fewer CPUs, or less --mem.
(BeginTime): the job's earliest allowed start time has not been reached yet, for example because it was submitted with --begin. Run squeue --start to see the ETA.
Constraints or node eligibility: very large per-node requests (CPUs or --mem) may only fit on the biggest nodes, which can lengthen wait time.
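
The reason code appears in squeue's reason column (%r in a custom format). As a rough sketch, the snippet below lists your pending jobs and prints a one-line hint for each. The pending_hint function is a hypothetical helper that simply restates the advice above; it is not part of Slurm.

```shell
# Map a squeue reason code to a short hint.
pending_hint() {
  case $1 in
    Resources) echo "Not enough free CPUs or memory: try a shorter -t, fewer CPUs, or less --mem." ;;
    BeginTime) echo "Waiting for its start time: check squeue --start for the ETA." ;;
    *)         echo "See the job reason codes in the squeue man page." ;;
  esac
}

# -t PENDING filters by state, -h drops the header,
# %i is the job ID and %r is the pending reason.
if command -v squeue >/dev/null 2>&1; then
  squeue -u "$USER" -t PENDING -h -o '%i %r' | while read -r id reason; do
    echo "$id ($reason): $(pending_hint "$reason")"
  done
fi
```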

Good citizenship / performance tips

Prefer multiple smaller tasks over one huge single-node grab when you can.
Keep single-node requests well under a node’s total RAM/cores unless you truly need them.
Set realistic time limits; the backfill scheduler starts shorter jobs sooner.

Examples you can paste

# 1) CPU-only batch job with array:
sbatch -p grit_nodes -c 2 --mem=8G -t 2:00:00 --array=1-50 run_sim.sh

# 2) Interactive shell, memory-per-CPU style (8 CPUs × 6 GB each = 48 GB total on one node):
srun -p grit_nodes -c 8 --mem-per-cpu=6G -t 1:00:00 --pty bash -l

# 3) Check predicted start times:
squeue --start

FAQ for this cluster

Do I need to salloc first? No. SSH gives you a Slurm-backed shell. Use srun for bigger interactive bursts, or sbatch for long runs.
VS Code / PyCharm remote? Not supported on the login host; use terminal + Slurm (srun/sbatch) instead.
Which partition do I use? grit_nodes unless you were explicitly added to a project-specific partition.
