
HPC Basics

Introduction

Here are the basics for new users of the High Performance Computing (HPC) systems offered by GRIT, including the tools needed to access the systems efficiently and use the software they run.

The HPC systems are listed below with their operating system, RAM, and CPU count (the number to use for Slurm's 'cpus-per-task' setting):

hammer: Fedora 34, 125 GB RAM, 44 cpus

anvil: Centos 7, 500 GB RAM, 30 cpus

forge: Fedora 34, 500 GB RAM, 24 cpus

tong: Fedora 34, 251 GB RAM, 28 cpus

bellows: Centos 8, 1.5 TB RAM, 96 cpus

Hammer is the oldest; Bellows is the newest. (Append ".eri.ucsb.edu" to the name if the full domain name is needed, e.g. hammer.eri.ucsb.edu.) All should have R and Python installed.

Typically we set up some local scratch space for you to store your data.

 
#Info obtained from each system with: 
scontrol show node | grep CPU 

Access 

Access the HPC systems from a command line using secure shell (SSH) with your username and password:

# connect to the bastion host
ssh username@ssh.eri.ucsb.edu

# or go straight to your HPC machine, e.g. bellows.eri.ucsb.edu
ssh username@bellows.eri.ucsb.edu

You'll get tired of typing your password, so use the more secure method of generating keys:

# generate the private and public key pair (accept the default file; leave the passphrase blank)
ssh-keygen -t rsa -b 4096

# copy the public key to the server
# Only the public key is copied to the server. The private key should never be copied to another machine.
ssh-copy-id -i ~/.ssh/id_rsa.pub username@ssh.eri.ucsb.edu

# ... Once that is done you should be able to log in without typing your password. 

For more advanced usage, set up a config file at /home/username/.ssh/config with entries like the following (where "username" is your username):

# Example Config File entries 
Host eri-hpc
    Hostname ssh.eri.ucsb.edu
    IdentityFile /home/username/.ssh/id_rsa
    User username

The above would allow you to ssh with the command:

ssh eri-hpc
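If you routinely reach an internal HPC machine through the bastion host, a ProxyJump entry can chain the two hops automatically. A sketch (the alias "bellows-hpc" is an arbitrary example name):

```
Host bellows-hpc
    Hostname bellows.eri.ucsb.edu
    ProxyJump username@ssh.eri.ucsb.edu
    IdentityFile /home/username/.ssh/id_rsa
    User username
```

With that entry in place, `ssh bellows-hpc` connects through the bastion in a single step.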

Moving Data

 rsync is a very powerful and widely used tool for moving data. The manual page has many useful examples (from the command line type "man rsync"). Here are a couple of examples to get you started:

# the command format is:
# rsync [options] source destination
#
# So the following copies from a local folder to a destination folder on a remote host named bellows.eri.ucsb.edu:
rsync -avr /data/ username@bellows.eri.ucsb.edu:/some/other/folder/

# the -avr switches are: 
# 'a' for archive mode (when in doubt use this)
# 'v' for verbose (rsync will tell you what is going on)
# 'r' for recursive, recurse into directories (already implied by 'a')

One important rsync subtlety is whether or not the source path ends with a trailing slash.

# this command copies contents of /data/ to the destination directory /some/other/folder/
rsync -avr /data/ username@bellows.eri.ucsb.edu:/some/other/folder/

# ... while this command creates a folder 'data' inside the destination and copies the contents into it:
rsync -avr /data username@bellows.eri.ucsb.edu:/some/other/folder/

When in doubt, test with --dry-run, and rsync will tell you what would have happened:

rsync -avr --dry-run /data username@bellows.eri.ucsb.edu:/some/other/folder/
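The trailing-slash behavior is easy to try out safely before touching a remote host. This sketch uses throwaway directories under /tmp:

```shell
# set up a small source tree and two destination directories
mkdir -p /tmp/rsync-demo/data /tmp/rsync-demo/dest1 /tmp/rsync-demo/dest2
echo "hello" > /tmp/rsync-demo/data/file.txt

# trailing slash: the *contents* of data/ land directly in dest1/
rsync -av /tmp/rsync-demo/data/ /tmp/rsync-demo/dest1/

# no trailing slash: a 'data' directory is created inside dest2/
rsync -av /tmp/rsync-demo/data /tmp/rsync-demo/dest2/

ls /tmp/rsync-demo/dest1         # file.txt
ls /tmp/rsync-demo/dest2/data    # file.txt
```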


Running Code

To run your code and use the HPC machine in a fair and efficient way, you'll use a queuing system. See: https://bookstack.grit.ucsb.edu/books/hpc-usage/page/slurm-usage

A few key slurm commands are:

# submit a job (where slurm_test.sh is a shell script for invoking a program)
sbatch slurm_test.sh

# show the whole queue
squeue -a

# look at a job's details (replace JOBID with the job's numeric ID from squeue)
scontrol show job JOBID
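A minimal slurm_test.sh might look like the following; the job name and resource numbers are placeholders you should adjust to the limits of the machine you are on (see the CPU counts listed above):

```shell
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --cpus-per-task=4        # stay at or below the machine's CPU count
#SBATCH --mem=8G
#SBATCH --time=01:00:00
#SBATCH --output=slurm-%j.out    # %j expands to the job ID

# the script body is ordinary shell; Slurm sets variables like SLURM_CPUS_PER_TASK
echo "Running on $(hostname) with ${SLURM_CPUS_PER_TASK:-unset} CPUs"
```

Submit it with `sbatch slurm_test.sh` and watch it in the queue with `squeue -a`.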

screen can be used to disconnect from a remote terminal session while leaving your jobs running, for example during a very long rsync job. See: http://aperiodic.net/screen/quick_reference

# (From a terminal command line):
# create a new screen session
screen -S sessionname    # name the session (replace 'sessionname' with your own)

# from within a screen session
ctrl-a d   # detach from the session
exit       # quits and exits a screen session

# look for running screen sessions
screen -ls
screen -r sessionname    # attach to the named session

# other commands
screen -x      # attach to a running session
screen -dRR    # the "ultimate attach": attaches to a session, detaching any
               # other display first; creates a session if none exists; uses
               # the first one if multiple sessions exist

Other Notes

All the HPC systems are built on the ZFS file system. To see information, for example about how much space is available:

[username@hpcsystem ~]$ zfs list
NAME                   USED  AVAIL     REFER  MOUNTPOINT
sandbox1              21.3T  4.00T       96K  /mnt/sandbox1
sandbox2              21.3T  4.00T       96K  /mnt/sandbox2
sandbox3              21.3T  4.00T     21.1T  /mnt/sandbox3

... this system has 4 TB available for storage (the AVAIL column).

# Commands for retrieving the system specifications (OS, RAM, Cores):
cat /etc/redhat-release 
free -h

# get number of cpu's for slurm
scontrol show node | grep CPU

# old way (count the processor entries):
grep -c 'processor' /proc/cpuinfo
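A more direct modern alternative is nproc from GNU coreutils:

```shell
# print the number of processing units available to the current process
nproc
```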