
Using Rsync

Basic Rsync Usage

rsync is a powerful and widely used tool for copying and synchronizing data. The manual page has many useful examples (type "man rsync" at the command line). Here are a few examples to get you started:

# the command format is 
#rsync <switch options>  <data source>  <data destination> 
#
# So the following copies from a local folder to a destination folder on a remote host named bellows.eri.ucsb.edu:
rsync -avr /data/ username@bellows.eri.ucsb.edu:/some/other/folder/

# the -avr switches are:
# 'a' for archive mode (when in doubt use this); it already implies 'r'
# 'v' for verbose (rsync will tell you what is going on)
# 'r' for recursive, recurse into directories (redundant with 'a', but harmless)

One trick to learn with rsync is the difference between leaving the trailing slash on or off.

# this command copies contents of /data/ to the destination directory /some/other/folder/
rsync -avr /data/ username@bellows.eri.ucsb.edu:/some/other/folder/

# ... while this command creates a folder 'data' on the destination and copies all of its contents:
rsync -avr /data username@bellows.eri.ucsb.edu:/some/other/folder/
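
To see the trailing-slash difference without touching a remote host, the sketch below runs both forms against a throwaway local directory. The temporary paths and the file name file.txt are made up for illustration; only the rsync behaviour is the point.

```shell
# Build a small throwaway source tree (all paths here are temporary and hypothetical).
work=$(mktemp -d)
mkdir -p "$work/data" "$work/dest_contents" "$work/dest_folder"
echo "hello" > "$work/data/file.txt"

# Trailing slash: the contents of data/ land directly in the destination.
rsync -a "$work/data/" "$work/dest_contents/"

# No trailing slash: a 'data' directory is created inside the destination.
rsync -a "$work/data" "$work/dest_folder/"

# List what ended up where.
find "$work/dest_contents" "$work/dest_folder" -type f
```

The first destination should hold file.txt at its top level, while the second should hold it under a data subdirectory.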

When in doubt, test with --dry-run, and rsync will tell you what would have happened:

rsync -avr --dry-run /data username@bellows.eri.ucsb.edu:/some/other/folder/

If you are rsyncing from a source with different permissions you can use the following to update permissions while copying:

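# Rsync files while preserving timestamps and applying new ownership and permissions
# (--chown requires rsync 3.1.0 or newer)
rsync -av --chown="$USER:$GROUP" --chmod=Du=rwx,Dg=rwx,Do=rx,Fu=rw,Fg=rw,Fo=r "$SRC_DIR" "$DEST_DIR"

Be sure to replace the user, group, source, and destination with your own user / group and source / destination directories. Note that $GROUP, $SRC_DIR, and $DEST_DIR are placeholders that your shell does not normally define, so set them before running the command.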
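
To check what the --chmod rules do before pointing them at real data, a minimal local sketch follows. It uses throwaway directories (the names src, dst, sub, and file.txt are hypothetical) and leaves out --chown, since changing ownership to another user generally needs elevated privileges.

```shell
# Throwaway source tree with restrictive permissions (paths are hypothetical).
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/sub"
echo "hello" > "$src/sub/file.txt"
chmod 700 "$src/sub"
chmod 600 "$src/sub/file.txt"

# Same --chmod rules as above: directories become rwxrwxr-x, files rw-rw-r--.
rsync -a --chmod=Du=rwx,Dg=rwx,Do=rx,Fu=rw,Fg=rw,Fo=r "$src/" "$dst/"

# Compare source and destination permissions.
ls -l "$src/sub" "$dst/sub"
```

The source keeps its original modes; only the copies on the destination side are changed.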

Trick for Long-Running Jobs

Wrap the rsync command in a bash loop:

until rsync ...; do sleep 1; done

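This reruns the rsync command whenever it exits with an error, such as a lost network connection, and keeps trying until the rsync job completes successfully. The sleep keeps the loop from spinning and consuming CPU while the connection is down. Note that the loop also retries on errors that will never go away (a mistyped path, for example), so check the first run's output before leaving it unattended.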
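
If an endless loop is a concern, one possible variation is to cap the number of attempts. The sketch below is only an illustration: retry_cmd, its argument order, and the attempt and delay values are hypothetical and not part of rsync.

```shell
# Retry a command until it succeeds, giving up after max_attempts tries.
# retry_cmd is a hypothetical helper, not an rsync feature.
retry_cmd() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example call (commented out so nothing is transferred by accident).
# --partial keeps partially transferred files so a retry can pick them up.
# retry_cmd 50 5 rsync -av --partial /data/ username@bellows.eri.ucsb.edu:/some/other/folder/
```

The function returns 0 as soon as the wrapped command succeeds and 1 once the attempt limit is reached, so it can be used inside larger scripts.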

Accessing Servers via the Command Line

Access the HPC systems from a command line using secure shell (ssh) with your username and password. Rsync can take advantage of this, since it runs over ssh.

# connect to the bastion host
ssh username@ssh.grit.ucsb.edu

# or go straight to your HPC machine, e.g. bellows.eri.ucsb.edu
ssh username@bellows.eri.ucsb.edu

You'll get tired of typing your password, so use the more secure method of generating keys:

# generate the private and public key pair (leave the passphrase fields blank)
ssh-keygen -t rsa -b 4096

# copy the public key to the server
# Only the public key is copied to the server. The private key should never be copied to another machine.
ssh-copy-id -i ~/.ssh/id_rsa.pub username@ssh.eri.ucsb.edu

# ... Once that is done you should be able to log in without typing your password.
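
To see what key generation produces without touching your real ~/.ssh folder, the sketch below writes a key pair into a throwaway directory. The file name demo_key is hypothetical, and -N "" is the non-interactive equivalent of leaving the passphrase blank.

```shell
# Generate a key pair non-interactively into a throwaway directory.
# demo_key is a hypothetical file name; -N "" sets an empty passphrase.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -b 4096 -N "" -f "$keydir/demo_key"

# Two files result: the private key and the public key (.pub).
ls "$keydir"

# Print the fingerprint of the public key, the half that ssh-copy-id sends to the server.
ssh-keygen -l -f "$keydir/demo_key.pub"
```

Only the .pub file is meant to leave your machine; the file without the extension is the private key.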

Advanced Config Settings


A more advanced option is to create a file named config in your /home/username/.ssh folder with entries like the following (where "username" is your username):

# Example Config File entries 
Host eri-hpc
Hostname ssh.grit.ucsb.edu
IdentityFile /home/username/.ssh/id_rsa
User username

The above would allow you to ssh with the command:

ssh eri-hpc
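
One way to check such an entry before relying on it is to ask ssh to print the settings it would use. The sketch below assumes an OpenSSH client recent enough to support the -G option (6.8 or newer) and writes the example entries to a throwaway file rather than your real config; the variable names cfg and resolved are hypothetical.

```shell
# Write the example entries to a throwaway file instead of ~/.ssh/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host eri-hpc
    Hostname ssh.grit.ucsb.edu
    IdentityFile /home/username/.ssh/id_rsa
    User username
EOF

# -G prints the settings ssh would use for this alias, without connecting.
# -F points ssh at the throwaway file.
resolved=$(ssh -G -F "$cfg" eri-hpc)
echo "$resolved" | grep -i -E '^(hostname|user|identityfile) '
```

Once the entry lives in your real config file, rsync can use the same alias because it runs over ssh, for example: rsync -av /data/ eri-hpc:/some/other/folder/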