Having your way with rsync
The rsync command can replicate collections of files from one place to another in every possible detail or it can allow you to control exactly how that replication flows -- what it replicates and what it doesn't.
In its simplest form, the rsync command will copy files from the file source to the file destination. It will not remove files on the destination side that aren't on the source and it won't recreate all of the metadata (e.g., ownership and group details) unless your rsync command includes just the right set of options. So let's follow along behind a series of rsync commands to see just what they do and don't do in response to our synchronization requests.
First, if you specify the name of a directory as a source, rsync is going to create (or update) a directory by that name on the destination, as you can see from this example. Here, we're working with two folders on the same system. When we start, both the orig and copy directories exist, but they have only some files in common.
When we run the rsync command with the -v (verbose) option, we can see that the command is copying all of the files from one directory to the other.
Because these files are so small, almost no data was transferred and rsync reports no meaningful speedup. Look at the destination directory afterwards and we can see that it now contains all of the files in the orig folder. It also retains the one file that was on the destination before the rsync operation (and doesn't exist on the source).
In the next example, we are working between two systems and replicating a directory called "archive". The first operation copies everything, creating the archive directory on the remote system. We use just ~unixdweeb as the destination on the remote system (the home directory of the user called "unixdweeb").
The next time we run the rsync command, there is nothing to copy. Nothing has changed in the local archive directory, so nothing is changed on the remote copy. Still, we see some bytes going back and forth because the rsync processes need to compare notes and determine whether any files or file content need to move.
When a new file shows up on the remote archive, we have to adjust our rsync command if we want our two archive directories to remain exactly the same. Our next rsync operation shows that's just what happens. By adding the --delete option, we tell rsync to delete any files on the destination system that don't exist on the source system.
Notice the "deleting archive/newfile" message that appears in our verbose output. If you want your rsync operation to use the local system as the content authority and make sure that the remote copy looks exactly the same as the original one, --delete is the option to use.
Next, we see the bytes transferred statistic and a very modest speedup.
And, as you'd expect, the new file that somehow snuck into the remote system's archive directory is now gone.
In addition, were we to examine the files on both systems, the permissions and ownership would match because we are using the -a (archive) option, which preserves them (preserving ownership generally requires superuser privileges on the receiving side).
If, instead, we want to maintain whichever files in the two archive directories are the newest, there are options for that as well.
The --existing option tells rsync to only update files that already exist on the remote system and not to create new files. Notice in the example below that newfile2 is not replicated.
We might also want to tell rsync not to touch files that are newer on the destination side, which is what the --update (-u) option does. To demonstrate how this works, we first create a new file called "newfile" on the destination server.
Then, on the local system, we run rsync again and notice that newfile on the destination server is not overwritten by the file of the same name on the source system.
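One way to reproduce this is with rsync's --update (-u) option, which skips any file whose copy on the receiving side is newer. In the sketch below, remote_home is a local stand-in for the destination server, and the source copy of newfile is backdated with touch so that the destination copy is clearly the newer one.

```shell
# Sketch only: "remote_home" stands in for the destination server.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p archive remote_home/archive
echo "old source copy" > archive/newfile
touch -t 202001010000 archive/newfile     # make the source copy older

echo "newer destination copy" > remote_home/archive/newfile

# --update (-u) skips files whose destination copy is newer.
rsync -av --update archive remote_home/

cat remote_home/archive/newfile
```

Without --update, the same command would replace the destination file, since its size and timestamp differ from the source.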
To ensure that we don't create new files and don't overwrite files that are newer on the destination side, we can combine these two options. Notice in the output shown below that no changes were made.
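A sketch of the combined form, assuming the two options in question are --existing and --update, with remote_home once more standing in for the remote system:

```shell
# Sketch only: "remote_home" stands in for the remote system.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p archive remote_home/archive
echo "old source copy" > archive/newfile
touch -t 202001010000 archive/newfile     # source copy is older
echo "newer destination copy" > remote_home/archive/newfile
echo "source only" > archive/newfile2     # missing on the destination

# Neither create missing files nor overwrite newer ones.
rsync -av --existing --update archive remote_home/

ls remote_home/archive
```

Here newfile2 is not created and newfile is not overwritten, so the destination is left exactly as it was.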
We can also use the --exclude option to leave out portions of a directory that we don't want copied from one system (or file system location) to another. An example of --exclude is shown in the command below.
We can exclude multiple directories with just a little more effort. The command below excludes both the junk directory and one called "notes". The patterns are matched against file paths within the transfer rather than absolute paths on the file system.
Because we included the /* after each of the directories to be excluded, the directories themselves are replicated, but not their content.
Whenever you're struggling to get the syntax right on an rsync command that is at all complex, remember that you can try out the command without actually making any changes by using the --dry-run option along with -v (verbose). This will show you what rsync would be doing if you ran the command for real, but won't actually replicate anything.
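As a sketch of a dry run, the command below previews a --delete sync without performing it. The remote_home directory is a local stand-in for the remote system and the file names are made up.

```shell
# Sketch only: "remote_home" stands in for the remote system and the
# file names are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p archive remote_home/archive
echo "report" > archive/report.txt
echo "stray" > remote_home/archive/newfile

# Preview only: list what would be copied and deleted, change nothing.
rsync -av --dry-run --delete archive remote_home/

ls remote_home/archive
```

The listing afterwards still shows only newfile, confirming that the preview neither copied report.txt nor deleted anything.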