Update 2024-04-28:

The resulting script from this blog post can be found on my git page: rsync_encrypted_backup

Backups have been somewhat of a pain for me for quite a while, as I could never find a suitable, easy to manage and easy to recover option for my private computer.

My goal was to create a simple off-site backup routine (i.e. “the cloud”), which would be easy to recover from, suitably fast (ideally with incremental/delta updates) and reasonably secure, i.e. with strong encryption by default, such as AES-256.

I tried several options like 7z with encryption and low (or no) compression rate, sending a whole ZIP archive to a remote storage or even updating existing archives. However, this of course turned out to be rather cumbersome, prone to write-errors / connection issues and extremely slow.

The next approach did work reasonably well, and is what I want to present here. I am sure there is still room for improvement, so if you have any suggestions, feel free to send me a DM in the Fediverse or an e-mail.

The well-known rsync tool is a natural candidate for incremental backups in the Linux world. It can sync directories to remote hosts via SSH or an rsync daemon, and to anything that can be mounted locally, such as SFTP or WebDAV shares. It can preserve ACLs, modes and ownership of files, is relatively fast and light on system resources, and can sync in both directions (i.e. it may also be used to restore your files). However, rsync does not encrypt the files it stores on the remote side; at most, the transfer itself is encrypted when it runs over SSH.

So in order to encrypt backups within rsync, they have to be encrypted before transmission, ideally in real-time and without impacting read-speed all that much.

File Encryption

A close to perfect solution for this task is gocryptfs, the spiritual successor of EncFS. It is an encrypted overlay file system that (crucially) supports “reverse mode”, is extremely fast and uses strong encryption methods.

In the context of backups, this means that we can mount the directory we want to back up (e.g. our home directory) in an encrypted form that is updated in real time, and sync the encrypted versions of all files rather than the original unencrypted versions. The aforementioned “reverse mode” is what makes this possible: it presents a pre-existing, unencrypted directory as an encrypted volume.

So first, let’s set up the encrypted file system. This has to be done only once and creates the configuration file for the volume, which holds its settings and the password-protected master key. Once this is done, we only need to mount the encrypted volume in the future:

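#!/usr/bin/env bash

# Initialise a reverse-mode volume for the home directory:
#   --init            initialise the volume
#   --reverse         use "reverse mode"
#   --plaintextnames  do not obfuscate names of files and directories
# The last argument is the target directory. Here: our home-folder
gocryptfs \
    --init \
    --reverse \
    --plaintextnames \
    "$HOME"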

This process will ask you to set an encryption passphrase as well as provide you with a master restore key. BACK THIS KEY UP SOMEWHERE SAFE AND IN SEVERAL PLACES, BOTH DIGITALLY AS WELL AS PHYSICALLY!

The metadata file will be stored in your unencrypted directory as .gocryptfs.reverse.conf and in the encrypted storage as gocryptfs.conf (unencrypted). Make sure to store this somewhere secure too, as it is required to decrypt the storage in case you need to restore your backups.

From now on, we may mount the directory in its encrypted form:

#!/usr/bin/env bash

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

# mount home-directory to newly created mountpoint
#   --ro       mount in read-only mode
#   --reverse  use "reverse mode"
# arguments: unencrypted directory to be backed up, temporary mount point
if gocryptfs --ro --reverse "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!" >&2
fi

In your encrypted directory, you will now find your entire home directory in encrypted form. The reason we used --plaintextnames before is that it makes the recovery process a lot easier if you can actually identify the files and folders by their names (their contents are of course still encrypted). If you do not need this feature because you would recover the entire directory rather than only parts of it, you may consider removing that parameter when creating the volume.

The --ro parameter sets read-only permissions for the encrypted mount, meaning that you cannot write new files to the encrypted volume. Importantly, writing to the unencrypted directory is still possible, and doing so also updates the encrypted directory in real time. The parameter protects our directory from technical or user mistakes, for example if we accidentally swap source and target in the rsync call.

If we want to recover a backup later on, we do of course need write permissions in the encrypted volume. This comes up again later in this blog post.

File Transmission

Next, we can finally back up our encrypted directory via rsync. Let’s first talk about the parameters that might be useful for backups. Personally, I want to exclude several directories from the backup, like the “Downloads”, “.cache” and similar folders. rsync supports wildcards here, so you can exclude every .git folder or specific file types (if they have the appropriate file extension). This is of course not possible (or a lot harder…) if you skipped --plaintextnames when creating the encrypted volume, as all file and directory names are obfuscated without it.
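If the list of exclusions grows, it can also live in a separate file that rsync reads via its --exclude-from option. The following is only a sketch: the temporary file stands in for wherever you would actually keep the list, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical location for the pattern list; any readable path works
EXCLUDE_FILE=$(mktemp)

# One rsync pattern per line; rsync skips blank lines and
# whole-line comments starting with '#' or ';'
cat > "$EXCLUDE_FILE" <<'EOF'
# caches and throw-away data
Downloads/*
.cache/*
*.git/*
EOF

# The single argument that would be added to the rsync call
RSYNC_EXCLUDE_ARG="--exclude-from=${EXCLUDE_FILE}"
echo "$RSYNC_EXCLUDE_ARG"
```

This keeps the rsync call short and lets you edit the patterns without touching the script.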

We want to keep file permissions and file owners, so the --archive parameter is handy here. Since we want to see what is happening during the procedure, the --verbose and --progress parameters are useful as well. Additionally, files that we have deleted from our system should also disappear from the backup the next time we sync. Ideally, they are removed only after the new files have been transferred, which is what --delete-delay does.

Because I back up multiple devices to the same network storage, it is a good idea to name the target folder after the hostname of the device. Furthermore, even though I do incremental backups, I want to keep several versions of my backup files. So I back up to different folders on the remote storage, based on time. To be more exact, I append the current month to the target directory’s name, such that I always have the past 12 versions of my backups (given that I run backups once every month): ${HOSTNAME}_$(date +%m)/.

However, if you want to keep fewer past versions, there is a little trick using the modulo of the current month. Say you want to keep only the past 3 versions: you can use $((10#$(date +%m) % 3)), which divides the number of the current month (e.g. 5 for May) by 3 and gives you the remainder, here 2. The 10# prefix makes bash read the zero-padded month as a decimal number; without it, 08 and 09 are rejected as invalid octal numbers. Over the course of a year, this calculation gives you 1 in January, 2 in February, 0 in March, 1 in April, 2 in May, 0 in June and so on. This in turn means that you’ll always keep the past 3 months as different versions of your backup. Adjust this value to your needs and the size of your remote storage.
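To see the rotation for a whole year at a glance, here is a minimal sketch. The helper name version_slot is made up for this example; it takes the zero-padded month as printed by date +%m and the number of versions to keep.

```shell
#!/usr/bin/env bash

# version_slot MONTH KEEP
# Hypothetical helper: maps a (possibly zero-padded) month number
# to a rotating slot in the range 0..KEEP-1.
version_slot() {
    local month="$1" keep="$2"
    # 10# forces base 10, so "08" and "09" are not parsed as octal
    echo $(( 10#$month % keep ))
}

# Print the slot for every month when keeping 3 versions
for m in 01 02 03 04 05 06 07 08 09 10 11 12; do
    printf 'month %s -> slot %s\n' "$m" "$(version_slot "$m" 3)"
done
```

In the backup script, the same calculation would be fed with the output of date +%m instead of a fixed list.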

The whole transfer procedure looks like this:

#!/usr/bin/env bash

# How many versions should be kept? (here: 3)
# 10# forces base 10, otherwise "08" and "09" are rejected as invalid octal
TARGET_VERSION=$((10#$(date +%m) % 3))

# define target backup storage (here: remote host reached via SSH)
TARGET="user@my.remote.backup:${HOSTNAME}_${TARGET_VERSION}/"

# define directories to be excluded
EXCL=(
    --exclude='Downloads/*'
    --exclude='.cache/*'
    --exclude='.var/*'
    --exclude='.local/share/Trash/*'
    --exclude='*.git/*'
    --exclude='.davfs2/*'
)

# back up directory
#   --archive       keep permissions, ownership and timestamps intact
#   --verbose       print more information during transmission
#   --progress      show progress of the transmission
#   --delete-delay  remove deleted files from target after sync
# The trailing slash on the source (encrypted mount volume) syncs its
# contents into the remote target, not the temporary directory itself.
rsync \
    --archive \
    --verbose \
    --progress \
    --delete-delay \
    "${EXCL[@]}" \
    "$ENCRYPTED_DIR/" \
    "$TARGET"

Final touches

For an easy, semi-automated backup routine, a few additional quality-of-life improvements come in handy, such as mounting the reverse file system before the backup and unmounting it afterward.
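The unmount step is easy to miss when the script stops early, for example when rsync is interrupted. One way to handle this, sketched here under the assumption that the mountpoint tool from util-linux is available, is a cleanup function registered with trap. The name cleanup is only illustrative.

```shell
#!/usr/bin/env bash

# Temporary mount point, created the same way as in the mount step
ENCRYPTED_DIR=$(mktemp -d)

# Unmount the directory if something is mounted there,
# then remove the (now empty) temporary directory.
cleanup() {
    if command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ENCRYPTED_DIR"; then
        fusermount -u "$ENCRYPTED_DIR"
    fi
    rmdir "$ENCRYPTED_DIR" 2>/dev/null || true
}

# Run cleanup whenever the script exits, whether it succeeded or not
trap cleanup EXIT
```

With this in place, an explicit fusermount call at the end of the script becomes optional, because the trap takes care of it on every exit path.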

Additionally, I like to send desktop notifications whenever I am using the script in a desktop environment. In order to detect this, I use the $DISPLAY environment variable for X11 desktops and the $WAYLAND_DISPLAY variable for Wayland environments. I typically use gdbus to send notifications, wrapped in a shell-function:

#!/usr/bin/env bash

# notification function
## Args:
## 1. Headline
## 2. Notification ID (0 for new)
## 3. icon-name
## 4. Notification text
## 5. actions array ("[]" for none)
## 6. timeout in ms
function send_notify {
    gdbus call --session \
        --dest=org.freedesktop.Notifications \
        --object-path=/org/freedesktop/Notifications \
        --method=org.freedesktop.Notifications.Notify \
        "$1" "$2" "$3" "$1" "$4" "$5" \
        '{"category": <"im.received">}' "$6"
}
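The check for a desktop session is needed in several places, so it can be wrapped in small helpers. This is a sketch only: has_desktop and report are hypothetical names, and the echo in the desktop branch is a placeholder for the actual notification call.

```shell
#!/usr/bin/env bash

# Succeeds if a graphical session appears to be available
has_desktop() {
    [[ -n "${WAYLAND_DISPLAY:-}" || -n "${DISPLAY:-}" ]]
}

# report MESSAGE
# Picks the output channel depending on the environment.
report() {
    local message="$1"
    if has_desktop; then
        # placeholder: call the notification function here instead
        echo "desktop: ${message}"
    else
        echo "[INFO] ${message}"
    fi
}

report "Starting backup procedure"
```

The benefit is that the two environment variables are checked in exactly one place.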

All in all the final script looks like this:

#!/usr/bin/env bash

# configs ------------------------------

# How many versions should be kept? (here: 3)
# 10# forces base 10, otherwise "08" and "09" are rejected as invalid octal
TARGET_VERSION=$((10#$(date +%m) % 3))

# define target backup storage (here: remote host reached via SSH)
TARGET="user@my.remote.backup:${HOSTNAME}_${TARGET_VERSION}/"

# define directories to be excluded
EXCL=(
    --exclude='Downloads/*'
    --exclude='.cache/*'
    --exclude='.var/*'
    --exclude='.local/share/Trash/*'
    --exclude='*.git/*'
    --exclude='.davfs2/*'
)

# functions ----------------------------

# notification function
## Args:
## 1. Headline
## 2. Notification ID (0 for new)
## 3. icon-name
## 4. Notification text
## 5. actions array ("[]" for none)
## 6. timeout in ms
function send_notify {
    gdbus call --session \
        --dest=org.freedesktop.Notifications \
        --object-path=/org/freedesktop/Notifications \
        --method=org.freedesktop.Notifications.Notify \
        "$1" "$2" "$3" "$1" "$4" "$5" \
        '{"category": <"im.received">}' "$6"
}

# mounting encrypted filesystem --------
echo "[INFO] Attempting to mount source as encrypted dir."

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

# mount home-directory to newly created mountpoint
#   --ro       mount in read-only mode
#   --reverse  use "reverse mode"
# arguments: unencrypted directory to be backed up, temporary mount point
if gocryptfs --ro --reverse "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!" >&2
    # abort: syncing an empty mount point with --delete-delay
    # would remove the files of the existing backup
    exit 1
fi

# Transfer data ------------------------

## send notification
if [[ -n $WAYLAND_DISPLAY ]] || [[ -n $DISPLAY ]]; then
    send_notify \
        "BackUpr" \
        0 \
        "document-send" \
        "Starting backup procedure to $TARGET" \
        "[]" \
        3000
else
    echo "[INFO] Starting backup procedure to $TARGET"
fi

# back up directory
#   --archive       keep permissions, ownership and timestamps intact
#   --verbose       print more information during transmission
#   --progress      show progress of the transmission
#   --delete-delay  remove deleted files from target after sync
# The trailing slash on the source (encrypted mount volume) syncs its
# contents into the remote target, not the temporary directory itself.
rsync \
    --archive \
    --verbose \
    --progress \
    --delete-delay \
    "${EXCL[@]}" \
    "$ENCRYPTED_DIR/" \
    "$TARGET"

## send notifications about status
if [[ $? == 0 ]]; then
    if [[ -n $WAYLAND_DISPLAY ]] || [[ -n $DISPLAY ]]; then
        send_notify "BackUpr" 0 "document-send" "Backup finished successfully." "[]" 0
    else
        echo "[INFO] Backup finished successfully"
    fi
else
    if [[ -n $WAYLAND_DISPLAY ]] || [[ -n $DISPLAY ]]; then
        send_notify "BackUpr" 0 "document-send" "Backup failed!" "[]" 0
    else
        echo "[ERROR] Backup failed!"
    fi
fi

## Unmount encrypted file-system
fusermount -u "$ENCRYPTED_DIR"

# EOF backup.sh

Restoring a backup

Restoring the backup is relatively easy as well. For simplicity, I’ll assume that the entire backup should be restored. Take a look at rsync’s options if that is not what you want. Of course, you can also recover only specific directories or files.

First off, we need the .gocryptfs.reverse.conf that the encryption tool created when we initialized the file system for the first time. That file contains meta information about the encrypted storage, but crucially not the decryption password. While the file system was mounted, the file was exposed in plain text inside the encrypted storage as gocryptfs.conf and was therefore transferred to the remote storage along with the backup.

In case you lost your entire local file system and want to restore it from the backup, you first need to fetch this configuration file:

#!/usr/bin/env bash

OLD_HOSTNAME="myoldpc"
TARGET_VERSION=1

scp user@my.remote.backup:${OLD_HOSTNAME}_${TARGET_VERSION}/gocryptfs.conf \
  ~/.gocryptfs.reverse.conf

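Once this is done, we can mount the encrypted file storage again, this time with write permissions, so that we can restore the files from the remote storage into the encrypted file system. One caveat: the gocryptfs man page describes reverse mode as a read-only view, so verify that your gocryptfs version actually accepts writes on this mount before relying on it: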

#!/usr/bin/env bash

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

# mount home-directory to newly created mountpoint
#   --rw       mount in read-write mode
#   --reverse  use "reverse mode"
#   --config   config file fetched in the previous step
# arguments: unencrypted directory to restore into, temporary mount point
if gocryptfs --rw --reverse --config "$HOME/.gocryptfs.reverse.conf" \
    "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!" >&2
fi

Now we can finally start to transfer the files. They will simultaneously show up as decrypted files in the home directory:

#!/usr/bin/env bash

OLD_HOSTNAME="myoldpc"
TARGET_VERSION=1

# transfer files
rsync \
    --archive \
    --verbose \
    --update \
    user@my.remote.backup:${OLD_HOSTNAME}_${TARGET_VERSION}/ \
    ${ENCRYPTED_DIR}/

# unmount the encrypted storage
fusermount -u "$ENCRYPTED_DIR"
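
An alternative restore path that does not write through the reverse mount is to download the encrypted backup and mount that copy in regular (forward) mode. The following is only a sketch under a few assumptions: restore_forward is a hypothetical helper and not part of the script above, there is enough local disk space for a full encrypted copy, and gocryptfs.conf sits at the top level of the backup, as in the scp step earlier.

```shell
#!/usr/bin/env bash

# restore_forward REMOTE DEST
# Hypothetical helper for illustration.
#   REMOTE  rsync source, e.g. user@my.remote.backup:myoldpc_1/
#   DEST    local directory that receives the decrypted files
restore_forward() {
    local remote="$1" dest="$2"
    local cipher_dir plain_dir status
    cipher_dir=$(mktemp -d)   # local copy of the encrypted backup
    plain_dir=$(mktemp -d)    # mount point for the decrypted view

    # 1. download the encrypted files, including gocryptfs.conf
    rsync --archive --verbose "$remote" "$cipher_dir/" || return 1

    # 2. mount the copy in regular (forward) mode, read-only;
    #    gocryptfs asks for the passphrase at this point
    gocryptfs --ro "$cipher_dir" "$plain_dir" || return 1

    # 3. copy the decrypted files to their destination, then unmount
    cp -a "$plain_dir/." "$dest/"
    status=$?
    fusermount -u "$plain_dir"
    return "$status"
}

# usage (commented out, as it needs a real backup to run against):
# restore_forward "user@my.remote.backup:myoldpc_1/" "$HOME"
```

The downside is the extra disk space for the encrypted copy; the upside is that the backup is only ever read, never written to.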