
Encrypting and Backing up Critical Data in Ubuntu

Portable computers such as laptops and netbooks have become very popular in recent years. Increasingly laptops have become people's primary computers. The convenience and portability of laptops also make them vulnerable to being lost or stolen. Valuable, personal information is often stored on these devices. Ensuring that this information is properly protected and recoverable even in the case of loss or theft is thus very important.

This article provides a step-by-step guide on how to use ecryptfs to encrypt your data against unauthorized access, and rsync to back up your data so that it remains recoverable even if the laptop is lost or damaged.

ecryptfs Overview

ecryptfs is a "POSIX-compliant enterprise cryptographic stacked filesystem for Linux". (See the ecryptfs about page for more details on its history.) It was chosen by Canonical as the default encryption technology for home directories in Ubuntu, and by Google for Chrome OS, and has become widely used in the Linux community.

ecryptfs encrypts at the file system level rather than the block level, so there is no need to predetermine the size of the encrypted partition. It allows different sections of the drive to be encrypted with different passphrases and protected independently of each other. It also makes it easy to make incremental, encrypted backups, as we shall see later. One downside is that ecryptfs does not perform as well as some competing technologies. (See the LinuxUser article and the ecryptfs FAQ for a comparison with various other encryption technologies.)

ecryptfs offers considerable flexibility in determining which directories should be encrypted. This article focuses on encrypting the entire home directory. Other options, such as encrypting a Private directory, are also possible.

Encrypting the home directory in Ubuntu

Ubuntu's installer makes it very easy to set up a user whose home directory is encrypted. All you need to do is check the Encrypt Home Directory check box. A passphrase for encrypting the home directory is chosen automatically, and once it is set up, the home directory is mounted automatically on login, making the decrypted information available to the user transparently.

If you forgot to set up encryption during installation, you can use the ecryptfs-migrate-home command to set it up afterwards. See How to Encrypt Your Home Folder After Installing Ubuntu for more details. You can also add additional users with an encrypted home directory using the following command:

sudo adduser --encrypt-home [user_name]

The first time you log on as the user with an encrypted home directory, Ubuntu will prompt you to Record your encryption passphrase. This passphrase is what you need to recover the data if anything goes wrong. It is also needed for mounting backups, as we'll describe later. Click on Run this action now, supply the password of the account when asked for the passphrase, and write down the encryption passphrase in a safe place for future use.

The home directory is mounted automatically when the user logs on. While the home directory is mounted, its contents are available in decrypted form on the machine, protected from other users only by the Unix file permissions system. This is probably not a big deal, but if your data is especially sensitive you might want to consider logging out when you're not actively accessing the data.

One other note: if you want to be able to ssh into your account without staying logged on, you might need to do some additional setup. See SSH key authentication with encrypted home directories for more details.

Filesystem Layout with ecryptfs

This section contains some more details that could be helpful in understanding how ecryptfs works, as well as how an encrypted backup can be done. If you are interested only in practical advice, feel free to skip ahead.

For users whose home directories were encrypted using ecryptfs, the actual data in the home directory is not stored in the normal location (/home/[username]). Instead, it is stored in encrypted form under the /home/.ecryptfs/[username]/.Private directory. /home/[username] is just a mount point for the decrypted data.

There is an additional directory, /home/.ecryptfs/[username]/.ecryptfs, that is used to store a few metadata items about the encryption, including the mount point location, the wrapped passphrase (wrapped using the user's password), and the mount signatures. For more details, see the man page for ecryptfs-setup-private.

The important thing to note is that the ecryptfs passphrase is the key to accessing the encrypted directory. It must be available when needed during recovery, and it must be kept away from unauthorized parties. Because the wrapped passphrase is protected only by the user's password, a weak password makes the wrapped passphrase file a weak point in the security chain. So treat it with care.

When the ecryptfs directory is mounted, the .Private and .ecryptfs directories are available as symbolic links of the same names in the home directory.
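The layout described in this section can be summarized in a few lines of Python. This is only an illustrative sketch: the function names ecryptfs_layout and has_encrypted_home and the home_root parameter are made up for this example, and the paths assume the default layout created by Ubuntu's installer or adduser.

```python
import os


def ecryptfs_layout(username, home_root="/home"):
    """Return the default ecryptfs locations for a user's encrypted home.

    Hypothetical helper for illustration; it only builds paths and does
    not touch the file system.
    """
    base = os.path.join(home_root, ".ecryptfs", username)
    return {
        # Where the decrypted view is mounted at login.
        "mount_point": os.path.join(home_root, username),
        # Where the encrypted files actually live on disk.
        "encrypted_data": os.path.join(base, ".Private"),
        # Wrapped passphrase, mount point location and signatures.
        "metadata": os.path.join(base, ".ecryptfs"),
    }


def has_encrypted_home(username, home_root="/home"):
    """Guess whether a user has an ecryptfs home by checking for .Private."""
    return os.path.isdir(ecryptfs_layout(username, home_root)["encrypted_data"])
```

Calling ecryptfs_layout with a user name returns the three locations discussed above, which is a convenient way to see which directory a backup needs to copy (the encrypted_data entry).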

Encrypted Backup of the Home Directory

Encrypted data is more sensitive to data corruption than normal files. A single bit flip could cause the entire file to be unreadable. Maintaining proper backups of the data is even more important than normal.

With private home directories protected by ecryptfs, the easiest thing to do is to back up the data in /home/.ecryptfs/[username]/.Private. Accessing that data does not involve mounting and decrypting it first. The data remains fully encrypted throughout the entire process and is thus appropriate for storage even on external, portable drives, or on a remote server.

A description of a good backup strategy for ecryptfs can be found on Dustin's blog. In this section I provide a variation that produces an Apple Time Machine style backup that is fully encrypted.

rsync is an extremely versatile backup and file syncing tool on Linux. Time Machine for every Unix out there describes how it can be used to mimic the behavior of Time Machine. The basic strategy relies on the --link-dest flag, which makes rsync create a hard link to the copy in the previous backup whenever a file has not changed (the -P flag used alongside it merely keeps partially transferred files and shows progress). We can then create a new directory for each backup instance and still keep the disk usage for the backups manageable.

I wrote a small Python script to automate the process. Note that the ecryptfs passphrase is not necessary for the backup at all. The encrypted files are simply copied as black boxes.
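The script itself is not reproduced here. The following is a minimal sketch of how such a script might be structured, not the author's actual code. It assumes each backup goes into a timestamped directory under a per-user backup root, with a Current symbolic link pointing at the latest one (matching the Current path used when mounting later in this article). The function names, the timestamp format and the run parameter are all assumptions made for this example.

```python
import datetime
import os
import subprocess


def build_rsync_command(source, snapshot_dir, previous_snapshot=None):
    """Return the rsync argument list for one snapshot (nothing is executed)."""
    command = ["rsync", "-a"]
    if previous_snapshot is not None:
        # Files unchanged since the previous snapshot become hard links
        # into it instead of fresh copies.
        command.append("--link-dest=" + previous_snapshot)
    # No trailing slash on source, so rsync recreates the source
    # directory itself (e.g. .Private) inside snapshot_dir.
    command += [source, snapshot_dir]
    return command


def make_snapshot(source, backup_root, now=None, run=subprocess.check_call):
    """Create a new timestamped snapshot and repoint the Current link.

    `run` is injectable so the function can be exercised without rsync.
    """
    now = now or datetime.datetime.now()
    name = now.strftime("%Y-%m-%d-%H%M%S")
    snapshot_dir = os.path.join(backup_root, name)
    current = os.path.join(backup_root, "Current")

    # Only link against Current if a previous snapshot exists.
    previous = current if os.path.isdir(current) else None
    os.makedirs(snapshot_dir)
    run(build_rsync_command(source, snapshot_dir, previous))

    # Replace the Current link in one step via a temporary link.
    tmp_link = current + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(name, tmp_link)  # relative target survives remounting
    os.replace(tmp_link, current)
    return snapshot_dir
```

Under these assumptions, a call such as make_snapshot("/home/.ecryptfs/[username]/.Private", "/[backup_location]/[username]") would leave the newest encrypted copy reachable at /[backup_location]/[username]/Current/.Private. The Current link is updated only after rsync succeeds, so a failed run never becomes the base for the next backup.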

You can automate the backup process by running this script from a cron job. To set up an hourly run, log in as root using sudo su -, and edit the crontab using crontab -e. The following crontab row will give you the desired behavior:

0 * * * *	[path_to_]/backup_home.py

Mounting and Restoring Data from the Encrypted Backup

To view or restore data, you need to mount the backup directory. You can choose either the latest backup or any one of the historical backups. It is also a good idea to verify periodically that the backup works, before you need it.

Assuming you have the ecryptfs passphrase available, you should first issue

sudo ecryptfs-add-passphrase --fnek

This command will print out the ecryptfs signatures associated with the passphrase once it is done. By default, file name encryption is enabled, so in particular, you'll need to use the fnek signature later.

The next step is to mount the encrypted directory. To mount the latest backup for username, run

sudo mount -t ecryptfs /[backup_location]/[username]/Current/.Private/ [mount-point]

Assuming you set up the home directory using Ubuntu's installer or adduser tool, you can take the defaults in the interactive prompts that follow. Note that file name encryption is turned on by default, so you probably want to answer y to that question and supply the fnek signature printed in the previous step.

That's it. You can now cd to the directory and examine and restore the data as appropriate.

For more detailed instructions, see Rescue Your Data from EcryptFS.

Appendix

Formatting an external hard drive

External hard drives are an effective way to store data backups. Before an external hard drive can be used, it must be formatted. The article Linux: Partition and format external hard drive as ext3 filesystem walks through the process using fdisk.

If you prefer to use a GUID Partition Table for your drive, older versions of fdisk won't work. You'll need to use the GNU Parted tool instead. Note that currently mkpartfs is not completely mature, so it might be better to run mkfs outside of parted. An example follows (assuming you want the ext4 filesystem):

sudo parted /dev/[device-name]
(parted) mktable gpt
(parted) mkpart primary ext4 [start]GB [end]GB
(parted) quit
sudo mkfs.ext4 /dev/[device-name]1

Setting up fstab with UUID for the partition

The /dev path that a USB drive is assigned depends on what other USB devices are attached. As a result, it is generally better to use the UUID instead of the /dev path to identify each USB drive when mounting. To find out the UUID for a USB drive, run

ls -l /dev/disk/by-uuid

Assuming that you are running this in the same session as when you formatted the disk, the name of the symbolic link that points to /dev/[device-name][n] is the UUID you're looking for.
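The same lookup can be scripted. The sketch below resolves each symbolic link in /dev/disk/by-uuid and returns the name of the one that points at a given device. The function name find_uuid and its by_uuid_dir parameter are hypothetical, introduced only for this example.

```python
import os


def find_uuid(device, by_uuid_dir="/dev/disk/by-uuid"):
    """Return the UUID whose symlink resolves to `device`, or None.

    Hypothetical helper: it compares fully resolved paths, so relative
    links such as ../../sdb1 are handled.
    """
    target = os.path.realpath(device)
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        if os.path.realpath(link) == target:
            return name
    return None
```

For example, find_uuid("/dev/[device-name]1") would return the link name that ls -l shows pointing at that partition, or None if the partition has no entry.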

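Update the /etc/fstab file by inserting a line like the following, substituting the UUID you found above, the mount point that you want (which should be an empty directory in the file system), and the filesystem type (ext4 if you followed the formatting example above). The last field is 2 rather than 1 because 1 is reserved for the root filesystem.

UUID=[UUID]       [mount-point-path]    ext4    errors=remount-ro       0       2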

Time Machine Style Backup of the Entire System

The approach I take is adapted from the one described in Full System Backup with rsync. A known set of directories, such as /proc and /mnt, is excluded from the backup, since backing them up doesn't make much sense and in some cases could lead to recursion. To generate a Time Machine style backup, the --link-dest parameter is also specified. This results in a command like the one below.

rsync -aAXv --exclude=/dev/* --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* --exclude=/run/* --exclude=/mnt/* --exclude=/media/* --exclude=/lost+found --exclude=/home/*/.gvfs --exclude=/var/lib/pacman/sync/* --link-dest=[last_backup_directory] / [backup_directory]

In practice, I use a script similar to the one I used for backing up the home directory to create backup directory names with timestamps, create convenience soft links, and handle any error conditions.
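As a rough illustration of how a script might assemble that command, here is a minimal sketch. It is not the author's script: the names DEFAULT_EXCLUDES and build_system_backup_command are invented for this example, and the pacman path from the command above is left out on the assumption that it only matters on Arch Linux.

```python
# Directories that are virtual, temporary, or mount points for other
# drives (including the backup drive itself).
DEFAULT_EXCLUDES = [
    "/dev/*", "/proc/*", "/sys/*", "/tmp/*", "/run/*",
    "/mnt/*", "/media/*", "/lost+found", "/home/*/.gvfs",
]


def build_system_backup_command(backup_directory, last_backup_directory=None,
                                excludes=DEFAULT_EXCLUDES):
    """Return the rsync argument list for a full-system snapshot."""
    # -a archive mode, -A preserve ACLs, -X preserve extended attributes.
    command = ["rsync", "-aAX"]
    command += ["--exclude=" + pattern for pattern in excludes]
    if last_backup_directory is not None:
        # Hard-link unchanged files against the previous snapshot.
        command.append("--link-dest=" + last_backup_directory)
    command += ["/", backup_directory]
    return command
```

Passing the result to subprocess.check_call as a list, rather than as one shell string, means the * patterns reach rsync untouched and are never expanded by the shell.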

If a single backup is good ...

A Time Machine style backup is great for keeping historical snapshots, making it possible to go back to a previous state of the data. However, there are many ways a backup can go wrong: hard drive failures, theft, physical disasters, human error, software bugs or malware can all cause the backup to be inaccessible or lost.

I recommend having at least two backup copies of your home directory and any other critical data. It's also a good idea to store the backups in two separate locations to minimize the chances of both copies being destroyed at once. In my case, I store one backup hard drive at home, another at work, and rotate between them periodically. This way, in the worst case, you would not lose more than the rotation period's worth of data. The backup copies are fully encrypted so that even if the physical media is stolen or lost, the chances of others gaining unauthorized access to your data are minimized.

I also keep a full system backup on a third drive at home. This allows me to preserve any system configuration and makes it easier to recover in case I encounter a bad hardware or software issue that requires reimaging my machine.


Written by Mike Kwong