I've been working lately on setting up robust backups for my Ubuntu Linux computers, and I thought that some of the thought processes, procedures, and scripts I'm using might be helpful to others. This article describes the whole process. Some background knowledge of Linux, and of Ubuntu in particular, is assumed, and all scripts and advice are provided on a "use at your own risk" basis.
Many thanks to Zach Carter for ideas, scripting help, and general support in setting up the processes described here!
What are backups for?
Before figuring out how to do backups, you need to understand why you are making backups and what types of problems you hope to recover from, because the right backup solution (or solutions) depends on this. Here's my list:
- Recover data from an error you made
  - Example: You deleted or modified a file and want to recover its previous state.
- Recover from an operating system configuration problem
  - Example: You installed some new software or updates on your computer that aren't working, so you want to revert to the previous configuration.
- Recover from disk failure
  - Example: Your hard drive failed, and you want the new drive to be just like the old drive (data, software, etc.).
- Recover from other hardware failures
  - Example: The display on your laptop has died, and you want to buy a new laptop and have it be just like the old one (data, software, etc.).
- Recover from a remote hosting failure
  - Example: Your web hosting company goes out of business, and you need to get your web site live on another server quickly.
You also need to figure out how much down-time you are willing to have for each of these situations. For instance, if you are willing to spend a day installing and configuring the operating system and software any time there is a problem, you could just back up your essential data using a custom script that makes a compressed archive of this data every night. But if this is too much time to recover from configuration problems, you need a different solution.
And finally, you need to figure out how often your backups need to run, and how automated they need to be. The frequency depends on how much data you are willing to lose (for example, if you do backups every evening, that means you are willing to risk losing a file you worked on for most of the day before your disk crashed or you deleted it mistakenly). The level of automation will determine how much work it is for you to maintain backup integrity, but if you make things completely automated, you also need to verify from time to time that your backups are actually running.
Recovery from configuration problems: alt partition
The first type of problem I'd like to consider is how to recover from a configuration problem. By "configuration problem", I mean that you've installed new software, updated software, or changed the system configuration in some way, and your computer no longer does what you need it to do. This is especially serious if you can't easily reverse the change (e.g., it won't boot at all, etc.). If you keep up with the updates your computer prompts you to do, you'll be exposing yourself to this problem fairly frequently, so it's a good idea to have a quick way to recover from a problem like this.
The method I use is to maintain an up-to-date, bootable "alt" partition with a backup copy of the root partition (details below). Then if I have a configuration problem, I can just boot into my "alt" and immediately be up and running. If you instead just relied on file backups, to recover you would need to boot from a CD or jump drive, mount the root partition and the drive where you have your backups, copy in the files, and then boot again, which would be a much slower process. (There are also other methods that would give you the same result -- check out this Ubuntu forum on backups for several suggestions.)
Here are the details of the "alt" partition process I use. First, some setup:
- When installing Linux (I think you can also do this after installation with a partition editor utility), set up separate partitions for / (root), /boot, /alt, and /opt. You'll want about 10 GB each for / and /alt, about 500 MB for /boot, and the rest you can just put in /opt (aside from your swap space, the boot loader, and perhaps partitions for other operating systems). Install the operating system to the / partition, and leave /alt empty.
- I actually have two "alt" partitions -- /alt and /alt2 -- so I can alternate which one I am using as my latest backup and have two possibilities to boot into. But in the instructions below, you can just replace /alt with /alt2 if you are building your /alt2.
- After installing, move your home directory to /opt/home so it is not part of the / partition, and make a symbolic link from /home to /opt/home so that your home directory is still found (a sketch of this follows the list).
- In general, try to keep your data on /opt so it is separate from the root partition, and don't ever write files directly to /alt.
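For reference, here's roughly what the home-directory move looks like. This is only a sketch -- do it from a console session with no one logged into a home directory, and adjust the paths to match your setup:

# Move /home onto the /opt partition and leave a symlink behind
sudo mv /home /opt/home
sudo ln -s /opt/home /home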
Now, the process to use to refresh your "alt" before making configuration changes:
- Copy all the files in / to /alt using rsync. To do this, I use a script (attached below), which lets me type:
./rsync_Slash_to_Alt / /alt
Note that if you want to use this script, you will also need to get the build.exclude script, and edit the configuration options at the top of both scripts. Use at your own risk!
- Fix the fstab. The first time you run this script, you will also need to manually copy the /etc/fstab file to /alt/etc/fstab and edit it (using your favorite plain-text editor) so that it will make your /alt bootable as /. Basically, you'll need to find the lines that mount one device as / and another as /alt, and switch them around so the first device is mounted as /alt and the second as / (an example of this swap is sketched after this list). If you use my rsync script, subsequent runs will exclude the etc/fstab file from being synchronized, so the edits you made should be left alone. But if you've recently done a major operating system upgrade, you might want to compare /etc/fstab to your older /alt/etc/fstab and make sure nothing else has changed that you need to copy in.
- Ensure that the kernel compatible with your /alt is preserved (operating system upgrades sometimes delete old kernels). The kernels are stored in /boot, and consist of a bunch of files with the same suffix (the current suffix is shown by "uname -r"), such as vmlinuz-2.6.35-22-generic, initrd.img-2.6.35-22-generic, etc. To preserve a kernel, what I suggest is making a subdirectory /boot/alt, and copying all the kernel-related files for the kernel you want to preserve into that directory with a different suffix. For instance, make files /boot/alt/vmlinuz-2.6.35-22-generic-alt, /boot/alt/initrd.img-2.6.35-22-generic-alt, etc.
- Create a grub menu entry to boot into /alt. On Ubuntu, this is currently managed by the update-grub command, which uses the files in the /etc/grub.d directory to build the /boot/grub/grub.cfg file. What I did is add entries to the /etc/grub.d/40_custom file (copied from the existing /boot/grub/grub.cfg file); an example is sketched after this list. You'll need to copy a group of lines starting with menuentry and ending in a }, and modify the description on the first line. Then you will need to modify both the "linux" line and the "initrd" line to point to the alt kernel you made in the previous step (e.g., change /vmlinuz-2.6.35-22-generic to /alt/vmlinuz-2.6.35-22-generic-alt -- file names in these lines are relative to /boot). You will also need to change the root device in the "linux" line to point to your /alt device (you can find the UUID in the fstab file, see above). After making the changes, run the "update-grub" command, which will compile a new /boot/grub/grub.cfg file from the files in the /etc/grub.d directory.
- Test! Reboot your computer, choose the new grub entry you created from the boot menu, and verify that you were able to boot successfully. You should run the "df" command after booting to verify that you have your alternate partition mounted as /, and you should run the "uname -r" command to verify you are using the correct kernel.
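To make the kernel, fstab, and grub steps more concrete, here is a rough sketch. The kernel version, UUIDs, and partition numbers are placeholders -- yours will differ, and the exact menuentry contents depend on your grub version, so copy from your own grub.cfg rather than from here.

# Preserve the current kernel under a different suffix
sudo mkdir -p /boot/alt
sudo cp /boot/vmlinuz-2.6.35-22-generic /boot/alt/vmlinuz-2.6.35-22-generic-alt
sudo cp /boot/initrd.img-2.6.35-22-generic /boot/alt/initrd.img-2.6.35-22-generic-alt

# In /alt/etc/fstab, swap the / and /alt lines; for example, change
UUID=1111-aaaa  /     ext4  errors=remount-ro  0  1
UUID=2222-bbbb  /alt  ext4  defaults           0  2
# into
UUID=2222-bbbb  /     ext4  errors=remount-ro  0  1
UUID=1111-aaaa  /alt  ext4  defaults           0  2

# Example entry added to /etc/grub.d/40_custom (copied from grub.cfg and edited):
menuentry 'Ubuntu, with alt partition as root' {
        insmod ext2
        set root='(hd0,3)'
        search --no-floppy --fs-uuid --set 2222-bbbb
        linux /alt/vmlinuz-2.6.35-22-generic-alt root=UUID=2222-bbbb ro quiet splash
        initrd /alt/initrd.img-2.6.35-22-generic-alt
}

# Then rebuild grub.cfg:
sudo update-grub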
Now it should be safe to make your configuration changes. If something gets screwed up, you can boot into your /alt partition and be up and running again very quickly. The one thing you have to remember is to follow all the steps above to refresh your /alt partition (and verify that it's bootable) before each time you make updates or configuration changes. This is a manual process, but it works for me.
Recovering local databases
The second problem I'd like to address is local data recovery. Most of the data that I might need to recover is stored in ordinary files, so in order to recover this data (in case of a disk failure or user error), all I would need to do is to make regular backup copies of the files, which I'll cover in the next section. However, some of the data that I want to be able to recover is stored in local MySQL or PostgreSQL databases, and this needs special treatment. The reason is that although in principle, the MySQL and PostgreSQL databases store their data in the file system, I haven't found a reliable way to recover this data from a backup of the files in the file system. Also, you might only want to recover a certain database rather than the entire set of databases, so it's handy to make "dump" files and use those for recovery (a "dump" file is a text file containing SQL commands that will re-create a database or a set of databases).
In MySQL, this is fairly straightforward. Using PHPMyAdmin (or the mysql command), you can set up a MySQL user and grant it "SELECT" permission for all databases. Then you can run the command:
mysqldump --skip-lock-tables --user=USERNAME --password=PASSWORD DATABASENAME | gzip > FILENAME.sql.gz
for each database you want to back up. I do this in a script, where I first run a MySQL "show databases" query to list the databases, and then run mysqldump on each one (the script is attached below).
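If you'd rather roll your own than use my script, the core of it is just a loop over the output of "show databases"; something like this sketch (the backup user, password, and destination path here are placeholders -- and note that a password on the command line is visible to other local users, so a protected option file is safer):

#!/bin/bash
# Dump each MySQL database to its own gzipped file (sketch only).
BACKUPDIR=/opt/backups/mysql   # placeholder destination
USER=backupuser                # placeholder backup user with SELECT rights
PASS=secret
for DB in $(mysql --user=$USER --password=$PASS --skip-column-names -e 'show databases'); do
  [ "$DB" = "information_schema" ] && continue
  mysqldump --skip-lock-tables --user=$USER --password=$PASS "$DB" | gzip > "$BACKUPDIR/$DB-$(date +%Y%m%d).sql.gz"
done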
In PostgreSQL (version 8.4), this method does not work. First, there is no way to set up a user that has select permissions on all current and future databases and tables, except the root user, so you pretty much have to just use the root user to make backups. Second, it is not possible to supply the password via the command line in PostgreSQL -- you have to use a .pgpass file. Third, the meta information (users, roles, etc.) is not stored in a database table, and the only way to dump it is with the pg_dumpall command (which puts all the databases and meta-data into the same backup file). And finally, it appears that you must be logged in as the Linux user 'postgres' (the root PostgreSQL user) in order to run the pg_dumpall command. Luckily, at this time I am only using PostgreSQL for testing whether the Drupal modules I maintain will run on PostgreSQL, so I don't really have any critical or sensitive data stored in PostgreSQL. So I was willing to put my PostgreSQL root password into my backup script, and willing to live with the fact that recovering an individual database might not be easy/possible.
Given all of that, my backup solution for database data is to have cron run my database backup script (logged in as user "postgres") periodically, which will save recovery files in my file system, where they will be backed up by my file backup scripts (see sections below). If I need to do a quick database recovery, I can use the dump files, and if there is a hardware catastrophe, I can first recover the file system and then use the dump files to recover the databases. You can download this script below; be sure to edit the configuration options at the top before running, and use at your own risk.
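The PostgreSQL half of the script boils down to a single command run as the postgres Linux user (the destination path is a placeholder, and the password comes from a .pgpass file rather than the command line):

# In the backup script, run as the postgres user:
pg_dumpall | gzip > /opt/backups/postgres/pg_dumpall-$(date +%Y%m%d).sql.gz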
Recovering local files
The next piece of my backup/recovery process is to make sure I have backup copies of files (including the /, /boot, and /opt partitions). This will allow me to recover from a hardware failure (disk drive or other hardware), and to recover individual files that I've mistakenly deleted or modified. So, I need the backups to be stored on a separate device. I decided that daily backups were frequent enough for my purpose, that I wanted them to run automatically, and that I would keep about 30 days of backups.
Since I have several computers, I decided that the easiest way to manage this would be to purchase a network storage device (I guess I could have built one myself, but this seemed like a lot of work). I decided on an Iomega StorCenter ix2, which was around $250 (February 2011), has two 1 TB disks in a redundant RAID array, was favorably reviewed on CNet and other sites, has a user-friendly configuration tool, and supports NFS and Linux. Unfortunately, although the device supports Linux just fine, Linux isn't officially covered by the documentation or tech support (the tech person I chatted with once made that clear to me before actually turning out to be fairly helpful). So, make the same choice at your own risk (so far I am happy with it).
Once you have storage external to your computer, there are several viable choices you could make to back up your files:
- Use a utility such as SimpleBackup (which I've used in the past), which does automatic incremental backups, stored in .tar.gz files, and seems to work pretty well. It also has a nice Restore utility that lets you easily restore individual files that you inadvertently modified or deleted. The disadvantage is that in order to recover files from a disk crash, you probably would need to use its Restore utility, which complicates the recovery process.
- Write a simple script that would just make a .tar.gz of each partition and store it on the external drive (I've also used this in the past). The advantage of this method is that you don't need any special software to recover from a disk crash, but it makes recovering one particular file more complicated. It also uses a lot of space, since the entire disk is stored each time you make a backup.
- Use rsync to make an initial copy of the files, and then on subsequent backups, tell it to make hard links to the previous backup rather than duplicating the files that haven't changed (a minimal example is sketched just after this list). This saves a lot of space (although directories themselves will be duplicated each day, since they can't be links), and has the advantage that if you go to a particular day's backup directory, you have what looks like regular, uncompressed copies of all the files, so recovery of anything from a single deleted file to a hardware failure is quite easy and requires no special software.
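The key to the third approach is rsync's --link-dest option, which hard-links unchanged files to the previous day's backup instead of copying them again. A minimal sketch, with made-up paths (my actual script below also handles excludes for things like /proc and /sys, checks that the backup drive is mounted, and so on):

TODAY=$(date +%Y%m%d)
YESTERDAY=$(date -d yesterday +%Y%m%d)
# Files unchanged since yesterday become hard links into yesterday's directory
rsync -a --delete --link-dest=/mnt/backup/root/$YESTERDAY / /mnt/backup/root/$TODAY/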
I've adopted the rsync method. Details of how I set this up:
- Installing the StorCenter was easy: just plug it into an electric outlet and a router, and it's up and running.
- The StorCenter has a web interface you can use to configure it (there is also a manager utility for Windows, and supposedly one for Linux, but I couldn't get it to run). All you need is the local IP address, which you can find by a command such as
nmap 10.0.1.*
assuming that your router uses the 10.0.1.* subnet. Once you find the IP address, just point your favorite browser at it, and you'll enter the setup interface.
- Some settings you will likely want to change (some will require a reboot of the storage device):
- Set up a password for the StorCenter.
- Set up a static IP address for your storage device, under Settings / Network Services / Network Settings. Otherwise the router might assign it a different IP address from time to time. You probably will need to set up static IP addresses for the computers on your network as well, if you haven't already done that.
- Turn on NFS under Settings / Network Services / NFS. You probably want to check the "allow root access" box.
- Create one or more "shared drives" (externally-accessible storage directories really), under Shared Storage / Add Drive. Be sure to check the NFS box, and allow the drives to be accessed via NFS from your computer's IP address. I set up a separate drive for each computer I needed to back up.
- On your Linux computer, once the StorCenter is configured, you should be able to type
showmount -e IPADDRESS
(where IPADDRESS is the static IP you assigned to your storage device) to find out the correct path to use for mounting your storage device. Mine looks something like:
/mnt/soho_storage/samba/shares/DRIVENAME
where DRIVENAME is the name I gave to the shared drive in the StorCenter configuration screens. (Note that the address /nfs/DRIVENAME that the StorCenter utility says you can use did not work for me at all.) After obtaining the name, you can test whether it can be mounted by typing
sudo mount IPADDRESS:PATH_FROM_SHOWMOUNT /place/to/mount/it
Then copy a file to that directory, and you will be able to see the file from the StorCenter web configuration tool. I also verified that the StorCenter file system supports hard links, by making one with the "ln" command and then using the "stat" command to verify that the inode number (internal file ID number) of the file and the hard link were the same (this check is sketched after this list).
- Make individual backup directories on the backup disk for the root, boot, and opt backups, and also create the other local backup directories used by the scripts (see the configuration sections in the scripts for details).
- At this point, you can download the backup script (you'll also need the build.exclude script and the is.bak.mounted script), edit the configuration sections at the top, and try them out. Use at your own risk!
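For reference, the hard-link check is just the following (the file names and mount point are made up for illustration):

cd /mnt/backup_test
echo hello > file1
ln file1 file2
stat file1 file2    # the "Inode:" numbers reported for the two names should match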
Recovering remotely-hosted web sites
I use a shared web hosting company to host poplarware.com (and associated sites) -- actually, I have three Drupal and two WordPress sites. In case the hosting company has trouble or goes out of business, I need to make sure I have backups sufficient to rebuild these sites quickly. This is complicated by the fact that Drupal and WordPress use a mix of files and databases to generate the sites.
So, my backup strategy for these sites is:
- Use a database backup plugin/module to make regular database backups that are stored on the hosting company's file system. This protects me somewhat against the hosting company having database issues, since I should be able to restore my database to my last backup if something gets corrupted. The Drupal module I use for this is Backup and Migrate, and the WordPress plugin I use for this is WP-DBManager.
- Use a custom script I wrote that lets me download (by visiting a URL and entering a password) a tar-gz archive of the Drupal "sites" directories and the WordPress "wp-content" directories. These directories contain the automatic database backups, uploaded images and files, plugins, modules, and themes -- basically everything except the WordPress and Drupal software itself. You could also just make a zip of all the files from your web root, which might be easier.
- I have a cron job that visits the download URL periodically and saves this file on my local computer (a sample crontab line is sketched after this list), so that if the hosting company vanishes into thin air, I will have a local backup of my site.
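The cron job is nothing fancy; something along these lines, where the URL, password parameter, and destination path are all made up for illustration (note that % characters must be escaped in crontab entries):

# Fetch the site archive every Monday at 3am (example URL and password)
0 3 * * 1  wget -q -O /opt/backups/poplarware/site-$(date +\%Y\%m\%d).tar.gz "https://example.com/site_backup?pw=PASSWORD"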
Other Steps and Notes
You can download the scripts I use below. Use at your own risk! You'll need to edit the configuration section at the top of each script, probably remove the .txt file extension, and make them executable. And definitely read them through and test before deploying!
With working backup scripts in hand, there are a few more things to do:
- Make the scripts run automatically via cron. You can use "sudo crontab -u root -e" to edit the root crontab file, and add an entry or entries to run your backup scripts at the frequency you want (a sample entry is sketched after this list).
- Add logic that will clean out backups older than a certain age. I haven't done this yet, but maybe I'll post an addendum when I do.
- I also added a line to my regular backup scripts to dump out the list of installed software, so that if I need to re-install the operating system, at least I have a list of the packages that need to be reinstalled. The command is "dpkg --get-selections", and then you can redirect the output to a date-stamped file. If you should need to re-install the operating system and want to get all the same software installed, use "dpkg --set-selections" to tell dpkg what software you want, and then "apt-get dselect-upgrade" to tell apt-get to install everything. Check the man pages for dpkg and apt-get for more information.
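As an illustration, here is roughly what the crontab entry and the package-list commands look like; the script name, paths, times, and date stamp are placeholders:

# In root's crontab: run the file backup script every night at 2am
0 2 * * *  /usr/local/bin/backup_files

# In the backup script: save the list of installed packages to a date-stamped file
dpkg --get-selections > /opt/backups/package-list-$(date +%Y%m%d).txt

# To reinstall everything on a fresh system later:
sudo dpkg --set-selections < package-list-20110215.txt
sudo apt-get dselect-upgrade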