Several years ago, I set up daily automated backups for both my Linux desktop machines and my web sites -- see my article on backups for details. I recently made a bunch of updates and changes to my web sites, at the same time as I switched to a less expensive web hosting account at the same web host I've been using for years (pair Networks), so I thought it was a good idea to verify that the web site backups I had been making were actually working. This is known in the information technology industry as a "Disaster Recovery Exercise" -- in my case, on a small scale.
The first question to ask is: what do I mean by saying that the backups are "working"? Well... The reason for making backups is to be able to recover from problems. In the case of remote-hosted web sites, the types of problems I can imagine include:
- I could make a mistake that damages or destroys some pages, or all, of one of my sites, such as deleting some key files, mangling a content edit, or the like. Some of these errors I could probably fix by using the revision history on individual pages, but for others I would need to restore the files or data from my backups.
- Some kind of hacking could occur.
- The web hosting company could go out of business completely, or for some other reason I might not be able to access the files and/or databases.
If my backups are "working", that means I should be able to recover any or all of my web sites to a reasonably current state, when faced with any of these types of problems. (Your definition of "reasonably current" may differ from mine... I don't make frequent updates to my web sites, so for me, "reasonably current" means within a few days. I make backups of the web content to my local machine every evening, and the individual sites save database backups every day or so.)
Here's how I tested my ability to recover from disasters:
- For each web site I tested recovering, I set up the web hosting and URL resolution environment on my local machine, so that when I went to the site's URL, instead of my browser going out to the Internet to find the content, it would send the request to my local computer's web server, which would look in a local directory for the content. (I'm using Ubuntu and Apache, so this involved editing the /etc/hosts file to map the site's host name to a local IP address, making a new "site" file in /etc/apache2/sites-available that tells my local Apache server where to find the content for this URL, making a symbolic link to this site file in /etc/apache2/sites-enabled to enable it, and restarting Apache to make the changes live; a sketch of these steps appears after this list. There are a couple of examples of how to do this in a Stack Exchange post, and I'm sure in many other places on the Internet.)
- At this point in each test, I visited the URL in my browser and got the standard Apache "the site is there but there is no content" message, so I was able to verify that I was pointed at the local directory and not my live, working site.
- I made a "disaster" database and database user on my local MySQL server, using phpMyAdmin (the equivalent SQL is sketched after this list).
- Then I recovered each site's content.
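To make the first step concrete, here is a minimal sketch of that local hosting setup, assuming a hypothetical site name of example.com and a recovery directory of /var/www/disaster (Apache 2.4 syntax; substitute your own names and paths):

```
# Map the site's host name to the local machine (needs root).
echo "127.0.0.1 example.com www.example.com" | sudo tee -a /etc/hosts

# Create a minimal virtual host pointing at the local recovery directory.
sudo tee /etc/apache2/sites-available/disaster.conf > /dev/null <<'EOF'
<VirtualHost *:80>
    ServerName example.com
    ServerAlias www.example.com
    DocumentRoot /var/www/disaster
    <Directory /var/www/disaster>
        AllowOverride All
        Require all granted
    </Directory>
</VirtualHost>
EOF

# a2ensite makes the symbolic link in sites-enabled; then restart Apache.
sudo a2ensite disaster
sudo systemctl restart apache2
```

The AllowOverride All line matters later on, because WordPress permalinks depend on an .htaccess file in the document root.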
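Similarly, the "disaster" database and user could be created from the MySQL command line instead of phpMyAdmin; the database name, user name, and password here are placeholders:

```
mysql -u root -p <<'SQL'
CREATE DATABASE disaster CHARACTER SET utf8mb4;
CREATE USER 'disaster_user'@'localhost' IDENTIFIED BY 'CHANGE_ME';
GRANT ALL PRIVILEGES ON disaster.* TO 'disaster_user'@'localhost';
FLUSH PRIVILEGES;
SQL
```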
Recovering a WordPress Site
First, I recovered the content for my one WordPress site, where my backup is a copy of the wp-content directory from that site (see the above-referenced article for details). To recover this site, I performed the following steps:
- Downloaded the current version of WordPress and unpacked it into the local web space for my disaster recovery site. (Shell sketches of these WordPress steps appear after this list.)
- Replaced the wp-content directory with my backup.
- Found the latest database backup file inside the wp-content directory (these are made automatically by my site; again, see the article), and restored it to the "disaster" database in phpMyAdmin. I ran into a database collation problem similar to the one described in this article, due to differences between my local MySQL server version and the one my site runs on, so I had to edit the database dump file in one spot to change the "utf8mb4_unicode_520_ci" collation to "utf8mb4_unicode_ci".
- Copied the wp-config-sample.php file to wp-config.php and filled in the database information.
- At this point, I could go to my site URL and see the home page, but none of the other pages worked, because the .htaccess file was missing. To restore it, I logged in and went to the Permalinks settings page, which told me what I needed to put into my .htaccess file (the standard rules are shown in the second sketch after this list).
- Now I could see all the pages of my site. If I were actually rebuilding it, I would need to update some file permissions, so that new files could be uploaded and new posts written, but I didn't bother to do this for the disaster recovery exercise.
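Here is a rough shell version of the WordPress recovery steps above. The directory, database credentials, and backup paths are the hypothetical ones from the earlier sketches, and the dump file name is a placeholder -- adjust everything for your own layout:

```
cd /var/www/disaster

# Download the current WordPress release and unpack it into the web space.
wget https://wordpress.org/latest.tar.gz
tar -xzf latest.tar.gz --strip-components=1 wordpress

# Replace the stock wp-content directory with the backup copy.
rm -rf wp-content
cp -a /path/to/backup/wp-content .

# Fix the collation difference, then import the dump into the local database.
sed -i 's/utf8mb4_unicode_520_ci/utf8mb4_unicode_ci/g' wp-content/db-backup.sql
mysql -u disaster_user -p disaster < wp-content/db-backup.sql

# Set up wp-config.php with the "disaster" database information.
cp wp-config-sample.php wp-config.php
sed -i -e 's/database_name_here/disaster/' \
       -e 's/username_here/disaster_user/' \
       -e 's/password_here/CHANGE_ME/' wp-config.php
```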
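For a site at the root of its domain, what the Permalinks page tells you to use is the stock WordPress rewrite block; writing it out looks like this (your rules may differ if the site lives in a subdirectory):

```
cat > /var/www/disaster/.htaccess <<'EOF'
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
EOF
```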
Recovering Drupal Sites
I have two Drupal 7.x sites, and two Drupal 8.x sites. The backups in this case were the "sites" directories (or so I thought!), plus the separate private files area. To recover each of these, I used the following steps:
- Downloaded the current version of Drupal 7.x or 8.x and unpacked it into the local web space for my disaster recovery site (a shell sketch of the Drupal steps follows this list).
- Replaced the sites directory with my backup.
- Found the latest database backup inside the private files backup directory (these are made by the sites automatically; again, see the article), and imported it into the "disaster" database.
- Edited the sites/default/settings.php file from the backup to point to my local "disaster" database (this edit is also sketched below).
- At this point, on the first Drupal site test, I found out that my backups for these Drupal sites only contained the sites/default directory, and not the sites/all directories. For the Drupal 8 sites, the top-level modules directories were also missing. That was a mistake! I could have recovered from this by re-downloading the modules that sites/all contained, but it was easier and better to edit my backup script so it would include the right directories, and run the backup again.
- Now, I could see all the pages of each site I was testing. As with the WordPress site, if I had needed a fully functional site that I could add content and file uploads to, I would have had to work on file permissions, but I did not bother with that for this test.
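In shell form, the Drupal steps look much like the WordPress ones. This sketch is for one of the Drupal 7 sites; the release file name, backup paths, and database credentials are placeholders:

```
cd /var/www/disaster

# Download and unpack the current Drupal release (version is a placeholder).
wget https://ftp.drupal.org/files/projects/drupal-7.69.tar.gz
tar -xzf drupal-7.69.tar.gz --strip-components=1 drupal-7.69

# Replace the stock sites directory with the backup copy -- which, after
# fixing my backup script, includes sites/all as well as sites/default.
rm -rf sites
cp -a /path/to/backup/sites .

# Import the latest database dump from the private files backup area.
mysql -u disaster_user -p disaster < /path/to/backup/private/db-backup.sql
```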
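And the settings.php change can be done by appending an override at the end of the file, since later assignments in settings.php win. This is the Drupal 7 array format (Drupal 8's is similar, with a few extra keys), with the same placeholder credentials as before:

```
cat >> sites/default/settings.php <<'PHP'
$databases['default']['default'] = array(
  'driver'   => 'mysql',
  'database' => 'disaster',
  'username' => 'disaster_user',
  'password' => 'CHANGE_ME',
  'host'     => 'localhost',
  'prefix'   => '',
);
PHP
```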