Website Backups

Thursday, May 12th, 2011

Backups are an interesting topic in any industry, but in the web hosting industry they perhaps reach a new level of complexity, or at least some people like to think they do. Where should we start when talking about data backups for the hosting industry? In the same place a potential customer does: analyzing their needs and the order process. To do this, we first need to discuss terminology.
Local Backups:
Local Backups, as the name suggests, consist of creating backups locally on the system where the data resides. For example, File1 is located in the directory /home/john/ (so, /home/john/File1). If I wanted to back this file up locally, I could copy it to, say, /home/john/File1backup. If File1 were then deleted, I would have a backup of the file stored locally to restore from.
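
To make the File1 example concrete, here is a minimal Python sketch of a local backup. The function name `backup_locally` and the `backup` suffix are just illustrative choices, not part of any particular tool.

```python
import shutil
from pathlib import Path


def backup_locally(source: Path, suffix: str = "backup") -> Path:
    """Copy `source` to a sibling file named <name><suffix> and return its path."""
    target = source.with_name(source.name + suffix)
    # copy2 copies the contents and tries to preserve timestamps as well.
    shutil.copy2(source, target)
    return target
```

Calling `backup_locally(Path("/home/john/File1"))` would create /home/john/File1backup on the same disk, which is exactly why this approach shares the fate of the original file.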

The limitations of local backups are plain to see. If you are backing files up locally, a hard drive crash, accidental data deletion (it happens!), a hack or general data corruption could wipe out both your in-use file and its backup. You would then be left with no data. With these limitations in mind, a local backup is only really suitable when you are making a minor development change that you may need to undo quickly if it goes bad. It is not suitable as an ongoing backup solution.

RAID:
Yes, I have mentioned RAID here, even though technically minded people will be screaming “RAID isn’t a backup plan!!!”. I wholeheartedly agree with you, but it is mentioned here because end users commonly confuse it with a backup plan. RAID 1 mirrors data across two physical hard drives, meaning that in theory one drive can fail and you will still have your data intact. A backup plan? Certainly not! The problem is that everything is still stored locally, so if files are accidentally deleted or modified, maliciously defaced or deleted or, in rare cases, both drives fail at the same time, you are left with no data. At best it is a recommended redundancy configuration, as it will protect you from the most common cause of physical server downtime: a single drive failing.

Master/Slave setup:
Again, this isn’t a data backup plan; it is a hardware backup plan. In a Master/Slave setup there are two physical servers, one of which is actively in use (the Master). The Master Server syncs its content to the Slave Server near instantly and is configured in such a way that if the Master Server ever stops responding, let’s say because of a broken network card or any other hardware failure, the Slave Server binds the Master Server’s IPs (the IPs your websites/applications are running from) and keeps your application/site up while the Master Server is repaired. Once repaired, the Master Server will sync any changes from the Slave Server and take over the running of the site/application again. This is entirely a hardware/network failover and isn’t suitable as a data backup plan. If a file is deleted or corrupted on the Master Server, that change will be replicated to the Slave Server almost instantly. The end result is data loss.
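
The point about replication can be shown with a toy model. The sketch below is not how any real Master/Slave product works; it simply represents each server as a Python dictionary of file names and contents, with a hypothetical `mirror` function standing in for the sync.

```python
def mirror(master: dict) -> dict:
    """Return the slave's state after a sync: an exact copy of the master."""
    return dict(master)


master = {"index.html": "<h1>Shop</h1>", "orders.db": "order data"}
slave = mirror(master)

# Someone deletes a file on the master by mistake...
del master["orders.db"]

# ...and the next sync faithfully reproduces that mistake on the slave.
slave = mirror(master)
```

After the second sync, neither dictionary contains orders.db. The mirror did its job perfectly, and the file is gone from both servers.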

Assumption:
Good old assumption. “I assumed you were backing up my data!” Assumption is not a backup plan. It is ultimately the end user’s responsibility to verify that backups are offered and are actually taking place.

Remote Site Backups:
Remote Backups is the term used for taking a backup from your in-use server and storing it on a remote machine. This is an incredibly good idea: if your physical machine fails, content is deleted or you are hacked, you will have a backup stored remotely, out of harm’s way, which can be restored. This is the first legitimate backup plan discussed so far, and it is highly recommended that you use it.
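
As a rough illustration of the first half of a remote backup, the Python sketch below packs a site directory into a timestamped archive. It assumes `dest_dir` is a location that actually lives on another machine (for example a mounted remote share); the function name and the file naming scheme are hypothetical, and the transfer itself is outside the scope of this sketch.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup_archive(site_dir: Path, dest_dir: Path) -> Path:
    """Pack `site_dir` into a timestamped .tar.gz inside `dest_dir`."""
    # A UTC timestamp in the name keeps older backups from being overwritten.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{site_dir.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the site folder, not the full path.
        tar.add(site_dir, arcname=site_dir.name)
    return archive
```

Keeping several dated archives, rather than one file that is overwritten each night, means a deletion or defacement that goes unnoticed for a few days can still be rolled back.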

Client Side Backups:
Even if you have Remote Backups stored on a remote server, away from where your data is held, it is always recommended to periodically take a backup yourself and download it to your local machine. That way, if the company you are using disappears, along with its remote backups, you still have a copy of your data. It is also highly recommended that you periodically check your backup files by hand to ensure they are intact, as files can become corrupted during the backup process.
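
One common way to check that a downloaded backup is intact is to compare its checksum with one recorded when the backup was made. The Python sketch below assumes you have such a SHA-256 value to compare against; whether your host publishes one is something to confirm with them, and the function names here are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Reading in chunks keeps memory use low for large backup files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with the value recorded at backup time."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A matching checksum only shows that the file was not altered after the checksum was recorded. Occasionally restoring a backup to a test location remains the most convincing check.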

Cloud:
Okay, perhaps the most confusing of everything mentioned: Cloud. I’m sure you have been told that Cloud is going to keep your data safe at all times, ensure you never have problems and also cure cancer for you as well (okay, slight exaggeration there). Data in a Cloud environment is no more backed up than in any other setup; it’s just more redundant in terms of hardware and, in some cases, software failure. If you delete a file in your Cloud environment, or you are hacked and data is deleted, your data is still gone. Cloud will give you higher uptime, but it won’t back up your data for you.

After reviewing the various terms and solutions out there, one has to conclude that Remote Site Backups are a must, at a minimum. Preferably, use multiple Remote Backup Sites in conjunction with copies of your backups downloaded to your local machine. RAID, Master/Slave and Cloud are only going to help you with environment uptime; they aren’t going to back your data up or act as a disaster recovery plan.

I threw Assumption into the mix above because it is perhaps the most dangerous thing that happens when people are considering a new project or requirement. I can’t stress strongly enough that you shouldn’t assume anything. If an order process gives you the choice of Remote Backup Space, assumption and ignorance aren’t an excuse if you either don’t select it or select it and then never configure it. Assuming you will never have a problem, or assuming your host is taking backups on your behalf, is a sure route to disaster.

I hope the above is helpful and will help keep your data, and ultimately your business, safe.
