Backup, backup, backup!

Posted on July 28, 2008 17:25 in Offline, Technology

Quite a few people have been asking me recently how I do my backups, either because they are geeky enough to want to know another geek’s opinion, or because they simply have no idea where to start. So here is my view on making backups and a little insight into my backup strategy.

A bit of “theory”

The reason for making backups should be obvious to all, but apparently it isn’t, so here is a little reality check. Let me put it simply: unless you have your data at least twice, you don’t “have” it at all. Whatever media you store your data on, it is not a question of “if” that data will be lost, but “when”. If a hard drive dies and you permanently lose some data, it’s your fault, not the disk manufacturer’s or anybody else’s.

Do realize that “backing up” your data to CD or DVD and then deleting the data from your hard drive does not count as properly backing up. You need to have every piece of data twice!

So I need two versions of all my data?

Theoretically, yes. Having your data twice protects you against either of these two copies getting lost. In practice though, your main data storage (your PC or notebook) and your backup (your external drive) will often be in the same place (your house, flat, office, studio). This means that a single flood, fire, thief or other disaster is enough to destroy all your copies.

So, to cut to the point: you had better make sure you also have a second, off-site backup at a different location.

So if I back up I can retrieve any data I ever had?

Most backup solutions don’t offer backup versioning. In human terms, this means that you can only go back in time to the last backup you made, nothing before that. In these cases every backup overwrites the previous one. There are solutions, though, that do offer versioning, like Subversion, CVS and Apple’s Time Machine software.

The problem with versioned backups is that they tend to grow quickly when your data changes a lot, because every change is saved. In other words: if you download loads of videos every week, watch them the same week and throw them away, your disk usage might stay constant but your backup will grow fast, as it is saving all those videos. Versioning is therefore often only used for small (but important) data like documents, because their size tends to stay manageable.
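To put some made-up numbers on that video example: the sketch below assumes 5GB of videos per week and a versioned backup that never prunes old versions. Both are assumptions for illustration (real tools such as Time Machine do eventually thin out old backups), and the function name is hypothetical.

```python
def simulate_versioned_backup(weeks: int, weekly_download_gb: int) -> tuple[int, int]:
    """Return (disk usage, backup size) in GB after the given number of weeks."""
    disk_usage_gb = 0
    backup_size_gb = 0
    for _ in range(weeks):
        disk_usage_gb += weekly_download_gb   # new videos arrive on the disk
        backup_size_gb += weekly_download_gb  # the backup stores a copy of them
        disk_usage_gb -= weekly_download_gb   # videos are watched and deleted
        # The versioned backup keeps its copy so the files can still be restored.
    return disk_usage_gb, backup_size_gb


disk, backup = simulate_versioned_backup(weeks=10, weekly_download_gb=5)
print(disk, backup)  # 0 50
```

After ten weeks the disk is back where it started, while the backup holds every video that ever passed through.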

So what’s a good “strategy” for me?

There are a few backup strategies that you can think of, and they basically vary in price, method, and the time needed to maintain them. The most important thing though is to have a strategy in the first place. Let’s run through a few of the options we have for either the primary or secondary backups (the off-site backups).

Apple Time Capsule / Time Machine offers an all-round backup solution for Mac users through a combination of hardware and software. The Time Capsule is the backup medium; Time Machine is the name of the software. If you are a non-technical person and need a primary backup, use something like this because it will save you time. It is not cheap, but it is seriously the easiest way to back up.

The Time Machine software (which also works with other backup media, like that other USB hard drive you’ve got there) even does versioned backups, which makes it possible to go back in time an hour, day, week or even a month to retrieve something you lost. Backups are automatic and scheduled, so although pricey this is the easiest solution for most non-geeky Mac users. Beware that this does not give you an off-site backup yet, but it is a good start.

CDs/DVDs tend to be the worst idea for a backup in my opinion. Although cheap, they seem to degrade faster than other solutions, and the usual write-once discs can’t be updated. Add to that the fact that you need to keep some record of what you stored where to be able to easily restore backups, and you can see where I am going. Only use this option if you really have to, or as a very cheap off-site solution.

USB/Firewire hard drives are the most commonly used backup solution among geeks. They are (relatively) portable and therefore suitable for both on-site and off-site backups. Almost every operating system has its own backup software, but there are also loads of applications around that do the job even better. For Mac I advise using something like Time Machine for versioned backups, or SuperDuper! for normal backups.

A little trick is to find a piece of backup software that is able to do incremental backups. An incremental backup saves you time because it only copies the changes to your backup instead of wiping it and copying everything again. With incremental backups only the first backup takes a while to complete; any after that is just a matter of minutes.
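For the curious, here is a minimal sketch of what an incremental backup does under the hood. It is not how SuperDuper! or any particular product works; it simply assumes that a file has changed when its size or modification time differs from the copy in the backup. The function name and folder layout are hypothetical.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, target: Path) -> list[Path]:
    """Copy only new or changed files from source to target.

    A file counts as changed when it is missing in the target, or when
    its size or modification time (in whole seconds) differs.
    Returns the relative paths that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = target / rel
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            unchanged = (
                src_stat.st_size == dst_stat.st_size
                and int(src_stat.st_mtime) == int(dst_stat.st_mtime)
            )
            if unchanged:
                continue  # already backed up, skip it
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied
```

The first run copies everything; later runs only touch what changed. Note that this sketch never deletes anything from the backup, so files you removed from the source stay in the target.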

NAS or Network Attached Storage is just an expensive word for an external hard drive that you access over the network instead of via USB or Firewire. Although slower to access (especially over a wireless network), it gives you the freedom of using it from any PC, at any time, without needing to physically connect the drive to your computer. That last point is very handy for people using a notebook.

RAID arrays (or Drobo) are a special kind of backup solution as they create backups within themselves (aka redundancy). This means that they are not just useful for backing up to, but also for primary storage. By using RAID technology these devices can turn a set of 3 hard drives of, for example, 200GB each into redundant storage of about 400GB, with the remaining space used for the redundancy (as in RAID 5). Even if one of the disks died you would still be able to access all your data. All you would have to do to make your data redundant again is replace the broken disk with a new one.
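How can an array survive a dead disk? One common trick is XOR parity, the idea behind RAID 5. The sketch below is a toy illustration under that assumption (the post does not say which RAID level is meant, and real arrays spread data and parity across all drives in stripes). The block contents and names are made up.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data blocks, as they might sit on two of the three drives.
disk_a = b"holiday!"
disk_b = b"pictures"

# The third drive stores the parity of the other two.
parity = xor_blocks(disk_a, disk_b)

# If the drive holding disk_b dies, its contents follow from the survivors.
recovered_b = xor_blocks(disk_a, parity)
assert recovered_b == disk_b

# Usable space: with n drives, one drive's worth of capacity holds parity.
drives, size_gb = 3, 200
usable_gb = (drives - 1) * size_gb
print(usable_gb)  # 400
```

The price of that safety net is one drive's worth of capacity, which is why three 200GB drives give you roughly 400GB of usable space in this scheme.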

Although interesting, setting up and maintaining a RAID array can be quite hard for a non-geek. With that in mind the people from Data Robotics came up with the Drobo (Data Robot), which makes RAID easy for everyone. It is expensive again, but if you want nice external storage that is redundant on its own and easy to maintain, definitely go for this.

So what do you do?

I am a student, so I’m always low on money, which means I don’t have a Time Capsule, Drobo or RAID array as they are all quite expensive. Instead I went for a quite traditional solution.

First off, I have an external Firewire hard drive that I can back up to. It’s a simple one-on-one backup with no versioning. I use SuperDuper! to make so-called “Smart Backups” which is just a fancy word for an incremental backup. My backups aren’t automated but I tend to do a backup about every 1 or 2 weeks.

Next to this backup I have a second partition for storage of old documents and projects. Since that partition holds the only copy of those files, they would not otherwise be backed up, so I have another (smaller) USB hard drive that this storage is backed up to about every month.

Then I have a backup of some of my most important documents at my parents’ place on an older 80GB hard disk. This backup is not very current but I see it like this: if something happened to both of my backups, I am probably in bigger trouble anyway, and this one simple backup will make sure that I won’t have to go back to the digital stone age. This drive is updated every time I visit my parents.

Then I have some additional online backup methods. For my photos I use Flickr and for my videos I use Vimeo. Although these sites might not look like backup solutions at first, essentially they are. Not all my photos end up online, but at least the best and emotionally most important ones do. Most of my programming code goes into Subversion anyway, and the same even goes for my blog (more on that here). In theory I could put any data into Subversion, including my documents folder, but in general it is hard to find cheap online mass storage.

Conclusions

I think I have a rather good backup solution going, even though it’s quite a pain to maintain. I do think that most people don’t have a strategy in the first place, which is a bad thing. Let me know what your strategy is and maybe we can learn from each other. Don’t be afraid to ask for more details or help; I am glad to get people on the right track.

  • Hani

    I read your post with interest since I am currently deciding on how to back up the data on my MacBook. I would like to use Time Machine, but I never physically connect the MB to my external HD at home. That external HD is connected to my MacMini and shared on my home network.
    It would be perfect if it was possible to use a shared network drive (one of the partitions) as the Time Machine backup drive.

    Also, I was wondering about the reasons that made you choose SuperDuper over Time Machine.

  • http://cristianobetta.com Cristiano Betta

    Excellent questions. The “problem” with Time Machine is that it will only back up to an external USB/Firewire hard drive, or over the network to a Time Capsule. There are some hacks to get it to back up to a network share as you are proposing, but this has some issues.

    I tried this and the biggest issues are the speed of the network (I take it you want to do this wirelessly), and the fact that it is NOT a one-on-one backup. I like one-on-one backups as they are bootable and enable me to play around, trying to get an Ubuntu install to work on my MB, etc. So I chose SuperDuper!.

  • http://cazmockett.com/blog/ Caz Mockett

    Well I’m a PC girl (so shoot me), therefore these Mac solutions aren’t feasible for me. But what I do have is two external USB hard drives which I use for general data and my photographs. Once there are enough files from the photo drive to burn a DVD, that’s a second backup. The discs live in a fireproof box (I should probably move them to my parents but haven’t got round to it yet). Just my 2p :-)

  • http://cristianobetta.com Cristiano Betta

    @caz Well I obviously highlighted some of the Mac solutions here, but in general my talk above is a non-platform-bound story. Maybe you would like to share what software you use for your backups on PC?

    You burn to CD as a second backup??? Good job! I just don’t like CDs and DVDs (they have failed too often on me), and in general I can justify upgrading my largest HDD to a new one so that I have another disk left over for off-site backups.

  • http://reinier.zwitserloot.com/ Reinier Zwitserloot

    First of all, SVN/CVS/Any version control system are NOT BACKUP SYSTEMS. You should just nix that from this article.

    I do have some advice though: Usually the actual data that you really cannot possibly go without isn’t too large. If that’s true, back that stuff up over the internet. Simple example: All the source code I ever wrote exists on my macbook, on both backup partitions of my backup drive, the house time capsule, and on our server. Even if the entire house falls down, that server will still have the data.

    This is a lot harder if you want to keep your videos around, obviously. Documents, your emails, and your favourite pictures, that’s reasonable to upload to a remote server via the net.

  • http://cristianobetta.com Cristiano Betta

    Why can’t you use a versioning system as a backup solution for just those small documents? Isn’t it a good thing to know that your code also resides as a backup on the SVN/CVS server before you have managed to back up your entire drive?

    What are your thoughts on rsync?

    Furthermore I totally agree: when you have a proper strategy you will know, because your most important data will be backed up most and your least important data will be backed up the least (or won’t be backed up at all).

    I think the hardest thing though for some people to understand is what their most important documents are. Although important, documents you created in the last week aren’t the most important. They are still fresh on your mind and can with some pain and sweat be “recreated”.

    Documents from a year back that you might one day need are way more important as they are often impossible to recreate.
