Why we still need old-fashioned backups: A cautionary tale

Submitted by trey on Wed, 2010-02-03 07:45.
“Backups are becoming less and less necessary these days”, I'm told. High availability, cheap disk mirroring and snapshots, cloud storage, data syncing between services: all these factors supposedly make old-fashioned backups (the offline, offsite, multi-tier kind, probably to tape) an expensive and cumbersome luxury that is neither affordable nor needed today.

I just got bitten, hard, by the results of that sort of thinking. I think a cautionary tale is in order to remind you of why, exactly, mirroring technologies (of which syncing, cloud storage, etc. are all just degenerate or fancier forms) should not be a full replacement for backups in most cases.

A few days ago, my email and calendar were migrated from one service to another.  I felt fine about this before the migration, because my email is cached in multiple places, and my calendar gets synced, by concatenation, to my OS’s calendar, to Google, to my phone, etc.  Besides, although I didn’t know what sorts of backups the mail group had, I was sure they’d done their due diligence.

The morning after the overnight migration, I changed my password as required, logged into my email, and discovered—a blank calendar.  I checked my phone.  Blank calendar.  Google? Blank.  The OS calendar? Ditto.

Something went awry with my migration.  (I’m not casting any sort of aspersions on my company’s mail group or other sysadmins; I have not spoken directly to them and have no clue what sort of failure occurred.) All the high-availability mirroring and syncing are meant to protect against hardware and connectivity issues.  Unfortunately, this was a software issue (something in the migration scripts deleted all my calendar items), so all the other mirrors and services happily followed suit in deleting my calendar, doing exactly what they were supposed to.

I had my own sorts of backups, such as the ones of my laptop, but since they aren’t the authoritative source of the calendar entries, all I can do is use them to view the lost entries—if I actually try to put them back on my calendar, they just get deleted again the next time they sync.  What’s needed, and doesn’t exist, is an offline backup of my calendar from its official source—my old mail server.  Without that, all this syncing and mirroring is useless to get me back to before things broke.
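
A toy sketch of this failure mode, in Python. Nothing here reflects the actual migration scripts or any real calendar service: the `sync` function, the `server`, `phone` and `laptop` dictionaries, and the event data are all made up for illustration, and real sync protocols are more elaborate than a one-way copy.

```python
import copy

def sync(source, mirrors):
    """One-way sync: make every mirror match the authoritative source exactly."""
    for mirror in mirrors:
        mirror.clear()
        mirror.update(source)

# Hypothetical calendar on the authoritative server: event id -> description.
server = {"e1": "Staff meeting", "e2": "Dentist"}
phone, laptop = {}, {}
sync(server, [phone, laptop])

# An offline backup is a point-in-time copy that sync never touches.
offline_backup = copy.deepcopy(server)

# A faulty migration empties the authoritative source, and the mirrors follow.
server.clear()
sync(server, [phone, laptop])
mirrors_after_failure = (dict(phone), dict(laptop))   # both empty

# Re-adding an entry on a mirror does not stick: the next sync removes it.
phone["e1"] = "Staff meeting"
sync(server, [phone, laptop])
phone_after_manual_fix = dict(phone)                   # empty again

# Restoring the authoritative source from the offline copy does stick.
server.update(offline_backup)
sync(server, [phone, laptop])
print(phone == offline_backup and laptop == offline_backup)
```

The only copy that survives the bad write is the one the sync machinery cannot reach, and it is only useful if it can be restored into the authoritative source.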

As sysadmins, we have reason to focus on hardware failures—if we don’t, nobody else will.  And new technologies are making hardware failures easier and faster to mitigate all the time.  But software failures still happen.  And it’s important to remember that the only way you’re going to retrieve data that gets hosed by faulty software is the old-fashioned way—regular, offline backups, stored safely for some time and regularly tested for recoverability.[1]
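
One possible way to approach the "regularly tested for recoverability" part, sketched in Python: record a checksum manifest at backup time, do a trial restore, and compare. This is only an illustration under stated assumptions. The helper names `digest_tree` and `restore_matches` are hypothetical, and `shutil.copytree` merely stands in for whatever really writes to and reads from your offline media.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def digest_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def restore_matches(manifest, restored_root):
    """True if the restored tree has exactly the files and contents recorded."""
    return digest_tree(restored_root) == manifest

with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp, "live")
    live.mkdir()
    (live / "calendar.ics").write_text("BEGIN:VCALENDAR\nEND:VCALENDAR\n")

    manifest = digest_tree(live)               # recorded at backup time
    backup = Path(tmp, "backup")
    shutil.copytree(live, backup)              # stand-in for writing to offline media

    restored = Path(tmp, "restored")
    shutil.copytree(backup, restored)          # stand-in for a trial restore
    good_restore_ok = restore_matches(manifest, restored)

    (restored / "calendar.ics").write_text("garbage")   # simulate a bad restore
    bad_restore_ok = restore_matches(manifest, restored)
```

Here `good_restore_ok` comes out true and `bad_restore_ok` false. The manifest has to be stored somewhere the faulty software can't rewrite it, or the check proves nothing.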

[1] Snapshots, for those fileserver products that support them, can make this more convenient.  But the very nature of a snapshot’s online access, I’d argue, means it shouldn’t be your last line of defense.  Maybe these days your level-n “backups”, where n>0, are snapshots, and only level-0 goes all the way to tape (or external HDD, or flash storage, or whatever offline media you’ll happen to be using down the road).

Trey Harris is President of LOPSA. His blog comments do not necessarily reflect the views of LOPSA.

Submitted by syscomet on Thu, 2010-02-04 08:09.

There's an unofficial lab widget available for Google Calendar which adds an undelete option.

https://www.google.com/calendar/render?gadgeturl=http://www.stanford.edu/~marmaros/undelete.xml&pli=1

You should get a new box on the right-hand-side with a recycling trash-can in it. You can then choose one of your calendars and it will go look through the deleted events and present a list of them, for you to choose to recover.

There are no guarantees about how long a deleted event will remain available.

Submitted by trey on Thu, 2010-02-04 08:30.

I appreciate the thought. I left out the fact (which, now that I think about it, I definitely should have mentioned) that, prior to being deleted, all my appointments had their times and durations changed in lossy ways (every appointment was changed to 1:35-1:36am on its respective day). So even an undelete option does nothing for me.

In any case, undelete and backtrack options still can't obviate my point: software errors can cause irrevocable problems to any online mirror or "backup".

Submitted by syscomet on Fri, 2010-02-05 04:06.

Absolutely. There's high value in offline backups where each generation doesn't replace the previous one, provided they're frequent enough to catch the changes so that you don't lose too much data.
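
A rough Python sketch of what "generations" buy you, using the corrupt-then-delete sequence described above. The `Generations` class, the dates, and the retention counts are all invented for illustration and are not taken from any real backup product.

```python
from datetime import date

class Generations:
    """Keep the last `keep` backups; a new backup never overwrites an old one."""

    def __init__(self, keep):
        if keep < 1:
            raise ValueError("keep must be at least 1")
        self.keep = keep
        self._gens = []                      # (day, copy of data), oldest first

    def take(self, day, data):
        self._gens.append((day, dict(data)))
        del self._gens[:-self.keep]          # expire generations beyond `keep`

    def latest_before(self, day):
        """Most recent generation taken strictly before `day`, or None."""
        for taken, data in reversed(self._gens):
            if taken < day:
                return dict(data)
        return None

# Hypothetical history: good data, then corruption, then deletion.
history = [
    (date(2010, 1, 29), {"e1": "10:00-11:00"}),
    (date(2010, 1, 30), {"e1": "10:00-11:00", "e2": "14:00-15:00"}),
    (date(2010, 1, 31), {"e1": "01:35-01:36", "e2": "01:35-01:36"}),  # corrupted
    (date(2010, 2, 1), {}),                                           # deleted
]
corrupted_on = date(2010, 1, 31)

long_retention, short_retention = Generations(keep=3), Generations(keep=2)
for day, calendar in history:
    long_retention.take(day, calendar)
    short_retention.take(day, calendar)

recovered = long_retention.latest_before(corrupted_on)
too_late = short_retention.latest_before(corrupted_on)
```

With three generations kept, `recovered` is the last good calendar; with only two, `too_late` is `None` because every surviving generation already contains the damage. Retention has to outlast the time it takes to notice the problem.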

Offline backups are great, but it's good not to have to rely on them if you can avoid it. The loss of productivity while waiting for restores after disk failures means that it's often still worthwhile to use RAID or similar, and snapshotting filesystems, which let users undelete files without putting load on the backup maintainers, are invaluable. The undelete widget for Google Calendar falls somewhat into this class, but without the generational snapshots you get from something like WAFL. In any case, these are amelioration measures, *not* solutions. But neither are offline backups in isolation. So it's worth having a selection of tools at your disposal, knowing the merits and disadvantages of each, and knowing how to construct an entire system from them.

I'd expect offline backups to be part of the solution in most typical scenarios. The most obvious exception is something like a transient mail spool, where most mail should be on the system for less than a minute. There, offline backups just mean that you're fooling yourself, and the money would be better spent on resilience of the online storage and perhaps live replication.