Do you need backup?
And yes, that is a serious question. (Given the name of this blog, and my chosen specialization for the last 20 years, it is understandable if you think I was asking it in jest!)
A recent real-world situation makes me wonder. It also makes me wonder: at what point does something become a self-evident truth? And are we adequately equipped to answer this question with a logical and defensible answer based on data rather than opinion? Hmmm...
First, a story. One of my favourites in years of IT. And it really happened, about 12 or so years ago.
A certain organization had a badly aging mainframe. It was creaking badly (in the metaphorical sense). The maintenance cost way more than the lease on a newer system. It was bigger than a newer system--taking up too much space. It wouldn't run the most current operating system. And so on. And the only one who noticed was the lead sys admin. He pointed this out to management--who were apparently surprised by this state of affairs--and asked for funding for a new system. His manager asked him to justify the funding request. The sys admin left, and came in early the next morning to drop off a one-page business case for the upgrade. Actually, it was just three words (in 72 point text): "Think about it."
The need for an upgrade was painfully self-evident to the sys admin. But apparently not to the manager.
(And yes, they went ahead with the upgrade. I don't know if a "real" business case was ever prepared.)
So when I think somebody needs backup, I think it is reasonable to also ask: is this a self-evident "truth"? And if so, can that need be justified more thoroughly and specifically than with a simple "think about it"?
Justifying backup is all about understanding risk and cost. Risk of failure, and cost of recovery.
Which is safer: a car or a plane? I know a lot of folks who think driving is a lot safer than flying. And all of them are wrong. Flying is about 50 times safer than driving, when measuring fatalities per passenger mile traveled. For those of you interested in following up that stat, I refer you to the Wikipedia page on air safety.
I think that one of the reasons why some people hold cars to be safer than airplanes is that we (people) do a poor job of assessing and measuring risk. The car feels safer. It feels unnatural to fly in a giant aluminum tube that weighs hundreds of thousands of pounds.
But if we can't accurately assess risk when it comes to cars and planes, how can we know that it is self-evidently true that the mainframe upgrade is required? How can we accurately assess the risk of failure, and the cost of recovery, for IT systems?
Or, at the risk of playing to a stereotype or opening up a cultural divide, why can't non-technical managers understand the risk of not doing backups? (Not always, just some of the time. But even some of the time is too much.)
Another organization recently told me that they did not want to do backups for hundreds of TB of data, resident across several hundred hosts. The application was "only" in development, not yet in production, and it was OK if there was no backup. Yikes.
But without an accurate understanding of the cost of recovery (of an individual system) and the risk of failure, there is really no way to combat the perception that backup isn't needed in this case. Unfortunately, quantifying these is not a trivial exercise. The majority of organizations I speak to don't have any good understanding of how much a recovery costs, or what the risk and cost of a failure (data loss event) is. There exists no standard approach to quantifying these metrics and no common understanding of them, and many of the elements remain in the realm of soft costs.
In the particular case of the organization with hundreds of TB of data at risk, the metrics and data did not exist to quantify this risk and the costs associated with it. As a result, I had little better response to the problem than saying "well, you just need backup!" Why? Because! And it turns out that it is actually a very difficult thing to quantify and justify numerically and logically. (Bearing in mind that discussion of ITIL standards, best practices, and so on probably wouldn't carry a lot of weight with an exec running such a project.)
So the question should not be: "do you need backup?" It should be: "how much could it cost you if you don't have backup?"
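To make that second question a little more concrete, here is a rough sketch of the arithmetic in Python. It simply multiplies the chance of a failure by what a failure would cost, and compares the result to the cost of doing backup. Every number in it is made up for illustration (the host count, the failure probability, the rebuild cost, and the backup cost are all assumptions, not data from the organization above), and it ignores the soft costs that make the real exercise so hard.

```python
def expected_annual_loss(failure_probability_per_year, cost_per_failure):
    """Expected yearly cost of data loss: probability of failure times its cost."""
    if not 0.0 <= failure_probability_per_year <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return failure_probability_per_year * cost_per_failure


def backup_is_justified(expected_loss, annual_backup_cost):
    """Backup pays for itself when the expected loss exceeds what backup costs."""
    return expected_loss > annual_backup_cost


# Hypothetical inputs: replace these with your own estimates.
HOSTS = 300                     # assumed number of hosts holding data
P_HOST_LOSS_PER_YEAR = 0.02     # assumed chance a given host loses its data in a year
REBUILD_COST_PER_HOST = 15_000  # assumed cost to recreate one host's data by hand
ANNUAL_BACKUP_COST = 60_000     # assumed yearly cost of backing everything up

total_expected_loss = HOSTS * expected_annual_loss(
    P_HOST_LOSS_PER_YEAR, REBUILD_COST_PER_HOST
)

print(f"Expected annual loss without backup: {total_expected_loss:,.0f}")
print(f"Annual cost of backup:               {ANNUAL_BACKUP_COST:,.0f}")
print("Backup justified on these numbers:",
      backup_is_justified(total_expected_loss, ANNUAL_BACKUP_COST))
```

The sketch is only as good as its inputs, which is exactly the problem: most organizations don't have defensible values for the probability or the cost.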
There is a world of difference between assuming you need to do backup, and justifying it.