Barry Burke may want to riff on Britney Spears in his latest post, but I think I will channel Lucinda Williams instead. Besides being two or three decades older than Ms. Spears, Lucinda is also gifted with wisdom and passion, qualities sadly lacking in little Miss Train-Wreck-In-Slow-Motion-Spears. So, it is Lucinda's words that title this post.
And, I like to think that it is time for a little Peace, Love and Revolution in the world of backup and recovery. Mostly revolution though. The peace and love can come as we, as an industry, work to make things better.
The revolution we need is a re-think of our backup applications.
Because backup applications are not keeping up with what the hardware is capable of. I think that backup applications just haven't done a good job of maximizing the potential of virtual tape or deduplication appliances. Actually, let's be honest: they have done a terrible job of it.
I could go on and on about how virtual tape is better than physical tape, and aside from the issue of cost, I don't think I would get much argument from anybody. It is more reliable, faster to back up to, much faster to recover from, and more flexible.
Deduplication offers some of these benefits too. But it does two more things: it gives us an extremely cost-effective way of storing very large amounts of backup data, and it gives us an extremely cost-effective and low-bandwidth way of replicating data to a second site.
However, virtual tape is used in almost exactly the same way as physical tape is. Same processes and procedures. Same concept. I would argue that this is one of the reasons it has been so successful: if you don't want to, you don't have to change much of anything when you move to virtual tape, and your backup environment will just be better.
Fine. I guess I can forgive the application vendors if the applications don't treat virtual tape differently than physical tape, in the main.
But if this happens with deduplication systems, it will be really unfortunate.
Why do we have to work with the same processes and procedures, the same old operational same old, the same architectures, with deduplication systems?
Because backup applications insist on treating them this way.
Why can't I replicate data via the appliance and have my backup application recognize what just happened?
Why can't I create a second copy of the backup data with a different retention period that is strictly pointer-based? (Instead, I have to "dupe" or "vault" data which involves reading it from the appliance, through a media server, and back out to the same or a different device.)
Why can't appliances write a physical tape if required without a lot of intervention or intermediation from the application, and then let the backup application know when it is done?
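Take the second of those questions, the pointer-based copy. Here is a minimal sketch of what I mean, assuming a toy in-memory store. The class `DedupStore`, its method names, the fixed 4-byte chunks and the SHA-256 fingerprints are all my own inventions for illustration, not any vendor's API. The point is only that a second copy with its own retention period can be a list of pointers plus some reference counts, with no data read back through a media server.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupStore:
    """Hypothetical toy deduplication store (illustration only)."""
    chunks: dict = field(default_factory=dict)    # fingerprint -> chunk bytes
    refcount: dict = field(default_factory=dict)  # fingerprint -> references held
    copies: dict = field(default_factory=dict)    # copy name -> (fingerprints, retention days)

    def write_backup(self, name, data, retention_days, chunk_size=4):
        """Chunk the data, store each unique chunk once, keep an ordered recipe."""
        recipe = []
        for i in range(0, len(data), chunk_size):
            piece = data[i:i + chunk_size]
            fp = hashlib.sha256(piece).hexdigest()
            self.chunks.setdefault(fp, piece)
            self.refcount[fp] = self.refcount.get(fp, 0) + 1
            recipe.append(fp)
        self.copies[name] = (recipe, retention_days)

    def pointer_copy(self, source, name, retention_days):
        """Create a second copy with its own retention without moving any data."""
        recipe, _ = self.copies[source]
        for fp in recipe:
            self.refcount[fp] += 1
        self.copies[name] = (list(recipe), retention_days)

    def expire(self, name):
        """Drop a copy; a chunk is freed only when no copy references it."""
        recipe, _ = self.copies.pop(name)
        for fp in recipe:
            self.refcount[fp] -= 1
            if self.refcount[fp] == 0:
                del self.refcount[fp]
                del self.chunks[fp]

    def read(self, name):
        """Rehydrate a copy from its recipe."""
        recipe, _ = self.copies[name]
        return b"".join(self.chunks[fp] for fp in recipe)
```

In this sketch, expiring the short-retention copy leaves the long-retention copy fully readable, because the chunks it points to are still referenced. What is missing, and what I am asking for, is a backup application that knows about both copies.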
Let me pick on the first one here in detail. At the moment, when I use deduplication, I have two choices on how to replicate my data to a second site.
And both of them are bad.
The first choice is that I let my backup application do the replication. This is good because it means my application now "knows" that I have a second copy of the data at a second location. But it is bad, because as soon as I replicate data with the application, I have to rehydrate the data before I transmit it to the second site. This means that I lose all the bandwidth savings offered by deduplication. That is really, really bad.
The second choice is that I use my deduplication appliance to replicate the data. So far, so good. However, when I do that, I end up with a second virtual tape at the target site with the same virtual bar code as the tape at the primary site. And, to make matters worse, my backup application doesn't even know the second copy exists. Which makes it almost impossible to utilize in an ongoing operational process (such as doing restores for QA or test/dev databases) and very difficult to use in a disaster recovery process. This is really, really bad too.
Why can't my backup application instruct the appliance to replicate, and then make an entry in its catalog of retained data? Or why can't my appliance notify the backup application of the second, remote copy?
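To put a rough shape on that loss, here is a small sketch that compares the two transfer sizes. Everything in it is an assumption for illustration: the function names are hypothetical, the chunks are a fixed 4 bytes, and it ignores the (small) cost of exchanging fingerprints. A rehydrated transfer sends every byte of the backup; a deduplicated transfer sends only the chunks the remote site does not already hold.

```python
import hashlib


def fingerprint(piece):
    """Hypothetical chunk fingerprint; SHA-256 is an assumption."""
    return hashlib.sha256(piece).hexdigest()


def bytes_sent_rehydrated(data):
    """Application-driven replication: the full, rehydrated stream crosses the wire."""
    return len(data)


def bytes_sent_deduplicated(data, remote_fingerprints, chunk_size=4):
    """Appliance-driven replication: send only chunks the remote site lacks."""
    known = set(remote_fingerprints)
    sent = 0
    for i in range(0, len(data), chunk_size):
        piece = data[i:i + chunk_size]
        fp = fingerprint(piece)
        if fp not in known:
            sent += len(piece)
            known.add(fp)  # a chunk sent once never needs sending again
    return sent
```

With a toy 16-byte backup of four chunks, one repeated and one already at the remote site, the rehydrated path sends all 16 bytes while the deduplicated path sends 8. Real ratios depend entirely on the data, so read this as the mechanism, not as a measurement.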
And how do I coordinate this with the database used by the backup application to hold its catalog of data protected, virtual and physical tapes, and so on?
Answer: at the moment, I can't.
More precise answer: I can, but it is an enormous pain, and it means constructing something like consistency groups that involve both the disk which stores the backup application database, and the disk which holds either the virtual tape, or the deduplicated data.
Which is crazy difficult, expensive, and explains why almost nobody does this.
There has to be a better way. There is a better way. But we need the backup application vendors to step up. We need them to admit that the hardware is now capable of much more than the backup applications. We need them to realize that we can do things better.
Can you imagine how easy backup would be if I could back up to a virtual tape appliance? Deduplicate the data. Have my backup application instruct the appliance to replicate. Transmit the deduplicated data (only). And instruct a remote backup server that it now has a copy of the data, to do with as it wishes. And inform it of the appropriate database and catalog entries for that data. And checkpoint them with the replicated data.
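For what it is worth, the workflow I am imagining is not complicated. The sketch below is purely hypothetical: `Appliance`, `Catalog` and `replicate_and_catalog` are names I made up, fingerprints are plain strings, and no backup application or appliance I know of exposes an interface like this. It also leaves out the checkpointing step entirely. It only shows the coordination: the application tells the appliance to replicate, only missing chunks move, and both the local and the remote catalog learn about the new copy.

```python
from dataclasses import dataclass, field


@dataclass
class Appliance:
    """Hypothetical deduplication appliance at one site."""
    site: str
    chunks: dict = field(default_factory=dict)   # fingerprint -> chunk bytes
    recipes: dict = field(default_factory=dict)  # backup id -> ordered fingerprints

    def replicate(self, backup_id, target):
        """Copy one backup to the target, sending only chunks it lacks."""
        recipe = self.recipes[backup_id]
        missing = {fp for fp in recipe if fp not in target.chunks}
        for fp in missing:
            target.chunks[fp] = self.chunks[fp]
        target.recipes[backup_id] = list(recipe)
        return len(missing)  # number of chunks that actually crossed the wire


@dataclass
class Catalog:
    """Hypothetical backup application catalog: which sites hold which backup."""
    entries: dict = field(default_factory=dict)  # backup id -> set of site names

    def record_copy(self, backup_id, site):
        self.entries.setdefault(backup_id, set()).add(site)


def replicate_and_catalog(backup_id, source, target, catalogs):
    """The application drives replication, then tells every catalog about the copy."""
    sent = source.replicate(backup_id, target)
    for catalog in catalogs:
        catalog.record_copy(backup_id, target.site)
    return sent
```

The design choice that matters here is that the application initiates the replication but the appliance moves the data. That is what keeps the bandwidth savings and the catalog knowledge in the same workflow, instead of forcing me to pick one.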
There is a synergy in making deduplication and virtual tape work in harmony with the backup application. We, as an industry, and we, as backup administrators, operators, and architects, NEED that synergy.
We need the revolution. We need things to be better.
It is not every day that a real opportunity to make things better comes along. Virtual tape and deduplication have given backup application developers that opportunity.
To waste the opportunity would be a tremendous shame.
So it is time for a little out-of-the-box thinking. It is time to make the world of backup a better place.
It is time for a little peace, love and revolution.
P.S. This is as vendor agnostic a post as I will ever make. My plea goes out to each and every backup application vendor: EMC, IBM, Symantec, and all the rest of them. As far as I know, none of them comes remotely close to getting it right. And it is about time that somebody did.