I was doing a presentation last week and happened to be discussing the benefits of deduplication on disk with a group of customers. I made the comment at one point that if you are a C-level executive who doesn't know a lot about backup, and probably doesn't care a lot about the technical aspects of backup, there are a couple things that make it easy to understand why backup to disk with deduplication is such a good thing.
The first is that you can take a data centre floor full of tape and put all of that data on a couple of racks of disk. Take 12,000 tapes and put all that data on fewer than 400 disk drives, literally consolidating some 800 square feet of infrastructure into 10 or 20 square feet. That is an easy visual, I think, and it speaks to the level of consolidation that is possible. Much like server consolidation, this approach can offer some truly dramatic savings in floor space.
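To put rough numbers on that visual, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above. The variable names are mine, and treating 400 drives as an upper bound (since the figure was "fewer than 400") is an assumption, so read the results as approximate ratios rather than measurements.

```python
# Figures quoted in the post; nothing here is measured data.
tapes = 12_000
disk_drives = 400             # "fewer than 400", so treated as an upper bound
tape_floor_sqft = 800
disk_floor_sqft = (10, 20)    # quoted range for the disk footprint

# How many tapes' worth of data each drive absorbs, at minimum.
tapes_per_drive = tapes / disk_drives

# Floor space reduction factor at each end of the quoted range.
floor_reduction = [tape_floor_sqft / sqft for sqft in disk_floor_sqft]

print(f"At least {tapes_per_drive:.0f} tapes consolidated per disk drive")
print(f"Floor space shrinks by a factor of "
      f"{min(floor_reduction):.0f} to {max(floor_reduction):.0f}")
```

In other words, each drive stands in for roughly 30 tapes, and the footprint shrinks by somewhere between 40 and 80 times.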
I went on to comment that, as a result, for short and medium term backups, data deduplication products had not only become the technology of choice over tape for the majority of customers, they had done so at a price point roughly equivalent to tape.
And at that point one gentleman raised his hand and said something like: "Hey, I am one of those C-level folks you were just talking about, and I don't know much about backup other than as a line item in my budget. So tell me: is disk deduplication really cost equivalent to tape, or is that just a line?"
I replied as honestly as I knew how and told him that it was absolutely not just a line. And I gave him some perspective on this.
Over the last four or five years I think a reasonable consensus has emerged regarding the cost of disk based backup.
Initially, when all we had were VTLs without deduplication it was very difficult to justify disk backup with more than one or two weeks of retention. It was too expensive and tape was just so much less expensive for longer retentions (as the cost of the disk capacity and infrastructure rapidly added up with increased retention periods). Nevertheless, virtual tape libraries were very popular as they solved many of the problems associated with tape: reliability, performance, speed of recovery, manageability, and so on. As a result, not too many people had a problem cost justifying VTLs, as long as only one or two weeks of data were going to be retained.
And that was the state of affairs for a couple of years.
Then we added deduplication to the mix, and there was another change. The second wave of widespread adoption of disk as a backup target began. Disk based targets became cost justifiable for retention periods of, on average, three to six months. Perhaps as much as a year in some cases. Beyond that, the cost advantages of tape again become too significant to ignore. And as a result, in most cases, this longer term retention remains the domain of tape. But for shorter periods, disk with deduplication gives us all the benefits of disk, including reliability and inexpensive replication.
Fair enough.
The final part of my answer dealt with the one major assumption I make when I say that disk is cost competitive for short and medium term retention: that when we do the TCO analysis, you are considering an upgrade or replacement of your tape based infrastructure in the next two or three years. Because if you are not, and your current infrastructure is meeting your needs, then it becomes very difficult to cost justify buying something else. (Notice that I said cost justify. There are still a lot of other good reasons to move to disk besides cost: better reliability, faster backups, better DR, lower operational effort, and so on.)
But assuming that at some point you are going to outgrow your current tape environment, I think it is pretty easy to justify the move to disk with deduplication.
So with that said I think it is very reasonable to conclude that you can deploy disk with deduplication for backup at a total cost of ownership that is equivalent to tape or less.
Which leaves tape with only one remaining use case: the very long term retentions that are not cost justifiable on disk. For now, anyway.