Today is a big day. Today something different happened—it is not all just about bigger, better, faster (although there is some of that too, just see my other post!).
Today is the day that EMC is introducing the industry's first disk based long term retention system for backup and archive data: the Data Domain Archiver.
The DD Archiver is a fundamentally different type of Data Domain product. The DD Archiver is a product designed for very long term retention of backup data. Something that, up to this point, has been the more or less exclusive province of tape. Tape has owned this market, relatively unchallenged, for one big reason: economics.
Due almost entirely to reasons of cost, disk has been utilized primarily for storing data with shorter retention periods. I alluded to this in an earlier post: disk with a VTL interface and without deduplication would typically store data for a week or so. Disk with deduplication can economically extend that retention period from a single week to a few weeks or months. But for longer term retentions, tape has continued to be the primary media choice.
The Data Domain Archiver will change that.
The DD Archiver offers a way for data to be retained on disk for long periods of time—many years in some cases—at a cost point that is similar to that of tape. The DD Archiver is cost optimized specifically for the retention of data that needs to be kept for long periods of time.
What kind of data will this be? The typical use case will be for the long term retention of "backup" data. Backup data that has a seven year retention, for example.
And the DD Archiver can store a lot of this data: up to 28.5 PB in a single system. Given that a single system occupies just two racks, this is a staggering level of information density. That same 28.5 PB would take nearly 18,000 LTO-4 tape cartridges (at 2:1 compression). 18,000 LTO-4 cartridges would normally occupy three very large tape libraries, which in turn would require two hundred linear feet or more of rack space in a data centre.
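For anyone who wants to check the cartridge count, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes) and the LTO-4 native capacity of 800 GB per cartridge; the function and constant names are mine, purely for illustration.

```python
# Back-of-the-envelope check of the cartridge count quoted above.
# Assumptions: decimal units, 800 GB native capacity per LTO-4 cartridge,
# and the 2:1 compression ratio mentioned in the text.

GB = 10**9
PB = 10**15

LTO4_NATIVE_BYTES = 800 * GB   # native (uncompressed) LTO-4 capacity
COMPRESSION_RATIO = 2          # 2:1, as stated in the post


def cartridges_needed(capacity_bytes: int,
                      native_bytes: int = LTO4_NATIVE_BYTES,
                      ratio: int = COMPRESSION_RATIO) -> int:
    """Return the number of whole cartridges needed to hold capacity_bytes."""
    effective = native_bytes * ratio
    # Integer ceiling division: a partially filled cartridge still counts.
    return -(-capacity_bytes // effective)


archiver_bytes = 285 * PB // 10    # 28.5 PB, kept as an integer
print(cartridges_needed(archiver_bytes))  # 17813
```

At 1.6 TB of effective capacity per cartridge, that works out to 17,813 cartridges, which is where "nearly 18,000" comes from.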
The DD Archiver also offers a high level of performance to store and access information: write performance is up to a maximum of 9.8 TB/hr.
But the most significant aspect of the DD Archiver, besides the very low dollar per TB cost, is the fact that it is a Data Domain system. It is no different in any significant way than any other Data Domain system currently offered. It utilizes the same Data Domain operating system. It leverages the same Data Domain Data Invulnerability Architecture (with a couple of enhancements for data that is to be stored for a very long period of time). It offers the same replication capabilities as any other Data Domain system—and it can therefore be the recipient for data from many smaller Data Domain systems in the field. It offers the ability to apply DD Retention Lock to the resident data, to prevent accidental deletion or destruction. It uses SISL to perform the actual data deduplication, and compresses the data after deduplication for capacity optimization. It utilizes RAID-6 for a further level of data protection.
The DD Archiver also includes some interesting technology to ensure the integrity of data over time. As data ages within the DD Archiver it is moved from an active tier of disk to an archive tier. As the first archive tier fills up, an additional archive tier can be deployed. And so on, until the maximum system capacity is reached. Each of these tiers isolates faults within that tier so that even if something catastrophic were to happen resulting in the partial or complete loss of a tier, the data in the other tiers would not be impacted.
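To make the tiering idea concrete, here is a toy model of it. This is emphatically not how the DD Archiver is implemented; it is a minimal sketch, and every name in it (`Tier`, `TieredStore`, `migrate`, `lose_tier`, the capacities and the age threshold) is hypothetical. It only illustrates the behaviour described above: aged data moves from an active tier to an archive tier, a new archive tier is added when the current one fills, and losing one tier leaves the others untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """A hypothetical, self-contained unit of storage."""
    name: str
    capacity: int
    files: dict = field(default_factory=dict)  # file name -> size

    def has_room(self, size: int) -> bool:
        return sum(self.files.values()) + size <= self.capacity


class TieredStore:
    """Toy model: one active tier plus archive tiers added on demand."""

    def __init__(self, tier_capacity: int, max_archive_tiers: int):
        self.tier_capacity = tier_capacity
        self.max_archive_tiers = max_archive_tiers
        self.active = Tier("active", tier_capacity)
        self.archive = []   # archive tiers, oldest first
        self.ages = {}      # file name -> age in days (active tier only)

    def write(self, name: str, size: int, age_days: int = 0) -> None:
        if not self.active.has_room(size):
            raise RuntimeError("active tier is full")
        self.active.files[name] = size
        self.ages[name] = age_days

    def _target_tier(self, size: int) -> Tier:
        # Fill the newest archive tier first; deploy another when it is full.
        if self.archive and self.archive[-1].has_room(size):
            return self.archive[-1]
        if len(self.archive) >= self.max_archive_tiers:
            raise RuntimeError("maximum system capacity reached")
        tier = Tier(f"archive-{len(self.archive) + 1}", self.tier_capacity)
        self.archive.append(tier)
        return tier

    def migrate(self, min_age_days: int) -> None:
        """Move every active file at least min_age_days old to an archive tier."""
        aged = [n for n, age in self.ages.items() if age >= min_age_days]
        for name in aged:
            size = self.active.files[name]
            self._target_tier(size).files[name] = size
            del self.active.files[name]
            del self.ages[name]

    def lose_tier(self, index: int) -> set:
        """Simulate the complete loss of one archive tier; return the lost names."""
        lost = set(self.archive[index].files)
        self.archive[index].files = {}
        return lost


store = TieredStore(tier_capacity=100, max_archive_tiers=2)
store.write("backup-a", 60, age_days=400)
store.write("backup-b", 30, age_days=400)
store.migrate(min_age_days=365)   # both fit in the first archive tier
store.write("backup-c", 50, age_days=400)
store.migrate(min_age_days=365)   # first tier is nearly full, so a second is deployed
lost = store.lose_tier(0)         # only the first tier's files are affected
```

After the simulated failure, `lost` contains only the two files that lived in the first archive tier, and the file in the second tier is still there. That is the fault isolation property in miniature.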
I will have more to say about the DD Archiver in two follow-up posts: the first dealing with TCO and why I think it is fair to make the claim that the TCO of DD Archiver is roughly equivalent to that of tape; the second on some of the more technical aspects of the DD Archiver architecture.
For now, I will conclude by noting that I do think this is the most significant announcement EMC's Backup Recovery Systems division made today. More significant than the DD890 and the Data Domain Global Deduplication Array. Why? Simply because if we have been successful in achieving our goal of offering a disk system for long term data retention at a cost point that is competitive with tape, that is a very significant achievement, and one that has the potential to change the landscape. Because if you could keep your data cost effectively on disk rather than tape, why wouldn't you?
Integrity of data over a long time was an issue that wasn't addressed properly when it comes to archiving. Great post.
Posted by: email archiving appliance | August 30, 2011 at 02:40 AM
Mr. Appliance;
Data integrity on the Data Domain Archiver is provided by the same exceptionally sound architecture as a regular Data Domain appliance: the Data Invulnerability Architecture. All data is protected in multiple ways, including RAID, data-at-rest consistency checking, and multiple levels of hashing; the system is self-healing and does not propagate errors. The data integrity features of the Data Domain architecture are amongst the strongest and most secure of any storage system of any kind.
Posted by: Scott Waterhouse | August 31, 2011 at 08:52 AM