Apparently the folks over at Sepaton noticed my comments about the deduplication approach of DeltaStor, and have responded with additional information.
First off: welcome to the blogosphere, JL. It is great to see Sepaton participate in the conversation. In the hope that you would rather engage in dialogue than devolve the discussion into a vitriolic and hyperbolic FUD-fest a la NetApp (and in the belief that that is probably what my readers would prefer too), I am going to make an earnest attempt to keep this on the level and refer only to the specific claims I made and the counter-claims. In doing so, we can hopefully see some interesting things about deduplication and return on investment along the way.
The Sepaton post is here for those that are interested.
The first claim JL makes is this:
Given that there are at least 5 or 6 common backup applications, each with at least 2 or 3 currently supported versions, and probably 10 or 15 applications, each with at least 2 or 3 currently supported versions, the number of combinations approaches a million pretty rapidly
This is a tremendous overstatement. HP’s (and SEPATON’s) implementation of DeltaStor is targeted at the enterprise. If you look at enterprise datacenters there is actually a very small number of backup applications in use. You will typically see NetBackup, TSM and much less frequently Legato; this narrows the scope of testing substantially.
This is not the case in small environments where you see many more applications such as BackupExec, ARCserve and others which is why HP is selling a backup application agnostic product for these environments. (SEPATON is focused on the enterprise and so our solution is not targeted here.)
OK: the text in italics is from my (original) post, and the following two paragraphs are the response. And he is right. Sort of. What can I say, combinations and permutations were never my strong suit.
A more accurate number would be found by: (number of backup applications) x (number of current supported versions of those applications) x (number of major applications) x (number of current supported versions of those applications) x (backup application agent support).
If I redo the math, we can see that: there are at least four backup applications that I regularly run into with my enterprise customers: TSM, Networker, NetBackup, and OmniBack. There are usually three supported versions of these at any one time. There are at least ten major applications and databases (Oracle, DB2, Exchange, Notes, SAP, SharePoint, Documentum, SQL, etc.). Each has, again, two or three currently supported versions. Finally, for every database backup, you typically have a choice as to whether to run native, through the backup agent, or in conjunction with a third-party agent. (And we have excluded minor applications, databases, and backup applications--even though it would not surprise me to learn that this is 25% of the enterprise market.)
Therefore, the correct math is: 4 x 3 x 10 x 3 x 2 = 720.
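For anybody who wants to check that arithmetic, here is a quick sketch in Python (the counts are just my estimates from above--nothing more authoritative than that):

```python
# Support matrix arithmetic, using the estimates from the post above
# (rough estimates, not an authoritative market survey).
backup_apps = 4          # TSM, Networker, NetBackup, OmniBack
backup_versions = 3      # currently supported versions of each
major_apps = 10          # Oracle, DB2, Exchange, Notes, SAP, SharePoint, ...
app_versions = 3         # currently supported versions of each
agent_choices = 2        # native vs. agent-based backup (call it a factor of two)

combinations = backup_apps * backup_versions * major_apps * app_versions * agent_choices
print(combinations)      # 720 configurations to test and support
```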
So, putting hyperbole aside, the support situation (and, just as importantly, the mandate to test every one of those configurations) is a pretty heavy burden. JL also dismisses my argument that "there are many different versions of supported backup applications and many different application modules" as bogus: "This is true ... it is misleading. The power of backup applications is their ability to maintain tape compatibility across versions." Which is only partially true. Tape formats do change. More seriously (and I am trying to avoid FUD as hard as I can here...) I think that certain vendors have a vested interest in their approach to deduplication succeeding, and alternative approaches failing. Would those vendors deliberately change tape format to ensure that? Well, who knows?
At the end of the day, I think you can fairly choose between a deduplication appliance that supports data from any source and solutions that only support data from some sources, some of the time. Everything else being equal, I think we would all choose the former. (And when I say any source, any time, that is more or less true, excluding only oddball solutions like iSeries--and before anybody gets agitated about that, I know there are hundreds of thousands of iSeries boxes out there; it is just that they rarely get protected by the same backup and restore application and infrastructure as "Open" systems!)
The second major claim JL makes is:
do you want to buy into an architecture that severely limits what you can and cannot deduplicate? Or do you want an architecture that can deduplicate anything?
I would suggest an alternative question: do you want a generic deduplication solution that supports all applications reduces your performance by 90% (200 MB/sec vs 2200 MB/sec) and provides mediocre deduplication ratios or do you want an enterprise focused solution that provides the fastest performance, most scalability, most granular deduplication ratios and is optimized for your backup application??
Again, italics are my original text, the second paragraph is JL's. So... two claims here really: one, general purpose deduplication is slow; two, general purpose deduplication provides worse deduplication ratios than a targeted approach. (There is a third too, that of scalability, but I dealt with that, at least in a tangential way, in my post "How Big is Big Enough?")
Well, "slow" is a relative term. Let me just point out that EMC offers a VTL that deduplicates that can ingest data at 2,200 MB/s. Given that such a VTL would perform roughly as well as 50 LTO3 drives in the real world (because there are a lot of things which contribute to it being difficult to impossible to sustain the rated throughput of a tape drive over 8 hours consistently) I think it is tough to describe that as slow. But, again, it is a subjective term. So your mileage may vary!
With respect to deduplication ratios: this is interesting. I think the claim here is, more accurately: general purpose deduplication achieves lower deduplication ratios than targeted deduplication such as DeltaStor. I have two comments about such a claim:
- Show me the evidence! I have yet to see that targeted, application-specific deduplication makes much of a difference. Sometimes it can, but as a general rule, it does not.
- More importantly, I don't think it matters. To make this specific, let's just assume that the claim is accurate. Let's assume that "general" deduplication like EMC's can achieve 25:1 on a given data set. Let's also assume that DeltaStor deduplication achieves 50:1. Twice as good! So how much storage does this save us? About 2 TB per 100 TB of data backed up. That's right, 2%. Two disk drives (using the 1 TB drive sizes currently in our deduplication products). Big deal. The more interesting question that falls out of this is: is it worth it? If I have to accept a limited support matrix, one that is difficult to maintain and keep up to date, would I accept that for a gain of a mere 2%? Finally, I should note that crediting targeted deduplication solutions like DeltaStor with a 2x deduplication advantage is being exceedingly charitable. It is far more likely that this technology will help you gain 10% on your deduplication ratios--but thinking of it in terms of the "worst" case is instructive in that it lets us see how inconsequential additional gains to deduplication ratios are after you get past 20:1 or so. The sketch below puts numbers on this.
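To put numbers on that diminishing-returns point, here is a minimal sketch using the assumed ratios above (remember, 25:1 and 50:1 are hypothetical, not measured results):

```python
# How much disk does a better deduplication ratio actually save?
backed_up_tb = 100.0

def stored_tb(ratio):
    """Physical capacity needed to hold backed_up_tb at a given dedupe ratio."""
    return backed_up_tb / ratio

general = stored_tb(25)    # 4.0 TB on disk at an assumed 25:1
targeted = stored_tb(50)   # 2.0 TB on disk at an assumed 50:1

print(general - targeted)                         # 2.0 TB saved...
print(100 * (general - targeted) / backed_up_tb)  # ...or 2% of the data protected

# The curve flattens fast: 10:1 -> 20:1 saves 5 TB per 100 TB,
# but 20:1 -> 50:1 only saves another 3 TB.
for ratio in (10, 20, 25, 50):
    print(ratio, stored_tb(ratio))
```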
My conclusion, and, incidentally, the conclusion that EMC engineering has come to over the last several years of dialogue, is that it is not worth it. We are willing to make a small sacrifice in terms of deduplication ratios, in the rare cases where this is actually true, to achieve the benefit of a general purpose deduplication device that can deduplicate any data, from any source, from any backup application, and do it in-line or post-process.
You mention that "EMC offers a VTL that deduplicates and can ingest data at 2,200 MB/s" but you do not mention how fast that data is deduplicated. My understanding is that it's approximately 1/5th of the ingest rate you advertise. Therefore, while you can ingest data at 2200 MB/s, you can only do so for 4-5 hours per day if you want to dedupe it all. Assuming a typical 12-hour backup window and a 24-hour dedupe window, my math puts the "real" ingest rate at less than half of what you mentioned.
Posted by: W. Curtis Preston | July 28, 2008 at 03:19 PM
DL4000 3D performance really has two important dimensions: VTL performance and deduplication performance. Because the VTL is the only interface that customers will interact with, it is an important one. And it is more than just a cache for out-of-band deduplication, because we allow deduplication to happen on a schedule, meaning that data written to the VTL might be deduplicated a day later, a week later, or never. It depends on your SLAs, it depends on your restore requirements, and it depends on how long you want to retain the data. So far, in enterprise backup environments (clearly the environments the system is aimed at), this message has seemed to make sense to the customers I talk to: they do have different requirements for different applications, tiers of storage, and so on.
So, VTL performance is: 2,200 MB/s native. We can actually do a fair bit better than that based on "real world" internal tests, but that is the easily achievable number we choose for marketing purposes. The other important way to describe performance is 1,600 MB/s with hardware compression enabled (and most people do enable it for the capacity benefits).
Finally, the most conservative way to estimate compound performance (when there are simultaneous reads and writes) is to assume that they won't add up to more than 1,600 MB/s with compression on. In truth, they often do better--you might be able to get 1,000 MB/s of write at the same time as you get 800 MB/s of read, for example. But to avoid setting unrealistic expectations that this is "easy" or will happen in every circumstance, let's say that aggregate read and write will not add up to more than 1,600 MB/s.
Deduplication performance is 400 MB/s. That can run 24 hours a day, because it is a post-process deduplication. I typically don't recommend a scenario in which it would run for more than 20 hours a day on average, because I want to leave room for future growth, for restore requests, and the like. But at 20 hours per day, that is, roughly, 30 TB per day of deduplication capability.
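For reference, the arithmetic behind that "roughly 30 TB per day" figure is just this (a sketch, not a spec sheet):

```python
# The arithmetic behind "roughly 30 TB per day" of deduplication.
dedupe_mb_s = 400     # post-process deduplication rate
hours_per_day = 20    # leaving headroom for restores and future growth

tb_per_day = dedupe_mb_s * 3600 * hours_per_day / 1_000_000
print(tb_per_day)     # 28.8 TB/day -- call it roughly 30 TB
```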
Posted by: Scott Waterhouse | July 29, 2008 at 11:29 AM
If the device can only dedupe 30 TB a day, then it can only ingest 30 TB a day -- assuming you're going to dedupe it all. Therefore, if you ingest data at 2200 MB/s, you can only do so for 3.8 hours.
Here's my math:
30,000 GB ÷ 2.2 GB/s = 13,636 seconds, or about 3.8 hours
But, if you need to use it for 12 hours (a typical backup window), it could only ingest data at about 700 MB/s.
Here's my math:
30,000 GB × 1,000 MB/GB ÷ (12 hr × 3,600 s/hr) = 694 MB/s
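Or, as a quick sketch that checks both numbers (using the 30 TB/day and 2,200 MB/s figures from above):

```python
# Checking both calculations above.
daily_dedupe_gb = 30000      # ~30 TB/day of deduplication capability
ingest_gb_s = 2.2            # 2,200 MB/s advertised ingest rate
backup_window_hr = 12        # a typical backup window

# How long can you ingest at full speed before outrunning a day's dedupe?
full_speed_hours = daily_dedupe_gb / ingest_gb_s / 3600
print(full_speed_hours)      # ~3.8 hours

# What sustained ingest rate fits a 12-hour window if it all gets deduped?
sustained_mb_s = daily_dedupe_gb * 1000 / (backup_window_hr * 3600)
print(sustained_mb_s)        # ~694 MB/s
```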
Yes, I know that the system has the capability to NOT dedupe some data, and that data could/should be excluded from these calculations, but since this system will likely be compared to other dedupe systems, it's important to understand its "real" throughput number if all data is to be deduped.
One final note: I also think/know that it will take time to move the data from the ingest VTL to the dedupe VTL (due to the 4000's "unique" design), and that this time could also impact the amount of data that can be deduped in a day, but I'm not sure how to factor that in.
Posted by: W. Curtis Preston | July 30, 2008 at 11:38 AM
Curtis;
I think you need to distinguish between ingest for the VTL and ingest for the dedupe engine that is bonded to the VTL. The device can ingest at 2,200, 1,600, or 1,200 MB/s (native; with compression; or with compression plus a simultaneous feed to the deduplication engine). Up to 30 TB/day of this can be deduplicated. The rest can be held for as long as you want (and as the maximum of 675 usable TB in the VTL permits).
Using an intake number of 1,200 MB/s (inclusive of compression and a simultaneous write to the deduplication engine), that is about 4 TB/hour. So the system can deduplicate in a day about what would be written in an 8-hour backup window.
In reference to your last sentence, all numbers discussed assume that the system is also reading or writing from the VTL--that is, the deduplication engine can ingest data at 400 MB/s irrespective of whatever else may be going on on the VTL.
So what is the real logic of all this? Three things: not everything ingested will deduplicate well (a high change rate database, for example). Not everything will be kept long enough to justify putting it on deduplicated storage. And some things you will want to be able to restore faster than deduplicated storage permits--so you want to leave them on the VTL for the period of time during which you want high speed restore.
Net net? The 4406 3D can ingest more than it can dedupe in a day. There is a very big VTL space of 675 TB to write data to, exclusive of the additional 148 TB of deduplicated storage. But yes, if you wrote at 2,200 MB/s (because you only cared about performance), you could write more to the VTL than the deduplication engine can ingest in a day.
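For completeness, here is the same arithmetic from my side, using the 1,200 MB/s intake number (the 8-hour window is just an assumption for illustration):

```python
# Intake vs. deduplication capacity, using the numbers above.
intake_mb_s = 1200        # ingest with compression plus a simultaneous feed to the dedupe engine
backup_window_hr = 8      # an assumed backup window for illustration
daily_dedupe_tb = 30      # ~what the post-process engine can absorb per day

tb_per_hour = intake_mb_s * 3600 / 1_000_000
written_tb = tb_per_hour * backup_window_hr

print(tb_per_hour)        # ~4.3 TB/hour of intake
print(written_tb)         # ~34.6 TB in an 8-hour window, in the same ballpark as daily_dedupe_tb
```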
Posted by: Scott Waterhouse | July 31, 2008 at 08:25 AM