
March 03, 2009




Hi Scott,

Would you mind clarifying something? Using your figures, if the 4406 took in 64 TB in 8 hours but can only dedupe 48 TB in 24 hours, what happens to the remaining data that's not deduped?

Or is that not a real scenario, since the dedupe engine would eventually catch up on the following days (unless a customer has an unreal change rate of 100%)?

Scott Waterhouse

A DL4406 can do 8 TB/hr or 64 TB in 8 hours.

The deduplication engines can do 3 TB/hr, or 48 TB in 16 hours.

Two scenarios could happen:

1) You could let the deduplication process catch up during the next 8 hour backup window. 3 TB/hr over 24 hours means we can deduplicate 72 TB in 24 hours. (Truthfully, that will impact your backup window to a small degree in the sense that if a DL4406 is both ingesting data and migrating it through the deduplication engine, this will slow down ingest by a small amount.)

2) If I picked a backup window of 6 and a half hours, then I could back up 52 TB and deduplicate 52.5 TB, with absolutely no overlap in the processes.

I picked the 8 hour window to illustrate the point that if you really want to get a lot of data "committed" to your backup target, a DL4406 lets you do that and then deduplicate at a more leisurely pace in a way that doesn't impair your ability to meet your backup window.
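The arithmetic in this reply can be checked with a short Python sketch. It uses only the two rates quoted above (8 TB/hr ingest, 3 TB/hr deduplication) and assumes, as a simplification, that both rates are constant and that deduplication runs only outside the backup window. The function and variable names are illustrative, not anything from the product.

```python
INGEST_TB_PER_HR = 8.0   # DL4406 ingest rate quoted above
DEDUPE_TB_PER_HR = 3.0   # deduplication rate quoted above
HOURS_PER_DAY = 24.0


def daily_balance(window_hr):
    """Return (TB ingested, TB of dedupe capacity) for one day.

    Assumes dedupe runs only outside the backup window (no overlap).
    """
    ingested = INGEST_TB_PER_HR * window_hr
    dedupe_capacity = DEDUPE_TB_PER_HR * (HOURS_PER_DAY - window_hr)
    return ingested, dedupe_capacity


def break_even_window_hr():
    """Longest window for which dedupe still keeps up with ingest.

    Solves ingest * w = dedupe * (24 - w) for w.
    """
    return DEDUPE_TB_PER_HR * HOURS_PER_DAY / (INGEST_TB_PER_HR + DEDUPE_TB_PER_HR)


for window in (8.0, 6.5):
    ingested, capacity = daily_balance(window)
    backlog = max(0.0, ingested - capacity)
    print(f"{window} hr window: ingest {ingested} TB, "
          f"dedupe capacity {capacity} TB, backlog {backlog} TB")

print(f"break-even window: {break_even_window_hr():.2f} hr")
```

Under these assumptions an 8 hour window leaves a 16 TB backlog to carry into the next day, while any window up to about 6.5 hours leaves none.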

Craig Frasa


When you reference an onboard media server are you talking about running a product like Gresham Distributape or TSM Media manager?


Scott Waterhouse


In this case I was thinking specifically of a NetBackup Media Server or a NetWorker Storage Node. The Gresham idea is an interesting one (not currently supported, and I suspect that has to do with how much demand there is for it); TSM also makes sense. However, I don't think IBM has any enthusiasm at all for the idea of running TSM code on an EMC device.


In a well managed backup environment, a backup admin will attempt to normalize the backup workload. For example, this means not doing all his Full Backups on one particular day, but implementing a rolling full backup strategy. Hopefully he ends up with a fairly steady daily workload without many peaks and valleys.

How this relates to the 3D4000 is that you really can't size these based upon the front end ingestion, but the back end; otherwise you end up filling up the front end faster than you can destage to the dedupe on the back end, and you run out of disk space.

I also think it's foolish to size based upon the ability to de-dupe on a 24hr window... at least I don't see how you can do that and maintain the same performance numbers... something has to give if you're hammering the disk from the front end while trying to dedupe it. It's a safe bet that they're going to step on each other's toes and will block each other to at least some extent.

Having said all of that, I do see value in having a non-dedupe front end and a dedupe back end. Most critical restores - the ones where you've suffered a major loss i.e. server loss, file system loss, etc.. will not be asking for a restore from 2 weeks ago.. They want the data as it looked 2 minutes ago.

On the other hand, the restores that ask for data from 2 weeks ago are typically the less critical in terms of SLA - typically single file stuff.. hey.. if it took you two weeks to figure out you were missing something, it's probably not critical.. Not everything falls in this bucket.. but that's been my typical experience.

Scott Waterhouse


Couldn't agree more with just about everything.

1) As far as filling up the front end and balancing that with back end deduplication capacity, see my comment above. You are right to suggest that other activities do impact the ability to ingest (either native or deduplicated data).

2) For sizing on a 24 hour window, I agree. The timing as I described it assumes you have exactly 8 hours to back up, and exactly 16 to dedup. But what about restores? What about a spike in data from a particular client? What about organic growth--how big will your backups be in 6 months or a year? So start with these numbers, and then factor in the impact of everything else, and that is your sizing. You are right to suggest that sizing against the ideal would not be such a good idea (near suicidal really, for both backup admin and vendor). All these factors have to be accounted for, and if the growth takes you beyond one unit, the scaling model and migration plan (if needed) need to be accounted for too.

How fast have you grown in the past? Do you think that will keep up? What amount of bandwidth do you want to reserve for restore? How about replication? Do you have a maintenance window? Lots of things to think about.

W. Curtis Preston

Responding to this is going to take me a while. I just want you to know that I reviewed it, and will probably post a really long response in my blog at http://www.backupcentral.com when I'm ready. (This little tiny window you give me here is not conducive to long posts.)

Scott Waterhouse

That's OK, it took me a while to write it!

And I know the little itty bitty pint sized window is awful. Cut and paste is your friend. :) Usually I have to compose my responses in Notepad or Word and then copy them. Sorry--but blame TypePad, not me!

W. Curtis Preston

I also agree mostly with what NK said, but would like to offer a different perspective on a few things.

"you really can't size these based upon the front end ingestion, but the back end"

I agree. That's why I harp on the performance of the back end dedupe engine more than the performance of the front end ingest. I'm not saying that initial backup speed isn't important. I don't care if you can do 50 TB/hr, if you can't dedupe it, I don't really see that as a "dedupe solution." I see it as a VTL that can dedupe a small amount of its data, and only deduping a small amount isn't worth it.

"I also think it's foolish to size based upon the ability to de-dupe on a 24hr window"

I agree, but not in the way that I think NK means. I think 24 hours is too long. I say that dedupe should finish shortly after the backup window, which actually I think is the opposite of what NK is saying.

"I don't see how you can do that and maintain the same performance numbers"

But if you could, you don't see a problem with getting it done quicker, right? There ARE products that are doing this. I just don't think that the 4000 is one of them.

"Most critical restores ... want the data as it looked 2 minutes ago. On the other hand, the restores that ask for data from 2 weeks ago are typically the less critical."

Agreed, which is why I don't harp TOO much on the performance of data from two weeks ago or six months ago, but a restore from yesterday better work. NK's comment and Scott's whole post seem to presuppose that this can only be done by restoring from undeduped data; I know for a fact that this is not the case. I know of several solutions that offer restore performance very close to (and sometimes greater than) write performance without keeping the data in its original format (either in a cache, the way the 1500/3000 do, or by having a separate box, the way the 4000 does). NK says he doesn't see how it can be done; Scott says it can't be done. I've seen it -- just not from EMC or Quantum yet. (I do have hope for the future, but I'm talking about today's products.)

Scott Waterhouse

"I think that dedupe should finish shortly after the backup window" Why? Honest question. There are still 16 hours left in the day. Having resources doing nothing for 16 hours is so... so... so much like tape!

"There are products doing this" Who? Even Sepaton, who likes to claim some sort of performance high ground, can only do 1 TB/hr per node. In 8 hours with 5 nodes, that is 40 TB. At 3 TB/hr in a DL4406 3D we can do the same out of the backup window in 13.3 hours. So again, the honest question is who can do this in the window?

As far as pre-supposing that it be done from undeduped data, I don't think that is the case. Again, look at it from an SLA stance: how fast do you need it to be? A DL4406 3D can offer up to 1,600 MB/s on restore. For as many days as you could reasonably want. That is significantly faster than any other alternative. And your ability to do that, and for how long, can be directly tied to your SLA objectives--again, something nobody else offers. (Sorry but I am going to be from Missouri on Sepaton's claims. JL has directly contradicted himself. So I am reserving judgement in the absence of real evidence. As for any other vendor, nobody even comes close to 1,600 MB/s.)

And finally: what "separate box"? You have one set of fibre connections. One management interface. One set of resources visible to the backup application. How does that equal two boxes? Do we get to call Sepaton 5 boxes? Or 20 (counting storage processors)? A DL4406 3D is one box, by any reasonable standard.
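The 40 TB comparison in this reply can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the figures stated above (5 nodes at 1 TB/hr each over an 8 hour window, versus 3 TB/hr of post-process deduplication); the helper names are made up for illustration, and constant rates are assumed.

```python
def tb_in_window(nodes, tb_per_hr_per_node, window_hr):
    """TB a cluster can process inside the backup window."""
    return nodes * tb_per_hr_per_node * window_hr


def hours_to_dedupe(data_tb, dedupe_tb_per_hr):
    """Hours needed to post-process a given amount of data."""
    return data_tb / dedupe_tb_per_hr


# Figures quoted in the comment above.
cluster_tb = tb_in_window(nodes=5, tb_per_hr_per_node=1.0, window_hr=8.0)
post_process_hr = hours_to_dedupe(cluster_tb, dedupe_tb_per_hr=3.0)

print(f"{cluster_tb} TB in the window; "
      f"{post_process_hr:.1f} hr to dedupe the same amount at 3 TB/hr")
```

With these inputs the result is 40 TB and roughly 13.3 hours, matching the numbers in the comment.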

W. Curtis Preston

Of course, true inline vendors do this today (finish dedupe right after the backup). I would also say that SEPATON and Falconstor can do it up to a point. For example, if each SEPATON node can dedupe 300 MB/s, but ingest 600 MB/s, what if you had one ingest node and two dedupe nodes? (But since they only support a four node cluster today, you can't go very far with that design.) With Falconstor, their ingest speed is 1500 MB/s per node and their dedupe speed is 400 MB/s per SIR node. Therefore, to do it with Falconstor, you'd need one ingest node and 3-4 SIR nodes. They support an eight node cluster, so you can't go very far with that design either. But these configs do work because they both support global dedupe, which you and Quantum do not yet support.
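The node ratios in the paragraph above follow from dividing ingest speed by dedupe speed. A minimal Python sketch, assuming the per-node figures quoted above and assuming throughput adds up linearly across nodes (names are illustrative):

```python
import math


def dedupe_nodes_per_ingest_node(ingest_mb_s, dedupe_mb_s):
    """Whole dedupe nodes needed to keep pace with one ingest node."""
    return math.ceil(ingest_mb_s / dedupe_mb_s)


# (ingest MB/s per node, dedupe MB/s per node), as quoted in the comment.
vendors = {
    "SEPATON": (600, 300),
    "FalconStor": (1500, 400),
}

for name, (ingest, dedupe) in vendors.items():
    needed = dedupe_nodes_per_ingest_node(ingest, dedupe)
    print(f"{name}: exact ratio {ingest / dedupe:.2f}, "
          f"so {needed} dedupe node(s) per ingest node")
```

The exact FalconStor ratio is 3.75, which is where the "3-4 SIR nodes" figure comes from; rounding up gives 4.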

As to the separate box, I am referring to having a completely separate VTL from a completely separate vendor with completely separate storage (the front end ingest box) sitting in front of another VTL with its own storage (the back end dedupe box). This is not analogous to a front end and back end SEPATON or Falconstor node, because they share storage and don't have to move the data the way the 4000 does.

As to my FULL response to this post about the 4000, I'm still working on it. But I have posted performance numbers for all major dedupe vendors in this blog post:


This is the world as I see it, and the way I see it, EMC's 4000 is in the back -- not in the front.

Scott Waterhouse

So basically if you use a Falconstor cluster, and apply this standard, you can't scale beyond 5 nodes (because adding another ingest node overwhelms the dedup nodes). And Sepaton doesn't really scale either, because they only support 4 nodes. So I repeat my question: why finish dedup in the backup window? All that this data shows is that if you want to dedup in the window, you have to accept a performance penalty.

As far as the separate box, my characterization is 100% accurate: one interface; one view to administrate; one set of connections to the SAN; one view of storage to the backup application. There are not two boxes. Data may move between different pools of storage, but that is transparent to the application and end user. It is not one VTL sitting in front of another, it is one VTL period. Show me the second VTL from a console. One system.

And honestly, I am not that impressed with the performance analysis. You cite numbers for DD that pertain in, what, 1% of the use cases? People with both OST and 10 GigE. Without that, maximum performance is not 25% less, it is 90% less: 400 MB/s per DD690. EMC's top end box, the DL4406 3D, does 2,200 MB/s ingest and 800 MB/s dedup. If you are willing to consider clustered speeds for others, why not EMC? I think you excuse this by saying it is not a global dedup pool?

Depending on your perspective, neither is Diligent nor Sepaton (I don't think). Correct me if I am wrong, but because these two dedup at an object and backup client level, if you have the same object, with a different name on a different backup client, that will not be recognized as common (and therefore deduplicated) by either system. To understand that the "issue" you identify is really not one, we just need to realize that more than 90% of deduplication comes from duplicate data from the same client over repeated full backups. The EMC system will always have the data from a given backup client going to the same dedup node--because a given client will always use the same pool of virtual tape resources. So which approach is "better"? One that can't recognize the same object from a different location--even if it shares the same pool of virtual tape for a target--or one that can? Seems to me the difference is likely to be both trivial and non-decisive, in the sense that this issue is so minor that there are a dozen other things that will contribute to a customer decision before this.

There are a lot of games being played with that performance table, unfortunately.

W. Curtis Preston

To the first part of your comment... First, let me get to the heart of the question, which is "why finish dedup in the backup window?" I'm not so much pushing finishing it in the backup window as much as I'm saying you HAVE to finish it within 24 hours, and you SHOULD finish it as fast as you can. Why would you want to finish it as close to the backup window as possible? Because the sooner data is deduped, the sooner it gets replicated. The sooner it gets replicated, the better RPO you have.

Let's forget dedupe and disk for the moment. The best people did with tape was to back up the night before, then come in the next morning, take the tapes out and hand them to "the guy with the truck." Your backups are now offsite by, say, 10 am.

Let's compare that to dedupe with a 24-hour window. If I allow my backups to take 24 hours to dedupe, they won't get replicated for 24 hours. That means that my backup that used to go offsite at 10 am on Tuesday isn't "going offsite" until 8 am on Wednesday. That's kind of a big deal.

Soooo.... If you CAN finish within the backup window, allowing you to replicate much quicker, wouldn't that be a good thing?

As to the numbers I gave about different vendors, this is really difficult to discuss without a whiteboard, but let me try. Falconstor actually supports 8 nodes and SEPATON supports 5 nodes. But to get the dedupe done within the backup window (if that's what you want), you need to dedicate some of them to dedupe only. I proposed a 1:2 ratio (ingest nodes to dedupe nodes) for SEPATON and a 1:4 ratio for Falconstor. When you do that ratio (which is only necessary if you want to do what I'm suggesting), you would need 6 nodes with SEPATON and 10 nodes with Falconstor to double your throughput, and neither of them supports that today. SO, to go back to my original statement, I said you COULD do it, but the performance would be limited today.

BTW, them supporting only 5 or 8 nodes today does not equate to "they can't support more than that." It simply means that's all they've qualified. Neither of them wants to say they can support more nodes unless they've qualified it in their QA department. So they qual'd 5 and 8 respectively, and are now qual'ing bigger configs. No code change -- just bigger config in the QA lab.
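To make the node counting above concrete, here is a small Python sketch. It takes the ratios and cluster limits stated in this comment as given (1:2 with a 5 node limit for SEPATON, 1:4 with an 8 node limit for Falconstor) and simply checks whether one or two ingest nodes, plus their dedicated dedupe nodes, fit within the limit. The dictionary layout and function name are illustrative assumptions.

```python
def total_nodes(ingest_nodes, dedupe_per_ingest):
    """Cluster size when each ingest node gets dedicated dedupe nodes."""
    return ingest_nodes * (1 + dedupe_per_ingest)


# Ratios and qualified cluster sizes as stated in the comment above.
clusters = {
    "SEPATON": {"dedupe_per_ingest": 2, "max_nodes": 5},
    "Falconstor": {"dedupe_per_ingest": 4, "max_nodes": 8},
}

for name, cfg in clusters.items():
    for ingest_nodes in (1, 2):
        needed = total_nodes(ingest_nodes, cfg["dedupe_per_ingest"])
        verdict = "fits" if needed <= cfg["max_nodes"] else "exceeds limit"
        print(f"{name}: {ingest_nodes} ingest node(s) -> {needed} nodes total, "
              f"{verdict} (limit {cfg['max_nodes']})")
```

One ingest node fits in both cases (3 and 5 nodes), while doubling to two ingest nodes needs 6 and 10 nodes, beyond the qualified limits.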

As to the "there are not two boxes," comment, are you saying that the Falconstor-based front end ingest node runs on the same physical CPU as the Quantum-based back end dedupe node? I don't think so; I believe (although I have to admit I haven't physically verified) that those two pieces of software run on two entirely different physical systems. The fact that I only have to interact with one does not make the other one disappear. Using your logic, I could say my Prius doesn't have a giant battery in the trunk that your car doesn't have because all I interact with is the steering wheel. And as I said, these two physical nodes are NOT analogous to an ingest node and dedupe node in SEPATON or FalconStor, as those nodes share storage; your front end and back end nodes do NOT even see each other's storage. The only way they have of passing data back and forth is to literally copy it from one node to another.

I already responded to the rest of your comment in my blog entry http://www.backupcentral.com/content/view/229/47/ , since it's the same comment you put there.

Scott Waterhouse

Well, your tape analogy is accurate. Except for the fact that a lot of folks finished their tape backup about 7 or 8 in the morning, and it was well past the time that the courier left before they finished copying that for off siting. So it was usually an n-1 copy that made it off site. The other prevalent scenario was they took the backup tapes and moved them off site and left nothing on site. Both were bad and had huge drawbacks, but both got done because the ideal way (which you describe) was expensive. And rarely achieved, in my experience.

So, to answer your question: if you could do it (dedup, replicate, and make the window) it would be a good thing. I think the implicit assumption in my line of reasoning is that there aren't a lot of (any) systems out there that have the speed to do this and stick to your backup window. With one system. Given that, you have two choices: buy a bunch of systems (as you could do with DL3000s) or buy one and at least get your backup done in the window. Again, trade-offs. But I think I have been pretty consistent in saying that you get to choose which your priority is: window and speed; or replication and multiple units.

And since Sepaton doesn't, in fact, replicate, that leaves us with FalconStor??? All due respect to them, I see their dedup solution (as distinct from their VTL solution) as more or less a non-starter in the marketplace. I won't speculate as to why, but in my geo they are basically non-existent.

So, back to my thesis: if you want to commit to your backup window, and you have a lot of data, then you have limited choices. If you think it is reasonable to do dedup after the fact (as I do, acknowledging that you may not want to make this choice in favour of replication instead) then I think the DL4000 is an even better choice in an even smaller field.

Finally, who cares about all the back end nonsense? If VMware has taught us anything, hasn't it taught us that? Along with SVC, virtual pools of storage, thin provisioning, Invista, Centera, Avamar grids, and every other technology that presents a single view of multiple resources that we therefore feel comfortable calling "one system". One view to everything means one system to me.

Brad Jensen

My comment is that ALL of your storage servers have IBM i (the most recent name for iSeries, System i, or AS/400) connectivity, if they support a Windows network file share. A lot of the former VTL-interface-only systems are adding this, such as Falconstor.

While most of our current customers and prospects in this area are Data Domain users, basically any storage system that supports NFS or CIFS can be connected as a backup destination to an iSeries/System i/IBM i/AS/400 using a Windows server. No VTL, no BRMS, no buggy whips.

