
April 02, 2009

Comments


NK

Scott,

I think you made some good points here, but here are my 2 cents.


First off, I don't know about Curtis, but while I believe you don't have to dedupe within the backup window, you do have to backup, dedupe, and offsite within 24 hours.

In a real tape shop, I would never, EVER, send my primary tapes offsite. I always want them in the library/onsite. The second copy is not only my DR copy; it also serves as media protection for my primary, so that if I have a problem reading from my primary, I can get the data from my secondary. This can happen whether your primary media is disk or tape; I've had bad tapes and I've had bad disks. Having said that, if the DL4000 can give me onsite and offsite copies for less cost, then I'm totally with you.

Are you absolutely sure about the EMC 2200 / 800 number? I hear conflicting stories in the field about that, mostly that this is a future number, i.e. that today, whether you have a 4406 or 4106 as the front end (1 or 2 nodes / 1100 or 2200 front-end MB/s), EMC only supports 1 node on the back end for dedupe (400 MB/s, not 800 MB/s). What I hear is that a 4406 with two dedupe nodes on the back end is planned but not yet available. But hey, I don't work for EMC, so I'll leave that up to you to clarify.

Regarding the ability to replicate and dedupe concurrently: interesting. Can you explain in more detail how this works? I always assumed that DL replication occurred on a volume-by-volume basis in a sequential fashion, but when you're performing deduplication I would imagine that the source volumes would be constantly changing. If the replication occurs serially (volume by volume), won't the processes stamp on each other (a volume deduping, and therefore changing, while it is replicating)? Or will data be replicated over and over again (because it keeps changing) until the dedupe is completely finished? Perhaps I have misunderstood something.

I think replication is a big one, but I haven't really seen any good performance numbers from anyone (not just EMC), especially in terms of their ability to "fill the pipe" and deal with latency, distance, and packet loss: things that happen in the real world but are often overlooked in labs and testing. Any comment on that? TIA, and thanks for your other informative blog posts.

Scott Waterhouse

NK:

Absolutely you have to backup, dedupe, and offsite within 24 hours. However, my thesis here is that for some (most?) folks, it is acceptable to back up within a 6-8 hour window, and then take some additional amount of time to deduplicate and replicate. As long as that time doesn't spill into your next backup window, that is OK. Actually, I would prefer that all 3 activities not consume more than 20 hours or so, to leave time for restore, maintenance, or whatever else. This is the strength of the DL4000: fast backup when you are OK with letting deduplication and replication spill over the backup window.
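
To put rough numbers on that thesis, here is a quick back-of-envelope sketch. The 20 TB nightly backup is a hypothetical figure, the 2200 MB/s and 400 MB/s rates are the front-end and back-end numbers that come up later in this thread, and replication is assumed to overlap the dedupe pass as described below:

```python
# Back-of-envelope check of the timing argument above (hypothetical numbers).

TB = 1024 * 1024                      # MB per TB
nightly_backup_mb = 20 * TB           # assume a 20 TB nightly backup

backup_hours = nightly_backup_mb / 2200 / 3600   # ingest at front-end speed
dedupe_hours = nightly_backup_mb / 400 / 3600    # post-process at back-end speed

total_hours = backup_hours + dedupe_hours         # replication overlaps the dedupe pass
print(f"backup: {backup_hours:4.1f} h, dedupe: {dedupe_hours:4.1f} h, "
      f"total: {total_hours:4.1f} h (budget: ~20 h)")
```

With those assumptions the backup itself finishes in under 3 hours and the whole cycle lands around 17 hours, inside the 20-hour budget with room to spare.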

As for sending primary tapes offsite? I agree, 100%! Unfortunately, it is my observation that while it may be a big breach of best practices, it happens quite a bit. I see a lot of organizations doing just this.

Yes, we do support two nodes on the back end.

You are correct that deduplication processing happens on a volume-by-volume basis, sequentially (although more than one volume can be "processed" simultaneously). Basically, as this happens, net new chunks of data (the stuff that remains after deduplication) are transmitted to the remote box. That is a bit of an over-simplification, but it is close enough. Let me mark "replication explained" as a topic for a future post.
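
For illustration only, here is a toy sketch of that idea. This is my simplification, not the DL's actual implementation; the chunking, fingerprint algorithm, and queueing are all invented for the example:

```python
# Toy sketch of "replicate while you dedupe": as each volume is post-processed,
# only chunks whose fingerprints have not been seen before are queued for the
# remote box.
import hashlib

seen = set()        # fingerprints already stored (and therefore already replicated)
wan_queue = []      # net-new chunks waiting to be sent to the remote box

def dedupe_volume(chunks):
    for chunk in chunks:
        fp = hashlib.sha1(chunk).hexdigest()
        if fp not in seen:
            seen.add(fp)
            wan_queue.append(chunk)   # only unique data crosses the WAN

dedupe_volume([b"block-A", b"block-B", b"block-A"])
print(len(wan_queue))                 # -> 2: the duplicate block is never sent
```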


Scott Waterhouse

And the rest of the comment:

Finally, with respect to replication and pipe size: we do have tools to assess the size and latency of a link, and determine if what you have is sufficient to replicate a deduplicated data set. I think we don't publish numbers because they are actually not terribly meaningful outside the context of our tool. Here is what I mean by that:

- Say you ingest 1 TB. There will be a certain amount of metadata associated with that which needs to be replicated.

- Say your change rate is 0.005%. There will be very little "new" data which needs to be replicated.

- On the other hand, say your change rate is 95%. There will be a great deal of "new" data to replicate.

Our sizing tool will mash those two factors up, factor in bandwidth and latency, and give you an answer. But unless you know how much metadata is going to be transmitted (and honestly, I have no idea what fraction of the original source data that would be) and the change rate, simply saying we replicate at x MB/s is pretty meaningless.
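
To make that concrete, here is a rough illustration of the same arithmetic. The metadata factor, link speed, and efficiency derate are invented for illustration only; the real values are what the sizing tool works out:

```python
# Why a raw "we replicate at X MB/s" figure means little on its own.

def tb_to_replicate(ingest_tb, change_rate, metadata_factor=0.01):
    """New (unique) data plus metadata that must cross the WAN, in TB."""
    return ingest_tb * change_rate + ingest_tb * metadata_factor

def hours_to_replicate(tb, link_mbps, efficiency=0.7):
    """Time to push `tb` over a link, derated for latency and packet loss."""
    effective_mb_per_s = (link_mbps / 8) * efficiency   # megabits/s -> MB/s
    return (tb * 1024 * 1024) / effective_mb_per_s / 3600

for change_rate in (0.005 / 100, 0.95):                 # 0.005% vs. 95%
    tb = tb_to_replicate(1.0, change_rate)              # 1 TB ingested
    hrs = hours_to_replicate(tb, link_mbps=100)
    print(f"change rate {change_rate:.5%}: {tb:.3f} TB to move, "
          f"~{hrs:.1f} h on a 100 Mb/s link")
```

The same 1 TB of ingest produces anywhere from a fraction of an hour to more than a day of replication traffic, depending entirely on the change rate and the metadata overhead.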

W. Curtis Preston

Scott,

I'll try these one at a time. They may or may not be in the same order as in your post.

“Curtis keeps insisting that you deduplicate within the backup window. Because he also insists that you have to finish your off-site process within that window. Which, if I can be blunt, is nonsense. Very few (no?) organizations both make a primary backup copy, and an off-site copy, and get it to an off-site storage facility within a backup window.”

Organizations that are properly designed do it all the time. Do I find such organizations often? No. Have I configured many such organizations? Absolutely. It starts with having enough throughput to get the job done quickly (as in something far less than 12 hours). It also often includes the concept of Inline Tape Copy (a NetBackup feature that creates an original and a copy at the same time). While ITC does make the backup take longer (10-40% longer, depending on a number of factors), both copies are done at the end, and it's still way faster than the traditional way (backup to tape, then copy to another tape).

Suffice it to say that it is totally possible. I will concede that it is very uncommon. But it’s also very uncommon for customers to stream their tape drives. (I’d say less than 10% of customers I’ve visited were doing it before I showed them how.) Does that mean they shouldn’t stream their tape drives?

“Far more prevalent is one of two other scenarios: you make a backup within the window, give it to a courier and it goes off-site, and it is at the storage facility by sometime that afternoon.”

I completely agree that this scenario is very common, and it is as wrong as wrong can be, for the reasons you stated and many more.

“The other scenario is that you make a copy of your on-site data to tape, and ship that off-site. But because all the tape drives necessary to finish that process by early morning are really expensive, people are more relaxed, and are willing to take business hours to finish. Meaning the off-site copy only goes off the next business day. “

I've also run into this, and it's even MORE wrong than the previous scenario. Some data changes at 8 AM on Monday. It's backed up to tape Monday night (18 hours after the change), the copy process starts Tuesday morning (24 hours), and the copies don't leave the site until another 24 hours after that (48 hours). So unless the BUSINESS REQUIREMENTS say that a 48-hour RPO is acceptable (and I've rarely met a company where they would), this scenario is not meeting the business requirements and should be discouraged and done away with, no matter how common it is.

I also say baloney to the "tape drives are really expensive" bit. I am currently in the middle of a large pricing exercise for some upcoming blog posts. You know that I'm a huge fan of VTLs/IDTs/dedupe, etc., but tape still wins in the MB/s-per-dollar race. I've given several vendors (including yours) a hypothetical company with 150 TB of data to back up and replicate. The disk-based configs are all above a million dollars per location. A tape library big enough to hold the same data and fast enough to back it up costs about $500K to buy and another $400K to fill up with tape. OK, so that's $1M+ for disk and $900K or so for tape. Not bad. But what happens if I ask each of them to double their throughput? Most of the disk systems double in price, because they need both more disk and more system heads to increase their throughput. The tape system, though, only needs 5 more LTO-4 tape drives, at a cost of $20-$40K. People don't avoid doing this because it costs too much; they avoid it because they don't know how, and too many people think it's not possible.
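
The doubling-the-throughput math, using the round numbers from the paragraph above (a hypothetical 150 TB site; all prices are ballpark figures from my exercise, not quotes from any vendor):

```python
# Rough cost-to-double-throughput comparison (hypothetical figures).

disk_initial = 1_000_000      # USD: disk-based config per location
tape_initial = 900_000        # USD: $500K library + $400K of tape

# To double throughput, most disk systems need more heads AND more disk,
# so assume the price roughly doubles; tape just needs ~5 more LTO-4 drives.
disk_doubled = disk_initial * 2
tape_doubled = tape_initial + 5 * 6_000   # ~$30K, in the $20-$40K range

print(f"disk: ${disk_initial:,} -> ${disk_doubled:,}")
print(f"tape: ${tape_initial:,} -> ${tape_doubled:,}")
```

Same starting point, wildly different cost to scale: the disk config roughly doubles, while the tape config moves by only a few percent.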

I have run into very few customers where I was not able to meet the 12-hour requirement by doing one or both of two things: helping them stream their tape drives better (giving them much more throughput than they thought they had) or spending just a little bit more on a few tape drives (once we'd already streamed the ones they had). It really doesn't cost that much.
Is it rare that customers do this? Absolutely. Does that support your argument that it's not important for them to do it? Definitely not.

On to the cost element: “As far as the cost component: again, lets be blunt: Curtis is just speculating. He is saying, basically, because it has parts x, y, and z, and the other vendors only have parts x and y, it must be more expensive. Well, based on my observation, the platform is price competitive with other large scale VTL platforms. Never mind what Curtis reckons it should cost, what does it cost?”

I am not speculating. I am reading LIST PRICES THAT EMC PUBLISHES ON THEIR WEBSITE. I know what an EDL 4000 costs and I know what a 3D 3000 costs (list price, of course). So when you put them together and call them a 3D 4000, I know what that “costs,” too. Now, do I _also_ know that you are currently selling this and your other target dedupe systems for substantially lower than those list prices? Absolutely. Now that EMC has something to sell in the target dedupe space, they have adopted a “refuse to lose” mentality and are selling their systems at next to nothing to make the deal at any cost.

But even though EMC will sell it at next to nothing, the annual maintenance contract will be based on EMC’s real cost of maintaining that equipment – and maintaining more equipment costs more than maintaining less equipment. Therefore the maintenance contract to the customer will inevitably cost more than a competing solution if you do indeed have more hardware to maintain.

In addition, I have seen what EMC does with a customer when they severely discount something like this. They give you a 90% discount on product A, but your overall discount for the rest of the year drops. EMC has a margin level they want to maintain across the board and doesn't give anything away. If they don't make their margin on the 3D 4000 deal (and there's no way they are if it isn't priced above competing solutions), then they're making it up somewhere else.
And if EMC needs two heads instead of one, and two separate pools of disk instead of one, then the 3D 4000 will also consume more power and cooling than competing solutions. And many companies I've worked with say that power and cooling is KING. (They can't get any more.)

Finally, I believe that the underlying difference in complexity between the 3D 4000 and the 3D 3000 will ultimately result in a different level of manageability on the part of the customer – no matter how hard you try to hide that complexity from the customer. (And the interactions that I’ve had with your customer base are bearing this out.)
In summary, it costs more to build (even if you don’t pass on that cost directly to the customer), it costs more to power and cool, and it will cost more to maintain.

“So when Curtis asks why someone would buy it over a competing solution, I can answer: because it is faster than competitive solutions, offers deduplication and replication, is supported by a single vendor, and does so at a competitive price point.”
Except it’s not faster (more on that later) and is “supported” by two vendors, one of which needed a $100M loan and a $40M advance payment on royalties to keep operating.

As to points 2, 3, and 4: the only reason I wrote them was that they were counterpoints to Mark Twomey's (Storagezilla) comments that there wasn't a front end/back end, that there was no cache, and that data never existed in two places. I believe all of them matter for the reasons I just stated, and they also matter because they show that my initial description of the 3D 4000 was correct and Mark Twomey's comments were not.

“Curtis writes: "3D 4000 appliance does not support Path to Tape feature” and that the "3D 4000 does not support either support [sic] NAS backup or NAS sharing." ... We don't support path to tape or OST on the DL4000 because we have a superior methodology: we support Media Servers and Storage Nodes directly on the DL4000 DL Engine. Both offer superior functionality to path to tape.”

The embedded storage node/media server feature is SO not superior to the path-to-tape feature, for two reasons. The first reason is that direct copy of tapes by VTLs is faster than cloning/duplicating them. The performance of cloning/duplicating (which is what you would do with the embedded storage node feature) is highly susceptible to the structure of the data: big databases clone/duplicate quickly, filesystems less quickly, and dense filesystems really slowly. This has to do with how much back-and-forth talking has to happen when copying the data. VTL/IDT tape copying is always the same speed regardless of the data, because it's just a block copy. And if you support OST or the NDMP-direct-to-tape feature, your tape copy is also understood by the backup software.

And if we're talking NetWorker, the embedded storage node feature is hampered by the fact that NetWorker still doesn't have an automated way of cloning backups for large organizations. Group-level cloning only works for the smallest of organizations, and anything else means shell scripting. I'm not saying it doesn't work; I am saying it is significantly more complicated than (and therefore not superior to) VTL-level tape copying.

“Curtis writes: "But other post-process dedupe systems (Exagrid, Quantum, SEPATON) are different. You can change your mind any time you want to as to how much data is stored in deduped format and how much is stored in native format."

Quantum, Exagrid, and SEPATON have the ability to dedupe some data and not dedupe other data. In Quantum you do it by tape pool; with Exagrid and SEPATON you do it by policy (e.g., dedupe Exchange; don't dedupe redo logs). You can therefore change your mind later and increase or decrease the amount of data that you're deduping, without moving any disks around.

“Curtis writes of the performance of a DL4000: "You can have two nodes on the front, but they must share a single node in the back if you're to get global dedupe. You can also put two dedupe nodes in the back, but they don't have global dedupe. So you get either 1100 or 2200 MB/s, but always 400 MB/s in the back. "

This is just not correct. You can have a system that does 1100 MB/s and 400 MB/s. Or you can have a system that does 2200 MB/s and 400 MB/s. Or you can have a system that does 2200 MB/s and 800 MB/s. Your choice. “

I'm going to say this again. Those two dedupe nodes in the back of the 3D 4000 are just that: two dedupe nodes. You cannot add their numbers together and call them 800 MB/s, any more than Data Domain can add together the 16 nodes in their DDX "array" and call it a system that does 6400 MB/s. They don't share storage. They don't share backups. They are two nodes that happen to be in the same rack.

If you continue to call these two 400 MB/s systems one 800 MB/s system, then you have to say that DD has a 6400 MB/s "system." Take your pick.

“Curtis writes of the DD690 that it offers performance of 750 MB/s on ingest and deduplication, and notes this is "approximately" 25% less if you don't have OST.

Actually, it is 400 MB/s if you don't have OST AND 10 Gigabit Ethernet. “

You’re right. I had been given bad info by my Data Domain contact. It is only 400 MB/s today.

Now let's talk about points 9 and 10. You dismiss SEPATON because they don't have replication. I'll give you that for now (it's coming really soon). You also dismiss FalconStor because you haven't seen it. You haven't seen any of these systems other than your own, have you? Well, I have talked to at least one FalconStor customer that has a 4-node system deployed, and their numbers are around what FalconStor is advertising. So they're hardly completely made up. You use Data Domain's lesser number of 400, since that represents the bulk of their customer base (although 10% of their customers licensed OST last quarter).

Then you keep adding your two 400 MB/s systems together and calling them one 800 MB/s system just because they’re in the same rack. (They don’t share backups; they don’t share storage; they only share a rack.)

So if we take out SEPATON, use the best FalconStor numbers that I've been able to verify, use DD's non-OST number, and use the 400 MB/s that we really should use for your system, we still have this (front-end ingest / back-end dedupe, in MB/s):

• EMC: 2200 / 400
• Data Domain: 400 / 400
• FalconStor/Sun: 5500 / 1600
• IBM/Diligent: 900 / 900
• Quantum/Dell: 880 / 500

You're still not winning the competition, regardless of which column you look at.

“As far as it requiring twice as much storage? Well, any post process system is going to require native capacity and deduplication capacity. So EMC is no different than any other vendor in this respect. And therefore, Curtis is simply incorrect to say that we require "twice as much storage" as our competitors. “

The 3D 3000 requires the block pool and the native copy, right? That's the difference between you (and other post-process vendors) and the inline vendors, right? And then you plug that into the back of an EDL that has all its own storage, which it is NOT sharing with the 3D 3000. That's a whole bunch more storage that NO ONE ELSE NEEDS. It might not be twice as much, but it's a whole bunch of disk that no other vendor needs, and that disk isn't free to buy and isn't free to maintain.

“We may require some additional storage as compared to in-line only deduplication vendors (where "some" is equal to your largest daily backup), but there are distinct benefits to this too--primarily backup and restore performance.”

As to point 10, I am not assuming that replication doesn't begin until dedupe ends. I am saying that until the last bit is deduped, it won't be replicated. That means that if a backup finished at 8 AM and you take 20 hours to deduplicate it, some of that backup won't be deduplicated until 4 AM, which means it won't get replicated until 4 AM. And 4 AM tomorrow is a lot later than 9 AM this morning. (As to whether or not I can do 9 AM this morning with tape, I point you back to the beginning of my comment.)
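
To put concrete times on that worst case (the date and the 20-hour dedupe pass are hypothetical, matching the example above):

```python
# Replication lags dedupe: the last piece of a backup cannot be replicated
# until it has been deduplicated.
from datetime import datetime, timedelta

backup_done = datetime(2009, 4, 2, 8, 0)   # backup finishes at 8 AM Thursday
dedupe_hours = 20                          # time to post-process the whole set

last_replicated = backup_done + timedelta(hours=dedupe_hours)
print(last_replicated.strftime("%A %I:%M %p"))   # -> Friday 04:00 AM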

Finally, you said:

“If that just isn't good enough: if you need to have your replication done by 9 AM, then you need to deduplicate in-line. Not only will a DL4000 not meet this requirement, neither will any other system from any other vendor that I am aware of.”

FalconStor can do it today. And when the currently-in-beta SEPATON replication software goes GA, it will do it too.

W. Curtis Preston

BTW, SEPATON announced last week that their replication finally goes GA May 30.

Scott Waterhouse

Finally.

We will see how many caveats on size and scalability accompany it (just like there are caveats on the dedupe function).

Jack

Those who can...do

Those who can't ....teach

Those who can do neither....blog

Those who have no backup issues...use SAM-QFS

Scott Waterhouse

Really Jack?

Would that be the same filesystem that caused days of outage and ultimately data loss for a service provider in SF? The same one that is littered with months-old bugs on its support list? The same one with an uncertain future now that Oracle owns it?

Yes it would.

Seriously. Not only was your post insulting, it was spam. Which is actually something: I have been insulted, and spammed, but never both at once in a single comment, until now.

So I responded in kind to your comment, I guess. Lucky you.

