
April 17, 2009

Comments


Joe Matuscak

While I buy the numbers, it's not exactly an apples-to-apples comparison. TSM isn't doing client-side de-dupe, so really, most of this is a "traditional" vs. de-dupe difference.

What would be really interesting would be an Avamar to Commvault Simpana 8 comparison. That would pit Avamar against another client-side de-dupe implementation.

Roger

I'm not sure I like this table; it does not make sense to me. Might it be a typo? Running pretty much the same backup three times in a row and not getting consistent times seems very weird. Why would one TSM backup run in 17 minutes and the rest of them take more than twice as long?

If all the TSM backups took 40 minutes I wouldn't be as surprised (no fan of TSM, never been, never will be).

But if the time of an incremental backup is filesystem traversal plus data transfer (as most generic systems work anyway), then I suspect a flaw in your tests.

As an aside: how large is the filesystem? How many files are we talking about?

And - how long did it take with NetWorker? (Known to be quite fast)
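The cost model Roger describes (incremental backup time as filesystem traversal plus data transfer) can be sketched as below. This is only an illustration: the function name, the scan rate, the bandwidth and the file and byte counts are all hypothetical, not figures from the original post. It shows that with a fixed file count the traversal term stays constant, so run-to-run differences would have to come from the amount of changed data.

```python
def incremental_backup_seconds(num_files, files_scanned_per_sec,
                               changed_bytes, bytes_per_sec):
    """Simple model: total time = traversal time + transfer time."""
    traversal = num_files / files_scanned_per_sec   # walk every file's metadata
    transfer = changed_bytes / bytes_per_sec        # send only the changed data
    return traversal + transfer


# Hypothetical values, chosen only to illustrate the model.
NUM_FILES = 1_000_000
SCAN_RATE = 10_000      # files examined per second (assumed)
BANDWIDTH = 50e6        # bytes per second to the backup server (assumed)

quiet_day = incremental_backup_seconds(NUM_FILES, SCAN_RATE, 5e9, BANDWIDTH)
busy_day = incremental_backup_seconds(NUM_FILES, SCAN_RATE, 70e9, BANDWIDTH)

print(f"quiet day: {quiet_day / 60:.1f} min, busy day: {busy_day / 60:.1f} min")
```

Under this model, a large spread between runs on the same filesystem points at a different daily change rate rather than at the traversal.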

Preston de Guise

I think you may have missed the point in the original post – it looked like the intent was to compare dedupe to non-dedupe in order to demonstrate the time and network bandwidth savings that can be made by using dedupe backup software.

Jim H

Shhhh. Don't talk about restores. Those aren't important.

Scott Waterhouse

Roger;

3 different backups on 3 different days on a production filesystem. Sorry, don't know # of files or size, but I suspect it is relatively small. If it was in the millions I would expect TSM to take much longer, and Avamar's advantage to be more pronounced.

Jim: on restores, it is ironic that you raise the issue when I am comparing Avamar to TSM. TSM has some of the worst restore times in the industry (unless you are asking for one file from last night's backup which is still in the disk pool). Backups are spread out over multiple tapes, require lots of loads and unloads, and the only way to combat it is with co-location and aggressive reclamation (which end up requiring lots of tape drives). All in all TSM probably has the worst restore times of any major backup product.

And still no file level restore from a VCB? And still no coherent 3 tier architecture? ;)

I have seen no evidence (and I have looked) that Avamar has an issue with restore relative to other backup products.

NK

Wow.. Scott on that last post you are very wrong.

TSM does not have the worst restore time in the industry.. although it can when it's poorly configured. TSM's biggest problem is its complexity.

Actually, having files spread across lots of tapes can be a very good thing if you have a Disk/VTL as the target pool: the loads and unloads are negligible and reclamation runs very quickly. On restores you get lots of recovery threads that all work in parallel, which actually makes it very, very fast.

File level recovery from VCB? Yes, it's had that for a while too. Does Avamar have a 3 tier architecture? ;-)

Scott Waterhouse

NK;

Good points on VTL vs. tape. I maintain that if you are using tape, then TSM has some of the worst restore times of any backup product. TSM holds the record for the longest restore time I have ever heard of: 5 weeks.

If you are using VTL or a lot of disk, then TSM tends to work better. But despite the passion and fervour I have for VTL, which any TSM admin who has actually used it shares, I think about 50% or more of TSM users are still primarily using tape.

As for Avamar and a 3 tier architecture, it doesn't need it. The primary reason NetWorker and NetBackup use it is to distribute I/O load. Well, Avamar reduces I/O on the network by 90% or more. So no need for a storage node or media server equivalent. TSM on the other hand should use them (and does, technically, have a capability to do so) but in practice almost never does. TSM environments tend to have more master servers per TB than any other backup product, by far. (I would guess it would be a factor of 10x or more...)

NK

Scott,

I don't think that the sole reason for a 3 tier architecture is to distribute I/O. In most cases, you add additional TSM Servers not because the server hasn't kept up with the workload, but because the TSM database has grown too large.. it's pretty rare to see people spin off additional TSM servers simply because the I/O workload is too great.

While I feel that Avamar does scale pretty well, it isn't limitless either.

Don't get me wrong, I think Avamar is a good product, but not good at everything.

The thing that I'm simply not convinced about with any of the de-dupe technologies (not just EMC) is how they really benefit restore performance. If you're protecting say a 5TB database and the thing goes tits up then you still have to recover 5TB. You may have de-duped all your backups of that DB down to 5 or 6TB, and your smart incremental backup technology results in only a few MBs being transferred daily.. but when it goes down.. guess what.. you still have a 5TB database to recover. Will that recover more quickly off a de-duped back end, performing what will ultimately be some sequential mixed with some random I/O against a bunch of disks, or from a streaming tape device... and what happens if you have to recover 3 or 4 of these at the same time in a larger scale problem.. I can't see these dedupe systems keeping up.

That's not to say that it doesn't have its place. There are some things Avamar can do that other products simply can't.. or at least not as efficiently.

In summary, your post on backup performance is just a little one-dimensional.. it just doesn't cover all the bases. It's not a bad post. It highlights how Avamar sub-file level backup can help you get more out of a backup window when backing up files with fairly static content.. that is a big deal, but it's not the whole deal.

Scott Waterhouse

NK;

Again, lots of solid points here.

The sole reason for 3 tier? No. In TSM's case you are right, but really you are highlighting two fundamental problems: scalability of the server in terms of I/O and scalability of the db. The db scalability was terrible (100 GB?!), and now that IBM have changed to a DB2 backend we will have to wait for some real-world experience to see how that impacts it. But even if that doubles (or more) they still have a fundamental issue of I/O constraints and the need for another tier to distribute it.

And agreed totally about the performance dynamic of restore and Avamar. Yes the post only dealt with backup and not restore, and I would never urge people to overlook the importance of restore. Further--if somebody is trying to sell you Avamar for a 5 TB database backup then somebody probably has some explaining to do. Because this would not normally be anywhere near the standard acceptable use case for Avamar!

But this is exactly why I rant about the DL4000 and dedup--you need to consider not just a backup SLA/RTO, but a recovery SLA/RTO. Sometimes dedup is not sufficient to meet those (not from us, not from anybody); sometimes it is; and sometimes it is close but requires specific architectural steps to ensure you are OK.

PC

Have you guys actually used Avamar, or are you so focused on the poor run-of-the-mill backup software like TSM or NetWorker that you don't give the backup of the future a chance? With Avamar restores will be amazingly fast, and yes, even in a disaster situation. The grid will take in only what is unique; do we all agree on that? So basically when a restore runs it can restore any successful backup, even if it only moved 10% of the actual data during the backup being restored. So in a disaster situation the restore will fly. Compare that to, let's say, NetWorker or TSM with a tape library as the target and you're looking at a complete nightmare trying to recover even a file, never mind a whole server, and forget about NAS restores. You guys need to open your eyes to the new technology. The main reason, outside of money, why companies are still using the most outdated of technologies and wondering why their restores are failing or taking forever is having techs that simply won't embrace what new technology can do for them and the company. Open some eyes and ears, gents, and start doing your homework.

Scott Waterhouse

PC... largely agree with all that. Never underestimate the power of inertia. :)

Mike

Looking at the dates, I'm coming into this WAY late, but had to put in my two cents. While I agree about being open to new technology, you still have to look at it realistically.

I have TSM and Avamar both in place and I have a database of 4TB (admittedly, not Avamar's sweet spot) that I have to back up as flat file data. TSM requires me to back up about 2.5TB of that data each night. Avamar on the other hand is at about 300GB. Those numbers you cannot discount.

However, you also cannot discount the restore times. I can restore that entire 4TB database with a properly configured TSM instance in about 8 hours. With Avamar....21 hours.

Amazingly fast restore? Not so much.

Whether it ingested 10% of the data or not, it still needs to send 100% of it back to the client (although it does use some compression on the restore). I agree with the comment about TSM's complexity, and this is a leading cause of many issues. However, take care not to wade too blindly into new technology... no single solution fits every situation, and you may find that it doesn't perform quite as well as the white paper says.
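For readers who want to put Mike's figures side by side, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted in his comment (4TB restored in 8 hours vs. 21 hours, 2.5TB vs. 300GB backed up nightly) and assumes decimal units (1 TB = 10^12 bytes); the helper names are made up for illustration.

```python
# Decimal units are an assumption; the comment does not say which it means.
TB = 10**12
GB = 10**9
MB = 10**6


def throughput_mb_per_s(size_bytes, hours):
    """Average restore rate in MB/s for a given size and elapsed time."""
    return size_bytes / (hours * 3600) / MB


def reduction_pct(baseline_bytes, reduced_bytes):
    """Percentage by which reduced_bytes is smaller than baseline_bytes."""
    return 100 * (1 - reduced_bytes / baseline_bytes)


tsm_restore = throughput_mb_per_s(4 * TB, 8)        # 4TB in about 8 hours
avamar_restore = throughput_mb_per_s(4 * TB, 21)    # 4TB in about 21 hours
nightly_saving = reduction_pct(2.5 * TB, 300 * GB)  # 2.5TB vs. 300GB per night

print(f"TSM restore:    {tsm_restore:.0f} MB/s")
print(f"Avamar restore: {avamar_restore:.0f} MB/s")
print(f"Nightly backup data reduced by {nightly_saving:.0f}%")
```

On those assumptions the restore rates come out at roughly 139 MB/s and 53 MB/s, while the nightly backup volume drops by about 88%, which is the trade-off Mike is describing.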

The comments to this entry are closed.
