I am thinking about changing the name of my blog from "The Backup Blog" to "The Avamar Blog".
OK, not really, but given the enormous interest I have had in Avamar (based on email responses and questions) I am going to keep posting on it for the next short while. I may mix it up and include some other stuff, but based on feedback there are at least 4 things I want to get to in the area of Avamar over the next couple weeks:
- Miscellaneous Avamar stuff not covered in previous posts
- Instantaneous VMware backup with Avamar
- Deduplication calculator that is Avamar aware
- Key differences between Avamar and PureDisk
First off, Avamar miscellany, bits, and trivia.
Really, just a couple of points here. (The only trivia I have is: what professional motorcycle team did a senior Avamar technical marketing resource race for in Europe? If somebody can answer that, we will have to find a prize for you, because the answer is pretty obscure!)
So, on a more serious note: scalability is a criticism that is often raised regarding Avamar. Frankly, since nobody seems willing to say anything more specific than "scalability is an issue" I have trouble figuring out what the critics are talking about. But, to be fair, I can think of one boundary on Avamar performance: everything is over IP. So when you have a backup client that needs to transmit data to the server, that is always over IP. There is no such thing as a FC connection to Avamar, and no such thing as a media server, storage node, dedicated storage node, SAN media server, or any other such contrivance found in other backup applications. So if you need more throughput than a single 1 Gb Ethernet connection can provide then you need to either add additional Ethernet adapters, or employ proxy backup servers.
Which leads to the second and last point: performance. Performance on backup will be limited by the available IP bandwidth out of the client (the server will likely have many Gb ports). Performance on restore will likewise be limited by the available bandwidth coming into the client being restored. On the server side, many nodes can and will contribute to a restore request, so it is almost inevitably the client's available bandwidth that will bottleneck a restore.
So, on that basis, we can conclude that the bottlenecks and scalability issues faced by Avamar are really very similar to those that other backup applications experience. The client is usually the first or final bottleneck on backup and restore respectively. Getting data to the server is usually the biggest challenge!
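To make the IP ceiling concrete, here is a back-of-the-envelope sketch in Python. It only does the arithmetic for how long a given amount of data takes to cross one or more Ethernet links. The function name, its parameters, and the `efficiency` factor are all hypothetical and chosen for illustration; nothing here is an Avamar tool or setting, and real throughput will depend on protocol overhead and on how much data actually has to cross the wire.

```python
def transfer_hours(data_gb: float, links: int = 1,
                   link_gbps: float = 1.0, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move data_gb (decimal gigabytes) over
    `links` Ethernet links of `link_gbps` gigabits per second each.

    `efficiency` is an assumed fraction (0 < efficiency <= 1) of the
    nominal line rate that is actually usable; it is a made-up knob for
    this sketch, not a measured value.
    """
    if data_gb < 0:
        raise ValueError("data_gb must not be negative")
    if links < 1:
        raise ValueError("links must be at least 1")
    if link_gbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("link_gbps and efficiency must be positive")

    gigabits = data_gb * 8                      # 1 byte = 8 bits
    usable_gbps = links * link_gbps * efficiency
    seconds = gigabits / usable_gbps
    return seconds / 3600


if __name__ == "__main__":
    # 450 GB over a single 1 Gb link at full line rate, then over two links.
    print(transfer_hours(450))           # one adapter
    print(transfer_hours(450, links=2))  # two adapters
```

At the nominal line rate, 450 GB is 3,600 gigabits, so a single 1 Gb link needs about an hour and a second adapter halves that. The same arithmetic applies in the other direction for a restore into the client.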
I assume you can trunk multiple GigE interfaces together on the underlying Avamar OS. Since it has a Linux kernel core and drivers, it should be possible. This is how we operate today with NetWorker servers. It keeps the client configuration simple, in that you don't have to decide whether to send data to storage-node-eth1 or storage-node-eth2 to distribute load manually. You just send it all to the same trunked interface.
And I agree, these "problems" are germane to growing any backup solution.
Some of the buzzing I have heard about Avamar scalability relates to the size of the backend storage for a single Avamar instance, as well as to the object index. The index issue (a database that grows with the number of objects stored) is yet another problem that is typical of scaling any individual backup server instance. It may also be a bit of a red herring. At a certain point, managing a single gigantic backup server is unwieldy from a human perspective, even if the software can technically "scale" to that level.
Posted by: Scott Harney | June 03, 2008 at 02:33 PM
Well, it is more a matter of automatically load balancing across multiple GigE adapters. (It is a nifty bit of software engineering that relies on the hash index of a segment to determine which node it gets transmitted to.) Which means that a 10-node Avamar cluster has an effective bandwidth of 10 Gb. The good news here is that this is all "automagic": no special client or server configuration is required, so the net effect is the same as you describe.
As far as the scaling of the index, we can store somewhere between 1 and 10 PB of data per Avamar cluster (as with a lot of things in deduplication land, it depends on data and retention policy). But I would point out that is scalability at least as good as anybody else is claiming for their deduplication appliances. It is fair to say that scaling is no more an issue for Avamar than it is for any other deduplication solution. Or backup application.
Posted by: Scott | June 03, 2008 at 04:23 PM
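The hash-based distribution described in the reply above can be illustrated with a minimal Python sketch. This is an assumption-laden illustration of the general idea (hash a segment, use the hash to pick a node), not Avamar's actual algorithm: the choice of SHA-1, the modulo mapping, and the names `node_for_segment` and `spread` are all hypothetical.

```python
import hashlib
from collections import Counter


def node_for_segment(segment: bytes, node_count: int) -> int:
    """Pick a node index for a data segment from the segment's hash.

    Assumption for this sketch: SHA-1 plus a simple modulo. The real
    product's hash function and mapping are not described in the post.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    digest = hashlib.sha1(segment).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % node_count


def spread(segments, node_count: int) -> Counter:
    """Count how many segments land on each node."""
    return Counter(node_for_segment(s, node_count) for s in segments)


if __name__ == "__main__":
    # Made-up segments, just to show that the load spreads across nodes.
    segments = [f"segment-{i}".encode() for i in range(10_000)]
    for node, count in sorted(spread(segments, 10).items()):
        print(f"node {node}: {count} segments")
```

Because the node is derived from the segment's content alone, the same segment always maps to the same node, and neither client nor server needs a hand-maintained list of which interface to target. With segments spread roughly evenly, each node's link carries about a tenth of the traffic in a 10-node cluster, which is where the aggregate bandwidth figure comes from.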
Hi, I am very new to Avamar and am having difficulty understanding how to migrate backups that my client has taken on a TSM server. We are changing over to Avamar. Any help you can give would be appreciated.
Posted by: abhishek | August 10, 2011 at 07:04 AM