Avamar and VMware Backup
Following up on the previous post, there is a second case where we have seen Avamar backup be enormously compelling, and that is the case of VMware backups.
Previously, I had said that there are really two things that Avamar does better than anything else: it is fast, and it uses very little bandwidth. And while the latter is important in the case of remote backups, both of these are crucial when it comes to managing VMware backups.
Why do we care about VMware backup? Because basically VMware backups are painful. With a traditional backup application, we have two choices about how to back up VMware, and both of them are bad. (There is actually a third, and it is bad too.)
The first choice is to run a backup client in every guest OS that is running on an ESX server. However, when I have 8 or 10 or 15 or more guest OS instances running on a single physical system, I have a problem: both my CPU and my network bandwidth are constrained. CPU because I am hoping that each guest OS is only using 5% of the available CPU resource, give or take, and network because all those guest OS virtual machines are often sharing a single (physical) GigE network connection. But when I fire up my backup process in each of those guests, all of a sudden I have 8 or 10 or 15 processes that would all happily use up much more than 5% of the CPU resources, and equally happily hog most, if not all, of the available network resources. In short, I get (bad) contention for two constrained resources. This is not good at all.
The second choice I have is that I can do a VMware Consolidated Backup (VCB) from a proxy server. If I do this, I am saying the first choice is sufficiently painful that it is no choice at all. So I am going to take all the files associated with a given ESX server, and mount them to a proxy server. That proxy server will mount the VMware file system, and see a bunch of vmdk files--one for every guest OS. Normally, I would then use my "traditional" backup and recovery application to back up those vmdk files to my backup server. However, something awful happens along the way.
Whenever I flip even a single bit inside a guest OS on an ESX server, the vmdk file changes. Remember, as an external server, I have no ability to peer inside that vmdk file. I am mounting one huge file that represents all of the data (OS, application, file, and state data) associated with a given virtual machine. Change even a single file, a single bit, and that vmdk file appears as a changed object to a backup application. These vmdk files can be 10 GB, 20 GB, even 100 GB or more.
Which means: every time a single thing changes within the guest OS, even an incremental backup will capture the entire vmdk file as it will recognize that the file has changed.
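To make that concrete, here is a minimal sketch of how a file-level incremental decides what to send. It is not any particular product's logic: the whole-file SHA-256 fingerprint simply stands in for whatever changed-file check (timestamp, size, checksum) a traditional application uses, and the function names and the toy 1 MiB "vmdk" are made up for illustration.

```python
import hashlib


def file_fingerprint(data: bytes) -> str:
    """Whole-file fingerprint, standing in for an mtime/size/checksum check."""
    return hashlib.sha256(data).hexdigest()


def incremental_bytes(previous: dict, current: dict) -> int:
    """Bytes a file-level incremental would send: every changed file, in full."""
    sent = 0
    for name, data in current.items():
        old = previous.get(name)
        if old is None or file_fingerprint(old) != file_fingerprint(data):
            sent += len(data)  # the whole file goes, not just the changed part
    return sent


# Toy 1 MiB "vmdk" (hypothetical stand-in for a 30 GB file), one bit flipped.
vmdk_before = bytes(1024 * 1024)
vmdk_after = bytearray(vmdk_before)
vmdk_after[12345] ^= 0x01

sent = incremental_bytes(
    {"guest1.vmdk": vmdk_before},
    {"guest1.vmdk": bytes(vmdk_after)},
)
print(f"changed 1 bit, incremental sends {sent} of {len(vmdk_before)} bytes")
```

One flipped bit, and the "incremental" ships the entire file.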
If I have an ESX server running 10 guest OS instances, and each is allocated 30 GB, I have now allocated 300 GB to that ESX server. Which means, following the logic, that every day (if I am doing a VCB backup) I will back up 300 GB.
VCB backups eliminate any ability to do real incremental backup.
VCB backups result in the full backup of the entire ESX server every day.
In my example, I will back up 300 GB every day--irrespective of how much has changed on each guest OS.
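The arithmetic behind that example, as a quick sketch. The function name is made up, and the seven-day total is just the daily figure multiplied out to show how quickly it adds up.

```python
def daily_vcb_backup_gb(guest_count: int, allocated_gb_per_guest: int) -> int:
    """Data moved per day when every vmdk is captured in full."""
    return guest_count * allocated_gb_per_guest


daily_gb = daily_vcb_backup_gb(guest_count=10, allocated_gb_per_guest=30)
weekly_gb = daily_gb * 7  # same volume every day, whatever the change rate

print(f"per day: {daily_gb} GB, per week: {weekly_gb} GB")
```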
Hello explosion in backup capacity. Which means that sometimes people choose the first painful way of doing backups, because the second painful way is just too expensive, and too painful.
Not really a choice that any of us would like to have, is it?
Enter Avamar.
Avamar makes both choices good choices.
In the first case, we can run an Avamar client in each guest OS. In this case, what we see is higher CPU utilization, but for much shorter periods of time. So, since each guest will only take 10% or 20% as long to back up, we can tolerate the higher CPU load. Further, and even better, network utilization is going to be drastically reduced. Avamar will only require about 0.5% or less of the network capacity that a full backup would require. A full backup (snapup) of 10 guest virtual machines would therefore only require 5% of the network capacity that a single full backup of a single virtual machine would require with a traditional backup application.
This is a big step forward: an 80% to 90% reduction in backup time, and a 99.5% reduction in bandwidth requirements.
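Here is a back-of-the-envelope version of those guest-level numbers. The 10% to 20% time fraction and the 0.5% network fraction come from the figures above; the 60-minute traditional backup window and the function name are assumptions made purely for illustration.

```python
def guest_level_estimate(guests, traditional_minutes, time_fraction,
                         network_fraction=0.005):
    """Return (minutes per guest backup, network used by all guests
    expressed as a fraction of ONE traditional full backup)."""
    backup_minutes = traditional_minutes * time_fraction
    network_vs_one_full = guests * network_fraction
    return backup_minutes, network_vs_one_full


# Hypothetical: 10 guests, each taking 60 minutes with a traditional client.
for time_fraction in (0.10, 0.20):
    minutes, network = guest_level_estimate(10, 60, time_fraction)
    print(f"{time_fraction:.0%} of the time: {minutes:.0f} min per guest, "
          f"network = {network:.1%} of one traditional full backup")
```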
The second choice is even more compelling. Particularly as we increase the number of ESX servers that we are responsible for.
We will still conduct a VCB style backup. The proxy server will mount a bunch of vmdk files, one per guest OS, just like before. But when Avamar gets its hands on the vmdk files, it tears them apart, and does a segment level deduplicated backup. Which means that rather than capturing the entire vmdk file with every backup, like a traditional backup application does, Avamar will capture only a tiny fraction of that file, probably less than 0.5%.
Rather than back up an entire 30 GB vmdk file, every time, Avamar will back up 150 MB.
Even if we are experiencing exceptionally high change rates on the source data, this figure may only inflate to 1.5 GB (95% deduplication). A tremendous savings.
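For anyone who wants to see why segment level deduplication changes the picture, here is a toy sketch. It is not Avamar's algorithm: the fixed 4 KiB segment size, the SHA-256 hash, and the in-memory dictionary standing in for the backup server are all assumptions chosen to keep the example short. The principle is the same, though: only segments the store has never seen get sent.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed fixed segment size, for illustration only


def segments(data: bytes, size: int = SEGMENT_SIZE):
    """Split data into consecutive fixed-size segments."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def backup(data: bytes, store: dict) -> int:
    """Add unseen segments to `store` (hash -> segment); return bytes sent."""
    sent = 0
    for seg in segments(data):
        digest = hashlib.sha256(seg).hexdigest()
        if digest not in store:
            store[digest] = seg
            sent += len(seg)
    return sent


# Toy 1 MiB "vmdk" built from 256 distinct 4 KiB segments.
vmdk = bytearray(b"".join(i.to_bytes(4, "big") * 1024 for i in range(256)))
store = {}

first = backup(bytes(vmdk), store)   # nothing in the store yet
vmdk[12345] ^= 0x01                  # flip one bit inside the "guest"
second = backup(bytes(vmdk), store)  # only the touched segment is new

print(f"first backup sent {first} bytes, second sent {second} bytes")
```

The first backup sends everything; after the one-bit change, the second sends a single segment rather than the whole file.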
This is also a big step forward: a smaller, but still significant, reduction in backup time, and an equal 99.5% reduction in bandwidth and storage requirements for backup.
Two other very good things happen too. First, because the backup consumes so much less time, you can back up more ESX servers through a single ESX proxy. Experience would say that we can make the following generalization: you can back up twice as many ESX servers with a proxy VCB backup with Avamar as you can with a traditional backup application. Second, you have object level restore at the guest OS level. I can restore a single file (not the entire vmdk) from the perspective of the guest OS.
So, with Avamar, either means of backing up VMware becomes not just possible, but palatable. Both offer huge savings in time, CPU, and network bandwidth. Both offer huge savings in the required back end storage; in particular VCB backups can be accomplished with a massive savings in required backup storage.
With all that said, I am going to throw out one or two caveats around Avamar in the next post, as well as a modified version of my deduplication calculator that accounts for the differences between source (Avamar) and target (DL3D) deduplication.
One interesting thing mentioned at EMCWorld was the idea that you could do guest level Avamar backups and console level (vmdk) backups with or without VCB proxy in place. But more interesting than that was the idea that, because of the way Avamar segments and hashes the data, you would get de-duplication benefits across guest level and vmdk backups for the same guest OS instance. That at least makes sense in theory since it is, in fact, the same data. So you could do _both_ backup types at a relatively minimal cost. Not necessarily saying that is a wise or useful choice, but it's interesting to consider.
I wonder if you have any commentary or experience with Symantec's PureDisk vs Avamar, which also does de-dupe at the source.
Posted by: Scott Harney | May 29, 2008 at 10:35 AM
Absolutely true... you could do both, and end up with virtually no more retained backup data than just doing one.
Incidentally, console level backups are the third (bad) means of backup I alluded to. I am not sure that it is a good idea from a go forward point of view to do this--the console is just too thin on resources.
And yes I will do an Avamar vs. Puredisk comparison in the next couple of weeks.
Posted by: Scott Waterhouse | May 29, 2008 at 12:02 PM