One of the most important and most frequent questions I get asked is this: what is the best way to back up VMware?
Well, "best" tends to be a bit of a slippery term. Usually the answer to the question should be "it depends." But I thought I would take the time to explore what it really means when we say "it depends." What does it depend on? What are the pros and cons of the different approaches? What are the most important considerations? And what new choices do we have with vSphere 4.0?
Now I am going to explore the answer in the context of Avamar. Mostly I am going to do so because I work for EMC and it is convenient to be able to put a label on these products and processes. Having said that, almost everything I say about the costs and benefits, the value of the various approaches with respect to each other, and the drawbacks of the various approaches is true irrespective of the backup product you are using. Guest backup has certain costs and certain benefits--whether you use Avamar or something else. Where benefits are specific to Avamar I will identify them. And certain things that I identify as unique may only be unique as of the time of this writing; clearly some of the functionality is so desirable that other backup application vendors will be trying very hard to get this capability into their shipping products.
All that said, I think there are two primary methods of protecting a VM: guest level backup and image level backup. There are a couple of ways in which the number of options multiplies, however--do you use source deduplication or target deduplication? Do you want to use the array to generate a copy for image level backup, or do you want to rely on ESX/vSphere?
But let's take a look at the two primary methods. First: guest level backup (depicted below).
In this case, we really treat each VM as a physical machine. Each VM has an instance of an Avamar agent (backup client) installed. Each agent behaves as it would on a stand-alone physical machine and inspects the file system and data associated with that machine for changes every day. If there is a specific application that requires special backup integration (Exchange or Oracle for example) with a special backup agent, the Avamar agent goes about that activity and talks to VSS or RMAN just as it would on a physical system.
The fact that guest level backup treats a VM as a physical box is both its great strength and its great weakness. Great strength, in that you don't really have to do anything different than you did before: the process and procedure of backup is unchanged. Now Avamar has a huge advantage here because the duration of backup, and the CPU and network load of backup on your ESX server, will be vastly smaller with Avamar than with any other backup application. Avamar is as much as 90% faster than a "standard" backup from TSM, or NBU, or CV. And Avamar will reduce network load by as much as 99.5% or more. But those issues aside, guest level backup is much the same as the backup of a physical system.
Great weakness, because I can't leverage any of the unique functionality of VMware/ESX to improve data protection.
The second method--image level backup--really comes in two types. The first is the type you would do if you don't have Avamar 5.0 and vSphere 4.0: it is a VCB backup, depicted below.
In the case of a VCB backup, the ESX server will make a copy of the data associated with each VM in a sequential manner. In the case pictured above, where we have a mere 3 VMs per physical host, this means that the ESX server will copy the files associated with VM1 (mostly the .vmdk file) to a separate LUN, then the files for VM2, then the files for VM3. Those copies are then mounted to a proxy server, on which an Avamar agent runs, and backups are performed from this proxy server.
Practically speaking, not a lot of folks are doing this. The biggest reason is that it takes a long time to generate those VCB copies--about 4 hours per TB of source data (by way of a huge generalization). The flip side of this coin is that I am going to need more than one proxy server--probably at least one for every 5 to 10 physical ESX servers. So in a very large environment, this can mean a lot of additional infrastructure, and associated costs.
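To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only encodes the two rules of thumb above (about 4 hours of copy time per TB, and one proxy per 5 to 10 ESX servers). The function name and its parameters are my own illustration, not part of any VMware or Avamar tool, and the defaults are the same huge generalizations.

```python
import math

def vcb_estimate(source_tb, esx_hosts, hours_per_tb=4.0, hosts_per_proxy=10):
    """Rough VCB sizing from the rules of thumb in this post.

    source_tb       -- total TB of VM data to copy (assumed copied sequentially)
    esx_hosts       -- number of physical ESX servers
    hours_per_tb    -- assumed copy rate, about 4 hours per TB
    hosts_per_proxy -- assumed ratio, somewhere between 5 and 10
    Returns (copy_hours, proxies_needed).
    """
    copy_hours = source_tb * hours_per_tb
    proxies_needed = math.ceil(esx_hosts / hosts_per_proxy)
    return copy_hours, proxies_needed

# Example: 2 TB of VM data spread across 40 ESX servers
optimistic = vcb_estimate(2, 40, hosts_per_proxy=10)
pessimistic = vcb_estimate(2, 40, hosts_per_proxy=5)
print(optimistic, pessimistic)
```

Even at the optimistic end of the ratio, a 40-host environment needs several extra proxy servers before a single byte has been backed up.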
So, the VMware folks are pretty smart, and realized that traditional VCB backups were not all that attractive to the majority of users. As a result, vSphere 4.0 has implemented a number of improvements through the vStorage APIs to enable better data protection. Chad Sakac refers to this as "son of VCB" but truly it is just another way of generating an image level backup. It is depicted below.
In the case of vSphere 4.0 (and currently unique to Avamar 5.0 among the major backup products--and just keep your hats on, my Veeam friends) this means that the ESX host will have an additional VM, which acts as a proxy server for backups. The Avamar agent will reside only within this VM, and not within the individual VMs hosting guest OS images and applications.
During the course of normal operations, the ESX server tracks which blocks are physically changing within the VMFS file system. When a backup happens, the ESX server makes only the changed blocks available to the proxy server via a SCSI hot add command. The storage associated with the changed blocks is mounted by the Avamar agent in the proxy server (one each of Windows and Linux). The agent then inspects the data for commonality (this is source deduplication) and sends the unique backup objects to the Avamar server.
This type of image level backup captures the complete .vmdk for backup, but (unlike a VCB backup) also allows file level restore. So far the file level restore is for Windows only, and is unique to Avamar (single pass backup).
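The flow just described (track changed blocks, look only at those, send only what the server has not seen) can be sketched in a few lines of Python. This is a toy model and not how Avamar or the vStorage APIs are actually implemented: the DedupServer class, the backup_changed_blocks function, the fixed 4 KB blocks and the SHA-256 hashing are all assumptions made for the sake of illustration.

```python
import hashlib

class DedupServer:
    """Hypothetical stand-in for the backup server: keeps each unique chunk once."""
    def __init__(self):
        self.chunks = {}

    def has(self, digest):
        return digest in self.chunks

    def store(self, digest, data):
        self.chunks[digest] = data


def backup_changed_blocks(disk, changed, server):
    """Back up only the changed blocks, sending only data the server lacks.

    disk    -- dict mapping block number to bytes (a toy virtual disk)
    changed -- set of block numbers reported as changed since the last backup
    Returns (recipe, sent): recipe maps block number to content hash,
    sent counts the blocks actually transferred.
    """
    recipe = {}
    sent = 0
    for block_no in sorted(changed):
        data = disk[block_no]
        digest = hashlib.sha256(data).hexdigest()
        if not server.has(digest):   # source deduplication: check before sending
            server.store(digest, data)
            sent += 1
        recipe[block_no] = digest
    return recipe, sent


# Toy disk: blocks 0 and 2 hold identical data, block 1 did not change
disk = {0: b"A" * 4096, 1: b"B" * 4096, 2: b"A" * 4096, 3: b"C" * 4096}
server = DedupServer()
recipe, sent = backup_changed_blocks(disk, {0, 2, 3}, server)
print(len(recipe), sent)
```

Three blocks changed, but only two are transferred, because blocks 0 and 2 have the same content. The unchanged block is never read at all, which is where the savings over a full VCB copy come from.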
So: guest level backups, and two major types of image level backups. Those are the choices. But why would I do one rather than the other? Which is "better"? All of that is in the next post, which you can find here.