Building on the revised data protection and backup methodology described in my previous post, I think there is another important notion here: that of a data protection taxonomy.
As we think about moving backup away from a host-centric application and toward a data-centric service, I think we need a way to consistently describe the data protection characteristics of a data set. This description needs to be completely independent of any storage array, application, data type, vendor, target, or network.
We need a simple, consistent means of sharing an entire data protection policy between any device or application responsible for providing data protection services.
I could then associate the data protection policy with a data set, and any data protection service provider could interpret it and provide the mandated service level.
Basically, any data object could have such a policy associated with it. And I could bind that policy to it in all kinds of interesting places.
How about binding a policy to a VM and making it available through the APIs on ESX or vSphere? How about binding a policy to a database and making it available through the Oracle APIs? How about binding a policy to an OS? Better yet, how about binding it to a consistency group or LUN on an array?
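To make that a bit more concrete, here is a rough sketch of what a vendor-neutral policy and its bindings might look like. Everything in it is an assumption on my part for illustration: the `ProtectionPolicy` class, its field names (`rpo_minutes`, `rto_minutes`, `retention_days`, `copies`), the object identifiers, and the choice of JSON as the interchange format. No such standard exists today.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ProtectionPolicy:
    """Hypothetical vendor-neutral policy; every field name is an assumption."""
    name: str
    rpo_minutes: int      # maximum tolerable data loss
    rto_minutes: int      # maximum tolerable time to restore
    retention_days: int   # how long protected copies are kept
    copies: int           # number of independent copies required

    def to_json(self) -> str:
        # Plain JSON, so any provider can parse it without vendor libraries.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ProtectionPolicy":
        return cls(**json.loads(text))


# Hypothetical registry mapping a data object identifier to its policy.
bindings = {}


def bind(object_id: str, policy: ProtectionPolicy) -> None:
    bindings[object_id] = policy


gold = ProtectionPolicy("gold", rpo_minutes=15, rto_minutes=60,
                        retention_days=90, copies=2)

# The same policy can be bound to very different kinds of data objects.
bind("vm:payroll-01", gold)
bind("lun:array7/cg3", gold)

# Round trip: what one party publishes, another can read back unchanged.
wire = gold.to_json()
assert ProtectionPolicy.from_json(wire) == gold
```

The point of the sketch is that the policy says nothing about arrays, applications, or vendors. It only states what level of protection the data needs.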
At this point, with the appropriate credentials, any data protection service provider, whether it offers archival services, Continuous Data Protection (CDP) services, or backup services (hosted on an appliance, on an array, or in a traditional backup application), can read or request the policy and provide the required data protection service.
More practically: any service provider, from any vendor, can act upon any data set resident on any storage.
How is that for no more vendor lock-in?
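As a minimal sketch of that idea, the snippet below has several providers read the same policy and decide for themselves whether they can meet it. The `Policy` and `Provider` classes, their fields, and the provider names are all hypothetical, and the credential check is left out entirely to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy reduced to two requirements for brevity."""
    rpo_minutes: int      # maximum tolerable data loss
    retention_days: int   # how long protected copies must be kept


@dataclass(frozen=True)
class Provider:
    """Hypothetical service provider advertising what it can deliver."""
    name: str
    best_rpo_minutes: int     # tightest recovery point it can achieve
    max_retention_days: int   # longest retention it can offer

    def can_satisfy(self, policy: Policy) -> bool:
        return (self.best_rpo_minutes <= policy.rpo_minutes
                and self.max_retention_days >= policy.retention_days)


def eligible(providers, policy):
    """Return the names of providers able to deliver the mandated service level."""
    return [p.name for p in providers if p.can_satisfy(policy)]


providers = [
    Provider("cdp-appliance", best_rpo_minutes=1, max_retention_days=30),
    Provider("nightly-backup", best_rpo_minutes=1440, max_retention_days=2555),
    Provider("array-snapshots", best_rpo_minutes=15, max_retention_days=90),
]

policy = Policy(rpo_minutes=60, retention_days=60)
print(eligible(providers, policy))  # prints ['array-snapshots']
```

Note that the policy never names a provider. Each provider interprets the same description and either qualifies or does not, which is what keeps the data set independent of any one vendor.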