Overview
IT organizations are using disk-assisted backup and recovery to improve the performance and reliability of backup operations and to accelerate recovery. They are also extending retention time on disk to weeks or months to improve the likelihood that data can be recovered from disk. This emphasis on disk-assisted recovery reflects organizations’ low tolerance for downtime: as shown in Figure 1, ESG survey respondents indicated that 74% of Tier 1 applications, 47% of Tier 2 applications, and 26% of Tier 3 applications could withstand only 3 hours or less of downtime.[1]
But downtime is not IT’s only concern; data loss is equally challenging. To address it, organizations are performing backups more frequently. ESG research found that 43% of respondents reported making copies of Tier 1 data every hour or at fractional points within the hour.

Protecting data in virtual server environments introduces complications. Virtual machines running on a physical server share its available resources. Resource-intensive processes running simultaneously in virtual machines on the same physical host can cause resource contention, degrading the performance of every application workload that shares those physical resources.
What does this have to do with backup? There are two backup strategies: protecting the data files within the virtual machine and protecting the virtual disks.
- Protecting data files: Data protection approaches using agent technology in the guest operating system to facilitate file- or application-level data protection can be resource-intensive, which could trigger the aforementioned resource contention.
- Protecting virtual disks: Alternatively, protecting virtual machine files as a whole requires quiescing the virtual machine before capturing the disk image. This can be facilitated by hypervisor-level snapshot APIs that the server virtualization vendors have put in place. This approach, however, captures only a crash-consistent image, unless additional scripting is performed.
A crash-consistent image is not guaranteed to be transaction-consistent, and transaction consistency is critical for rapid, reliable recovery of database and e-mail applications. Recovery from a crash-consistent image taken at the hypervisor level is not always reliable, and it is more complex and time-consuming than recovery from an application-consistent image. Using a crash-consistent backup for recovery is therefore more likely to increase downtime and recovery risk.
What’s needed to overcome these challenges to protecting applications running in virtual machines? An application-aware recovery solution that offers low-overhead, application-consistent recovery options as well as very granular recovery, regardless of the hypervisor in use. InMage offers just such a solution.
Analysis
Enter Continuous Data Protection (CDP) Technology
Not only has CDP technology come a long way in the last several years, but business and IT requirements have reached a point where it is sorely needed. The challenges IT organizations face in protecting data have been exacerbated by data growth, increased criticality of applications that drive core business activities, and wider deployment of production applications running in virtual machines. Increased volume of data makes it difficult for IT to complete backups within prescribed time windows, the 24/7 nature of business leaves little to no room for downtime, and running production applications in virtual machines creates a dilemma for traditional backup and recovery.
CDP technology continuously captures changes to data at a file, block, or application level, supporting very granular data capture and recovery options. Using one of several methods, CDP timestamps every write and mirrors it to a retention log. When a recovery is needed, the CDP engine creates an image of the volume as of the requested point in time without disrupting the production application.
- Block-level CDP operates at the logical volume level and records every write. This type of CDP excels at transparent data capture and at presenting views of the data at different points in time. It protects applications without needing to understand application specifics.
- File-level CDP operates at the file system level and records any changes to the file system. This type of CDP typically runs on the same server as the application it’s protecting.
- Application-aware CDP can be layered on top of either block- or file-level CDP. It tracks not only every crash-consistent recovery point (enabling a rollback to any previous point in time) but also critical application process points within the CDP data stream that can greatly simplify recovery, such as transaction-consistent database checkpoints (for Oracle or SQL, for example) or application-consistent points within e-mail applications (such as Exchange).
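The block-level capture and point-in-time recovery described above can be illustrated with a minimal sketch. It is not any vendor's implementation: the `RetentionLog` class, its method names, and the in-memory dictionary standing in for a volume are all hypothetical, and writes are assumed to arrive in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass
class RetentionLog:
    """Hypothetical block-level CDP journal: every write is timestamped and kept."""

    # Block number -> contents of the volume before any journaled write.
    baseline: dict[int, bytes] = field(default_factory=dict)
    # Time-ordered (timestamp, block number, data) records.
    entries: list[tuple[float, int, bytes]] = field(default_factory=list)

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        # Assumption: writes are appended in timestamp order.
        self.entries.append((timestamp, block, data))

    def image_at(self, point_in_time: float) -> dict[int, bytes]:
        # Build a separate view of the volume; the baseline and the
        # journal (standing in for production data) are not modified.
        image = dict(self.baseline)
        for timestamp, block, data in self.entries:
            if timestamp > point_in_time:
                break
            image[block] = data
        return image


log = RetentionLog(baseline={0: b"A0", 1: b"B0"})
log.record_write(10.0, 0, b"A1")
log.record_write(20.0, 1, b"B1")
log.record_write(30.0, 0, b"A2")

# View of the volume as it stood at t=25, before the third write.
view_at_25 = log.image_at(25.0)
```

Because every write is retained with its timestamp, any moment covered by the log can serve as a recovery point; a real product would also need to bound the log's size and persist it, which this sketch omits.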
CDP provides two key data protection advantages. First, it completely eliminates discrete backups, replacing them with a transparent, continuous data capture process that puts very low overhead on production servers. Second, because it captures data as it is created, that data is immediately recoverable. This allows CDP-based solutions to deliver near-zero recovery point objectives (RPOs).
InMage offers a block-level, application-aware CDP solution that protects application and data workloads in Windows, Linux, and UNIX operating systems running on physical or virtual (VMware, Microsoft, and Citrix) machines. InMage uses low-overhead filter drivers to capture changes to workloads in real time. The changes are written to an InMage appliance on the LAN and from there to one or more recovery targets, which can be remote and/or local (the data flow is server source to appliance to server targets). But InMage has not just developed a CDP engine; the company has combined it with asynchronous replication, WAN optimization, and application failover/failback in a single, integrated recovery platform that supports rapid, reliable recovery for both data and applications locally or remotely.
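A short worked calculation may help show why continuous capture shrinks the window of data at risk. The schedule, the timestamps, and the `exposure_seconds` helper below are all hypothetical, chosen only to contrast hourly backups with per-write capture.

```python
from bisect import bisect_right


def exposure_seconds(capture_times, failure_time):
    """Seconds between the last capture at or before the failure and the failure."""
    index = bisect_right(capture_times, failure_time)
    if index == 0:
        raise ValueError("no capture precedes the failure")
    return failure_time - capture_times[index - 1]


# All times are seconds from an arbitrary start; values are illustrative.
hourly_backups = [0, 3600, 7200]
cdp_captures = [0, 1800, 3599, 5400, 6999]  # hypothetical per-write captures
failure = 7000

hourly_loss = exposure_seconds(hourly_backups, failure)
cdp_loss = exposure_seconds(cdp_captures, failure)
```

In this example, the hourly schedule leaves 3,400 seconds of changes unprotected at the moment of failure, while per-write capture leaves a 1-second gap since the last captured write.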
Application Consistency
The distinction between crash- and transaction-consistency really differentiates CDP solutions. When databases are shut down normally, all transactions are complete: data in the transaction logs is flushed and in-process transactions are completely written to disk, resulting in a disk-based image that is usable by the database. Transaction-consistent images speed recovery by minimizing database administrator intervention.
If a disruption occurs in an application environment before all transactions have been successfully written to disk, only a crash-consistent image will be available for recovery. A database administrator can roll back to a known transaction-consistent state and then apply transaction logs to get the restored image to a usable state at the point in time closest to the disruption. This process, however, can take longer and some data could be lost.
As an alternative, InMage takes advantage of application snapshot APIs (VSS and Oracle RMAN, for example) to track consistency points within the CDP data stream without interrupting the application or requiring significant overhead on the server. Instead of creating an actual snapshot each time the API is invoked, InMage just marks a consistency checkpoint in the stream, which takes considerably less time and has less impact on the application. Recovering data is just as streamlined, since any checkpoint can be converted into a recovery point without impacting production applications.
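The idea of marking a consistency point in the data stream, rather than taking a full snapshot, can be sketched as follows. This is an illustration under stated assumptions, not InMage's implementation: the `Write` and `Marker` types and both helper functions are hypothetical, and the stream is a plain in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    timestamp: float
    block: int
    data: bytes


@dataclass(frozen=True)
class Marker:
    """Hypothetical lightweight record: the application reported a consistent state."""
    timestamp: float
    label: str


def latest_consistent_point(stream, not_after):
    """Return the timestamp of the last marker at or before `not_after`, or None."""
    best = None
    for event in stream:
        if isinstance(event, Marker) and event.timestamp <= not_after:
            best = event.timestamp
    return best


def recover(stream, point_in_time):
    """Replay writes up to `point_in_time` into a fresh image."""
    image = {}
    for event in stream:
        if isinstance(event, Write) and event.timestamp <= point_in_time:
            image[event.block] = event.data
    return image


stream = [
    Write(1.0, 0, b"begin txn 1"),
    Write(2.0, 1, b"commit txn 1"),
    Marker(2.5, "database checkpoint"),
    Write(3.0, 0, b"begin txn 2"),  # failure strikes before txn 2 commits
]

crash_consistent = recover(stream, 3.0)         # includes the half-finished txn 2
checkpoint = latest_consistent_point(stream, 3.0)
app_consistent = recover(stream, checkpoint)    # stops at the marked checkpoint
```

The marker adds one small record to the stream and copies no data, which is why marking is cheaper than creating a snapshot. Recovery to the marker yields an image without the partial second transaction.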
Implications in Virtual Server Environments
IT organizations want the low overhead of virtual machine-level backup with the application-consistent recovery of an application-level backup. Unfortunately, resource-intensive backup processes running in virtual machines that share physical resources can spell trouble, and hypervisor-level APIs can provide application-consistent recovery points only if special scripts are written and run. The poor reliability of hypervisor-specific snapshots can increase recovery time and the possibility of data loss.
InMage uses a combination of CDP, application snapshot APIs, and asynchronous replication to create a low-overhead recovery solution that presents both application- and crash-consistent recovery points for any hypervisor environment. The CDP retention log, which can be maintained locally and/or replicated to one or more remote targets, keeps a time-ordered list of all writes for a given application and can punctuate that timeline with markers indicating application-consistent recovery points (which InMage calls AppShots). A filter driver (called a data tap) running in the physical server or in the guest operating system provides a mechanism to enforce application consistency while consuming very few resources. The approach is the same for all physical and virtual environments.
The Bigger Truth
Respondents to ESG’s 2010 IT Spending Intentions survey ranked “improving virtual machine backup and recovery” as their second priority, behind physical-to-virtual consolidation efforts. As server virtualization gains in adoption, organizations are recognizing the need to reconcile risk mitigation with the most cost- and operationally-efficient approaches.
InMage’s solution captures data continuously, eliminating the need for a backup window and providing extremely granular recovery options. Its host-offload architecture makes it well suited for application workloads running on physical or virtual machines as it minimizes overhead, reducing the impact of data protection operations on applications. Finally, InMage’s technique for capturing transaction-consistent images delivers reliability in addition to reducing downtime and data loss. InMage is well matched for overcoming the challenges of protecting application workloads in virtual environments.
[1] Source: ESG Research Report, Data Protection Market Trends, April 2010.





