VMware Site Recovery for VMware Cloud FAQ
Introduction and General Information
- What is the VMware Site Recovery Service?
VMware Site Recovery brings trusted replication, orchestration, and automation technologies to VMware Cloud to protect applications in the event of site failures. The service is built on an industry-leading recovery plan automation solution that includes VMware Site Recovery Manager™ and native hypervisor-based replication via VMware vSphere® Replication™. The service provides an end-to-end disaster recovery solution that can help reduce the requirements for a secondary recovery site, accelerate time-to-protection, and simplify disaster recovery operations.
- Where is VMware Site Recovery available today?
The service is available in all regions where VMware Cloud on AWS is available, including the AWS GovCloud (US) region.
- What protection configurations are supported?
VMware Site Recovery can protect:
- Workloads running in an on-premises data center to a VMware Cloud on AWS SDDC
- Workloads running on a VMware Cloud on AWS SDDC to an on-premises data center
- Workloads running on one VMware Cloud on AWS SDDC to a different VMware Cloud on AWS SDDC
- What are the differences between VMware Site Recovery and Site Recovery Manager?
Site Recovery Manager is software that customers install and manage themselves. It can be used on-premises as well as with hyperscaler offerings such as Azure VMware Solution, Google Cloud VMware Engine, and Oracle Cloud VMware Solution.
VMware Site Recovery is a service from VMware. It is supported on both VMware Cloud on AWS and VMware Cloud on Dell. It is managed and maintained by VMware and available for use as a service by customers.
- What is the minimum version of vCenter required at the paired on-premises datacenter to support VMware Site Recovery?
The minimum vCenter version at the paired on-premises datacenter depends on the versions of Site Recovery Manager and vSphere Replication deployed there. Use the VMware Product Interoperability Matrices for VMware vCenter Server and Site Recovery Manager to find the minimum vCenter version for your Site Recovery Manager version, and the matrices for VMware vCenter Server and vSphere Replication to find the minimum for your vSphere Replication version. For example, if Site Recovery Manager and vSphere Replication 8.2 are deployed on the paired on-premises datacenter, the minimum supported vCenter version is 6.0 U3 according to the matrices. Note that Site Recovery Manager and vSphere Replication at the paired on-premises datacenter can be no older than the N-1 version of the corresponding VMware Site Recovery components on the VMware Cloud on AWS SDDC, as explained in the next question.
- Does the on-premises version of vSphere, vCenter and Site Recovery Manager have to match those deployed in VMware Cloud?
No. VMware Site Recovery was designed to provide flexibility in the versions of the components deployed by a customer in their on-premises datacenter and those deployed and managed by VMware in VMware Cloud on AWS. VMware Site Recovery is compatible with the N-1 version of Site Recovery Manager and vSphere Replication on the paired on-premises datacenter. For example, if the current version of VMware Site Recovery is 8.5, the supported versions for Site Recovery Manager and vSphere Replication on the paired on-premises datacenter are 8.4 and later.
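The N-1 rule above can be expressed as a small check. This is a minimal sketch, assuming versions are written as "major.minor" strings and that N-1 means one minor release back within the same major line (as in the 8.5 and 8.4 example). The function names are hypothetical, and the interoperability matrices remain the authoritative source.

```python
def parse_version(text):
    """Split a 'major.minor' string such as '8.5' into a tuple of ints."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def minimum_on_prem_version(service_version):
    """Return the oldest on-premises version allowed under the N-1 rule.

    Assumption: N-1 is the previous minor release in the same major line.
    """
    major, minor = parse_version(service_version)
    if minor == 0:
        # N-1 would cross a major line, which this sketch does not model.
        raise ValueError("check the interoperability matrices for x.0 releases")
    return major, minor - 1


def is_supported_on_prem(service_version, on_prem_version):
    """True if the on-premises version is N-1 or later."""
    return parse_version(on_prem_version) >= minimum_on_prem_version(service_version)
```

For example, `is_supported_on_prem("8.5", "8.4")` is true, while `is_supported_on_prem("8.5", "8.3")` is false.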
Pricing and Licensing
- How is VMware Site Recovery service packaged and priced?
VMware Site Recovery is a separate, add-on service that is priced and charged separately from VMware Cloud on AWS. Please visit the pricing page for the latest information on pricing. The list price of VMware Site Recovery includes the Site Recovery Manager and vSphere Replication components for both the VMware Cloud on AWS SDDC instance and the on-premises data center. The pricing also includes support.
- Can I apply existing VMware Site Recovery Manager (SRM) licenses to enable VMware Site Recovery?
No, VMware Site Recovery service is a separately priced and licensed solution. Please visit the pricing page for the latest information on pricing.
- Are there any additional charges to use VMware Site Recovery in a multi-site configuration?
There are no additional charges to use VMware Site Recovery in a multi-site configuration such as fan-in, fan-out or other complex topologies. The standard pricing applies to all of the virtual machines you protect using VMware Site Recovery.
- Are vCenter Server licenses required for both the protected and recovery sites?
Yes, Site Recovery Manager requires two active and licensed vCenter Server instances, one at each site (protected and recovery).
NOTE: The shared recovery site feature in Site Recovery Manager enables multiple protected sites, each with its own vCenter Server instance, to be recovered at a site with a single vCenter Server instance. The multiple instances of Site Recovery Manager running at the shared recovery site register with that single vCenter Server instance, so you do not need multiple vCenter Server instances at the shared recovery site.
- After failover, what are the license requirements for failback?
VMware Site Recovery is billed per replicated VM, regardless of the direction. The pricing for failback is the same as failover.
- What license keys does VMware Site Recovery use?
VMware Site Recovery uses a special license key that is added and managed by the service. No user involvement is required.
- Which replication software is supported?
VMware Site Recovery currently only supports/utilizes vSphere Replication.
- Can VMware Site Recovery protect workloads on physical servers?
Site Recovery Manager orchestrates the recovery process for virtual machines. In cases in which some workloads are running on physical servers with a separate disaster recovery solution, Site Recovery Manager can coordinate the recovery process by allowing users to create custom scripts that ensure that workloads are restored in the appropriate order.
- Do I need to use a specific version of vSphere and vSphere Replication in my on-premises data center to take advantage of the new feature "Seamless Disk Resizing"?
Yes, you need to deploy version 7.0 (or later) of vSphere and version 8.3 (or later) of vSphere Replication in your on-premises datacenter to take advantage of the new feature "Seamless Disk Resizing".
- Does VMware Site Recovery provide automated failback?
Yes, VMware Site Recovery provides automated failback. The first step is to perform a “reprotect” of the virtual machines from the failover site to the original production site. This consists of coordinating the reversal of replication to the original site and mapping virtual machines back to their original virtual machine folders, virtual switches, and resource pools. The second step is to execute a planned migration back to the original site, using the original recovery plan executed in the reverse direction.
- What is the difference between planned migration and DR failover?
Planned migration and DR failover both leverage the same recovery plans. DR failover is used in the event of a disaster and is designed to quickly recover virtual machines at the failover site. Planned migration is used for preventive failovers or for routine migrations. Planned migration ensures an orderly shutdown of virtual machines at the protected site, synchronizes the data with the failover site by ensuring complete replication of all the data, and finally recovers virtual machines at the failover site. Planned migration ensures application consistency to the secondary site with no data loss.
- Does Site Recovery Manager provide application-consistent or crash-consistent recovery?
vSphere Replication supports optional VSS-based application consistency for Windows environments and Linux file system quiescing. When executing a planned migration (as opposed to DR failover), VMware Site Recovery provides fully application-consistent migrations between sites, since virtual machines are gracefully shut down before completing replication and initiating the recovery plan.
- Does VMware Site Recovery support active/active sites?
Yes, VMware Site Recovery supports configurations in which both sites are running active virtual machines that VMware Site Recovery can recover at the other site. VMware Site Recovery also supports active/passive sites in which Site Recovery Manager recovers virtual machines from a protected site at a recovery site that is not running other virtual machines during normal operation.
In an active/active scenario, users configure recovery plan workflows in one direction from Site 1 to Site 2 for the protected virtual machines at Site 1. Recovery plan workflows are configured in the opposite direction from Site 2 to Site 1 for the protected virtual machines at Site 2.
- Does Site Recovery Manager support a many-to-one disaster recovery configuration?
Yes. Site Recovery Manager provides the option to protect multiple sites using a common “shared recovery site”. At this shared recovery site, you will still need to have multiple instances of Site Recovery Manager running. Each instance manages the pairing with one of the protected sites. However, to provide simpler disaster recovery management in a many-to-one configuration, only one instance of vCenter Server is required at the shared recovery site. All instances of Site Recovery Manager register with that single vCenter Server instance. Please consult the product documentation for more details on how to use this feature.
In addition to the shared recovery site configuration, Site Recovery Manager also allows and supports shared protected site (1:N) and many-to-many (N:N) configurations. It is also supported to begin with a standard two site SRM deployment and later on add additional site pairings to add in more complex topologies. Keep in mind that while Site Recovery Manager does allow for the failover of different VMs to different sites, it does not support the failover of the same VM to multiple recovery sites. Site Recovery Manager only supports a VM being protected by a single Site Recovery Manager pair.
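The single-pair restriction above can be checked against a planned topology before it is deployed. The following is a minimal sketch, assuming the topology is described as a plain dictionary from a site-pair label to the VMs that pair protects. The data layout and function name are hypothetical and not part of any Site Recovery Manager API.

```python
from collections import defaultdict


def vms_protected_by_multiple_pairs(protection):
    """Find VMs that appear under more than one site pair.

    protection: dict mapping a site-pair label (e.g. "A->B") to the list of
    VM names that pair protects. Returns a dict mapping each offending VM to
    the sorted list of pairs that claim it.
    """
    pairs_by_vm = defaultdict(set)
    for pair, vms in protection.items():
        for vm in vms:
            pairs_by_vm[vm].add(pair)
    return {vm: sorted(pairs) for vm, pairs in pairs_by_vm.items() if len(pairs) > 1}
```

An empty result means every VM is protected by exactly one pair, which is the only arrangement the text describes as supported.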
- Does VMware Site Recovery require two active vCenter Server instances?
Yes, Site Recovery Manager requires two active and licensed vCenter Server instances, one at each site (protected and recovery). NOTE: The shared recovery/protected site feature in Site Recovery Manager enables multiple protected or recovery sites with multiple vCenter Server instances to be recovered or protected at a site with a single vCenter Server instance. The multiple instances of Site Recovery Manager running at the shared recovery/protected site register with the same single instance of vCenter Server at that site, so you do not need multiple vCenter Server instances there.
- What changes and doesn't change when VMware Site Recovery fails over a VM?
SRM coordinates replication of the VMX file and moves the VM to a new vCenter Server, so the attributes it preserves are those stored in the VMX file rather than those specific to the protected site's vCenter Server.
What is preserved:
- GUID (note that the placeholder VM at the recovery site will have its own UUID. However, after a failover, the recovered VM will have the same UUID as it did at the protected site)
- MAC address
- VM config (nics, drives, etc)
What is not preserved:
- Reservations/limits (these can be configured on the placeholder, or even better, use a resource pool and map it in SRM)
- DRS configurations (affinity/anti-affinity rules, DRS groups, etc)
- VM permissions
More details on reservations, affinity rules and limits:
When Site Recovery Manager recovers a virtual machine to the recovery site, it does not preserve any reservations, affinity rules, or limits that you have placed on the virtual machine. Site Recovery Manager does not preserve reservations, affinity rules, and limits on the recovery site because the recovery site might have different resource requirements to the protected site.
You can set reservations, affinity rules, and limits for recovered virtual machines by configuring reservations and limits on the resource pools on the recovery site and setting up the resource pool mapping accordingly. Alternatively, you can set reservations, affinity rules, or limits manually on the placeholder virtual machines on the recovery site.
- Can VMware Site Recovery failover automatically?
Technically yes. Is it recommended? No. SRM workflows, including failover, can be triggered by a script. However, in almost all scenarios, failing over in an automated fashion is a poor idea. There is a lot of risk associated with it and a lot of potential liability for failing over due to incorrect reasoning. Failing over automatically in test mode, however, makes a lot of sense. For more details, see the blog post “Automating Failover with SRM and PowerCLI”.
- What are the requirements for having SRM change VM IP addresses?
There are two requirements.
1. The VM must have a supported version of VMtools installed.
2. The OS on the VM must be compatible with vCenter's Guest OS Customization feature. This can be checked here.
- Does Site Recovery Manager support virtual machines using raw disk mapping (RDM) disks?
Yes, Site Recovery Manager provides full support for virtual machines using RDMs.
- Does Site Recovery Manager require that protected site and recovery site networks be the same?
No. Site Recovery Manager can change the IP address and VLAN of virtual machines at the time of recovery to the configuration the user specifies during setup.
- Does Site Recovery Manager update Domain Name System (DNS) tables at the recovery site?
Site Recovery Manager can update the IP address and virtual switch for recovered virtual machines but does not update DNS tables at the recovery site. However, both Windows and Linux have dynamic DNS options that can do this. There is an example script for updating Microsoft DNS and BIND servers included in the scripts folder on Site Recovery Manager servers.
- During failover, are virtual machines shut down and started serially or in parallel?
Virtual machines are shut down in the reverse of the order in which they are powered on. The user can specify the order in which virtual machines must be started, either serially because of dependencies, or in parallel if required.
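The relationship between power-on order and shutdown order can be illustrated with a short sketch. It assumes the power-on plan is modeled as a list of groups, where groups start one after another and VMs inside a group start in parallel. That representation and the function name are hypothetical and only illustrate the ordering rule.

```python
def shutdown_order(power_on_groups):
    """Return the shutdown sequence for a given power-on sequence.

    power_on_groups: list of lists of VM names. Groups are started serially;
    VMs within one group are started in parallel. Shutdown walks the same
    groups in reverse, so dependencies are stopped last.
    """
    return [list(group) for group in reversed(power_on_groups)]
```

With a power-on plan of `[["db01"], ["app01", "app02"], ["web01"]]`, the web tier is shut down first and the database last.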
- How much overhead does Site Recovery Manager place on each virtual machine?
Site Recovery Manager does not run any components in the virtual machine or on the vSphere ESX® server during normal operation, so it does not affect the performance of virtual machines.
- How much bandwidth is required between sites?
Bandwidth requirements depend on the amount of data being replicated, the frequency of replication and the specific replication software. Site Recovery Manager sends very little information between sites itself and as a result generally has no impact on the bandwidth required between sites. If using vSphere Replication, use the vSphere Replication Bandwidth Calculator to estimate bandwidth requirements. If using array-based replication, your replication vendor can help to determine the required bandwidth for replication.
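As a rough illustration of how data volume and replication frequency drive bandwidth, the sketch below computes a lower bound: the data that changes within one replication window has to be transferred within that window. It is not the vSphere Replication Bandwidth Calculator, and it ignores compression, protocol overhead, and bursty change rates. The function name and the decimal unit conversion (1 GB = 8000 megabits) are assumptions.

```python
def min_bandwidth_mbps(changed_gb_per_window, window_minutes):
    """Lower-bound bandwidth in megabits per second.

    changed_gb_per_window: data changed during one replication window, in GB.
    window_minutes: length of the replication window, in minutes.
    """
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    megabits = changed_gb_per_window * 8 * 1000  # GB -> megabits (decimal units)
    return megabits / (window_minutes * 60)
```

For instance, 9 GB of changed data per 60-minute window needs at least 20 Mbps on average. Real requirements are typically higher.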
- Does Site Recovery Manager verify that the virtual machines have booted successfully at the recovery site?
Yes. Site Recovery Manager monitors whether VMware Tools has started running in each virtual machine to determine whether the virtual machines have booted successfully.
- If Site Recovery Manager is running in a virtual machine and the virtual machine fails, can Site Recovery Manager still execute failover?
Execution of recovery does not depend on the vCenter Server or Site Recovery Manager Service at the protected site. However, recovery does depend on having a running vCenter Server and Site Recovery Manager Service at the recovery site. When the Site Recovery Manager Service is running in a virtual machine, vSphere High Availability can be used to restart the Site Recovery Manager virtual machine in the event of a physical server failure.
- How does Site Recovery Manager handle loss of network connectivity between sites?
Site Recovery Manager notifies the administrator when it cannot connect to the remote site. Failover is always manually initiated to avoid split-brain scenarios. Recovery does not require connectivity to the protected site.
- Are logs of test and failover execution exportable from Site Recovery Manager?
Yes. They are available in the History section of each Recovery Plan.
- Can Site Recovery Manager automatically initiate failover?
Site Recovery Manager does not automatically initiate failover. Failover initiation must be done manually. A best practice is to strictly limit which users have permission to initiate failover. Site Recovery Manager does include an SDK that can be used to externally initiate failover if required.
- If we have two sites with enough bandwidth between them, why do we need Site Recovery Manager rather than just using vSphere vMotion® between the sites?
vSphere vMotion® is useful only when the virtual machine is still running. If an outage occurs, vSphere vMotion has no running virtual machine to operate on. Site Recovery Manager is designed to handle cases in which virtual machines are no longer running at the production site because of an outage and must be recovered at a recovery site.
- Can Site Recovery Manager be integrated with other disaster recovery management software?
VMware provides an SDK for Site Recovery Manager that enables some degree of custom integration with other disaster recovery software. Site Recovery Manager does not provide built-in integration with third-party software products other than array-based replication software.
- What happens if I run a DR workflow with the sites disconnected and then want to use my recovery plan again?
After running the recovery, reconnect the sites (first making sure that the originally protected VMs are powered off) and run a planned migration using the same recovery plan that was used for the disaster recovery. SRM knows that it has already performed a failover and skips the unnecessary steps. Rerunning the plan ensures that the steps that were not completed at the original site are completed, and gets the plan back into the appropriate state to reverse the direction of replication.
- How is a forced failover different from a regular failover?
Forced recovery was introduced to handle a scenario where the source array is down but the vSphere layer is up. In early versions of SRM, that scenario meant failovers would sometimes time out waiting for a reply. Forced recovery fixes that bluntly by telling SRM to override its normal safety checks, so be careful how you use it. If the original site returns after running a forced recovery, to get things back in sync, run the failover again, then run it as a planned migration.
- Do I need to have VMtools installed on my VMs?
Having VMtools installed on VMs recovered by SRM is not required, though it is helpful. SRM will use VMtools for a few things related to recovery:
- VMtools will be used to shut down VMs gracefully as opposed to powering them off
- VMtools heartbeats will be used to let SRM know that a VM is ready and to start the next VM or step in the recovery plan
- VMtools are used to customize IP addresses when supported by the VM OS
- What if I want different recovery settings (IP address, start order, etc) for the same VM in different recovery plans?
That is currently not supported. Customization settings in SRM are associated with protected virtual machines. If the same protected virtual machine is a part of multiple recovery plans, then all recovery plans use a single copy of the customization settings.
- Can I have a VM get one address during a test and another during an actual failover?
You configure IP customization as part of configuring the recovery properties of a virtual machine. As a result, when you use SRM IP customization, a VM receives the same IP address in both a test and a recovery.
It would be possible to write and run a script as part of recovery to detect whether a test or actual failover was being run and to have the script customize the IP address accordingly. That said, the main idea of running a test with SRM is to duplicate the actual failover so using different addresses for test and recovery doesn’t fit with that, which is why it isn’t supported directly in the product.
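A script of the kind described above might look like the following sketch. It assumes the recovery step exposes the run type to the script in an environment variable named `VMware_RecoveryMode` with the values `test` and `recovery`; verify the variable name and values against the Site Recovery Manager documentation for your version. The addresses and the function name are hypothetical, and actually applying the address inside the guest is left out.

```python
import os

# Hypothetical addresses from documentation-reserved ranges.
TEST_IP = "192.0.2.10"
RECOVERY_IP = "198.51.100.10"


def choose_ip(environ=None):
    """Pick the address to apply based on the type of run.

    Assumption: the run type is passed in VMware_RecoveryMode.
    """
    environ = os.environ if environ is None else environ
    mode = environ.get("VMware_RecoveryMode", "").lower()
    if mode == "test":
        return TEST_IP
    if mode == "recovery":
        return RECOVERY_IP
    # Refuse to guess when the mode is missing or unknown.
    raise ValueError(f"unexpected recovery mode: {mode!r}")
```

Raising on an unknown mode is deliberate: silently defaulting to either address could leave a recovered VM unreachable or put a test VM on a production address.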
- Can I have multiple SRAs communicating with the same array?
Yes. In general, SRAs always see and report all replicated devices they see in the array. The list they return is then narrowed further:
- Some SRAs have the capability to filter devices based on some prefix. In this case, the SRA simply does not even return to SRM any device that starts with “foo*”, for example. The prefix is typically specified in the connection parameters when creating an Array Manager entry in SRM.
- SRM performs device matching only between the two array managers it knows.
- For a given replicated device pair, SRM will find a matching device in VC, for example, a datastore or an RDM. Replicated devices that are not visible to VC will be ignored.
- For a given replicated device that has a matching datastore/RDM, SRM only cares about it if it is protected. For datastores, it has to be added to a protection group explicitly. For RDMs – the corresponding VM must reside on a replicated datastore which is a part of a protection group and this VM needs to be protected (SRM will put warnings if it is not).
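The narrowing described in the list above can be pictured as a chain of filters. This is a minimal sketch with plain strings standing in for devices; the function and parameter names are hypothetical, and the step that matches devices between the two array managers is omitted for brevity.

```python
def devices_srm_acts_on(reported, excluded_prefix, visible_in_vc, protected):
    """Apply the filtering steps in order and return the surviving devices.

    reported: device names the SRA found on the array.
    excluded_prefix: prefix configured on the array manager ("" for none).
    visible_in_vc: set of devices that match a datastore or RDM in vCenter.
    protected: set of devices that belong to a protection group.
    """
    # Step 1: the SRA drops devices whose names start with the prefix.
    after_prefix = [
        d for d in reported
        if not (excluded_prefix and d.startswith(excluded_prefix))
    ]
    # Step 2: devices not visible to vCenter are ignored.
    matched = [d for d in after_prefix if d in visible_in_vc]
    # Step 3: only protected devices matter to SRM.
    return [d for d in matched if d in protected]
```

Two SRAs pointed at the same array with different prefixes would each see a different slice of the device list under this model.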
- Can I run an SRM test with the site-to-site links disconnected?
Running a recovery plan test with the inter-site links disconnected is supported with vSphere Replication and with some SRAs. Check with your SRA vendor to confirm whether they support it.
- Can I use SRM to meet my RTO of “x”?
Calculating a probable recovery time objective (RTO) is not simple as there are many variables that are unique to each customer's implementation. These are just guidelines as the only way to get a real feel for the recovery time is to try it in your environment. Running a test failover will also give you an idea of how long your recovery will take. Keep in mind that an actual failover or planned migration will be the most accurate.
So what influences the possible failover time? Here are a few of the major factors:
- Number of protected VMs
- Number of datastores being recovered (more relevant for array-based replication; fewer datastores/LUNs means fewer mount/unmount operations)
- The layout of datastores on the array (with array-based replication, grouping datastores into “device” or “consistency” groups gives SRM fewer objects to process)
- Number of hosts at the recovery site (the number of ESXi hosts is an optimization axis: the more hosts you have, the faster the VMs recover; assume one host can power on well in excess of 100 VMs per minute if needed and if resources are available)
- Resource utilization of the recovery site hosts (if hosts are running other workloads, this needs to be factored in)
- Post-power-on guest customization steps (if used, such as network changes; each VM with IP customization enabled reboots twice, so add that time to the total)
The factors differ depending on whether you use array-based replication or vSphere Replication. vSphere Replication is software-based, so there are no storage devices to remount: during recovery, the VM configuration is simply reloaded and pointed to the replicated VMDKs. With vSphere Replication there are therefore no datastore mount steps to perform at the recovery site, so in theory recovery is quicker than with, say, array-based replication over FC. Keep in mind that this matters only if vSphere Replication fits your needs.
- What are the ports used by SRM?
This is dependent on the version of SRM. See below for details by version:
- I want to protect “x” software with SRM, will it work?
If your VM runs on an OS that is supported on vSphere and it can be powered off and back on without issue, then it can be recovered by SRM. From the VM's perspective, that is all that happens when it is failed over or recovered by SRM. It powers off (either crashing in the event of a disaster or shut down via VMtools in a planned migration) and then powers back on at the recovery site. Everything else related to replicating storage, placeholder VMs, etc. is invisible to the VM's OS.
This isn’t to say that all VMs can be successfully recovered with SRM, just that SRM itself imposes no specific restrictions that would prevent it from working with a particular application.
- What is vSphere Replication?
vSphere Replication is the industry’s first hypervisor-based replication, purpose-built for vSphere and Site Recovery Manager. vSphere Replication enables replication between sites at an individual virtual machine level and is managed directly in vCenter Server. With vSphere Replication, customers can deploy heterogeneous storage arrays across sites, reducing costs by using lower-end storage at the failover site.
- What RPO can I expect with vSphere Replication?
With vSphere Replication, users can select the replication schedule for each individual virtual machine. The RPO can be selected from a range of 15 minutes to 24 hours.
- Are there any additional restrictions for using vSphere Replication?
vSphere Replication cannot be used in conjunction with VMs that are not powered on, vSphere Fault Tolerance, Virtual Machine templates, linked clones, and physical RDMs.
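The restrictions above lend themselves to a simple pre-check. The sketch below assumes VM properties have already been collected into a small record. The `VmInfo` class, its field names, and the function are hypothetical illustrations and do not correspond to any vSphere API object.

```python
from dataclasses import dataclass


@dataclass
class VmInfo:
    """Hypothetical summary of the VM properties relevant to the restrictions."""
    name: str
    powered_on: bool = True
    fault_tolerance: bool = False
    is_template: bool = False
    is_linked_clone: bool = False
    has_physical_rdm: bool = False


def replication_blockers(vm):
    """Return the reasons, if any, why the VM cannot use vSphere Replication."""
    checks = [
        (not vm.powered_on, "not powered on"),
        (vm.fault_tolerance, "vSphere Fault Tolerance"),
        (vm.is_template, "virtual machine template"),
        (vm.is_linked_clone, "linked clone"),
        (vm.has_physical_rdm, "physical RDM"),
    ]
    return [reason for failed, reason in checks if failed]
```

An empty list means none of the listed restrictions apply to that VM.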
- Does Site Recovery Manager support discrete, asynchronous or synchronous replication?
Site Recovery Manager can support discrete, asynchronous and synchronous replication. See the Storage Partner Compatibility Matrix to determine which storage replication adapters support which types of replication for a specific array.
- What is the purpose of the storage replication adapters?
The storage replication adapters translate generic commands generated by Site Recovery Manager, for tasks such as querying replicated datastores and promoting replicated datastores, into array-specific commands. They enable Site Recovery Manager to work with a variety of array types.
- Where can I find the current list of replication adapters and supported replication for Site Recovery Manager?
The Storage Partner Compatibility Matrix includes a list of storage replication adapters that have passed VMware certification for Site Recovery Manager, as well as the storage arrays and replication with which they are supported.
- Will new storage replication adapters be available in the future, and will they require a new release of Site Recovery Manager?
VMware continues to work with additional storage partners to help them develop new adapters for Site Recovery Manager. New adapters can be added and used at any time without requiring a new release of Site Recovery Manager. If you are interested in using Site Recovery Manager with replication solutions that are not currently supported, contact your storage vendor. Also, let VMware know about your request by contacting your VMware representative or reseller.
- Does the storage replication adapter need to be installed on the same server as Site Recovery Manager?
- Can multiple storage replication adapters be used with Site Recovery Manager simultaneously?
Yes. Multiple replication adapters can be installed in Site Recovery Manager to enable it to communicate with multiple arrays simultaneously. Keep in mind that a VM with VMDKs stored on multiple arrays cannot be protected with SRM. All VM files must be located on the same array.
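The same-array requirement can be checked with a lookup from datastore to array. This is a minimal sketch that assumes you already have that mapping available as a dictionary; the names used here are hypothetical.

```python
def arrays_used(vm_datastores, datastore_to_array):
    """Return the set of arrays backing the datastores that hold a VM's files."""
    return {datastore_to_array[ds] for ds in vm_datastores}


def can_protect(vm_datastores, datastore_to_array):
    """True only if all of the VM's files live on a single array."""
    return len(arrays_used(vm_datastores, datastore_to_array)) == 1
```

A VM whose disks sit on datastores from two different arrays fails the check, matching the limitation described above.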
- Is Site Recovery Manager compatible with storage virtualization solutions?
Site Recovery Manager is designed to work with all devices that present themselves as storage targets and can replicate their underlying storage. Many storage virtualization solutions can operate in this manner. For Site Recovery Manager to work with a given storage virtualization device, a storage replication adapter must be available for that device. The Storage Partner Compatibility Matrix includes a complete list of supported storage virtualization solutions.
- Does Site Recovery Manager support NFS arrays?
Yes. Site Recovery Manager supports NFS storage and replication.
- Does Site Recovery Manager monitor the status of replication?
Site Recovery Manager monitors the replication configuration to detect when replication is turned off for a datastore containing protected virtual machines, so that it can notify administrators.
- Does Site Recovery Manager support using consistency groups in the replication configuration?
Site Recovery Manager takes consistency groups into account, although support varies depending on storage vendor. Consult the storage vendor storage replication adapter readme for details.
- How does Site Recovery Manager run a test without actually failing-over storage?
The answer depends on the capabilities of the array. For some arrays, the storage replication adapter takes a snapshot or clone of the datastore replica and presents it to the vSphere ESX hosts to use for testing. For other arrays, it halts replication temporarily to do testing.
- Can we write our own storage replication adapter?
VMware supports configurations that use storage replication adapters written by storage partners only. Storage partners who wish to write a new adapter should contact VMware directly.
- Who provides support for Site Recovery Manager deployments?
Problems that appear to be caused by Site Recovery Manager should be directed to VMware support. Problems that appear to be caused by the replication software, storage replication adapter or storage array should be directed to the appropriate storage partner’s support services. VMware and the vendors who provide replication adapters have cooperative support agreements in place to ensure that support requests can be coordinated between VMware and the storage partner.
- Where should we ask additional questions about whether Site Recovery Manager works with software from storage partners?
VMware publishes a list of currently supported storage and replication in the Storage Partner Compatibility Matrix. Remaining questions should be directed to the appropriate storage vendor.
- Where can I find a list of recommendations or best practices for Site Recovery Manager?
There is a white paper located here that covers Site Recovery Manager best practices with regard to performance and reducing RTO.
- What applications can I protect with Site Recovery Manager?
Any application that is supported on vSphere is supported for protection with Site Recovery Manager. That said, there are some things that shouldn't be protected with Site Recovery Manager. This blog has the details.