Cluster Conversion to the I4i Instance Type
With the launch of the i4i instance type for VMware Cloud on AWS, customers have asked how they can benefit from the increased compute resources of the i4i instance. Recent performance studies have shown significant performance gains using i4i because of its memory footprint and its use of the latest generation of Intel Xeon processors.
The full specifications of the i4i instance type, compared to the i3(en) instance, are found here. Customers can request to convert their software-defined data centers (SDDCs) in VMware Cloud on AWS from i3 to i3en or the new i4i instance type. VMware moves your workloads from one hardware generation to the next without any downtime, because the workloads are live-migrated using vMotion. VMware provides this conversion service at no extra cost.
Host subscriptions are still applicable, so customers should have flexible subscriptions or create a new subscription for the i4i instance. Customers can contact their VMware representative to request a cluster conversion and to get help with potential subscription questions. VMware will propose a conversion window, and customers have the opportunity to approve this timeslot.
It is important to understand that the cluster conversion does not change any purchased subscriptions.
To move from i3(en) to i4i nodes, your VMware Cloud on AWS environment must meet a minimum SDDC version. The SDDC must run at least version 1.18v8 for VMware to be able to convert its clusters. In the example below, a screengrab from the Cloud Console, I'm still running SDDC version 1.18v5.
This means this SDDC is not yet eligible for cluster conversion to i4i. The easy solution here is to have the SDDC updated to 1.18v8. This will happen automatically at some point, as VMware takes care of lifecycle management for VMware Cloud on AWS. However, the update can also be triggered to accommodate a cluster conversion. Talk to your VMware representative so that updating to an SDDC version suitable for cluster conversion can be scheduled upfront, or look into the self-service option as described here.
It is also good to know that both single AZ SDDCs and stretched clusters (multi-AZ) are supported for cluster conversion.
Customers running a newly deployed SDDC on a 1.20 version of VMware Cloud on AWS need to be updated to 1.20v2 before cluster conversion to i4i is supported.
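The version requirements above can be expressed as a small eligibility check. This is only a sketch: the function names and the version-string parsing are my own assumptions, not part of any VMware API, and a version without a "v" suffix is assumed to mean patch level 0. Release lines that are not mentioned in this article return None instead of a guess.

```python
import re
from typing import Optional, Tuple

# Minimum patch level per release line, as stated in the article.
# Release lines the article does not mention are deliberately absent.
MIN_PATCH = {(1, 18): 8, (1, 20): 2}


def parse_sddc_version(version: str) -> Tuple[int, int, int]:
    """Parse a string such as '1.18v5' into (major, minor, patch).

    Assumption: a missing 'vN' suffix is treated as patch level 0.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:v(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized SDDC version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_eligible_for_i4i(version: str) -> Optional[bool]:
    """Return True/False for known release lines, None when unknown."""
    major, minor, patch = parse_sddc_version(version)
    minimum = MIN_PATCH.get((major, minor))
    if minimum is None:
        return None  # the article gives no rule for this release line
    return patch >= minimum
```

With this sketch, the SDDC from the example (1.18v5) is reported as not eligible, while 1.18v8 and 1.20v2 are.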
While VMware triggers an additional backup of the management plane as part of the conversion process, as explained later in this article, it is highly recommended that customers do the same for the workloads and data running on the cluster that is to be converted.
Once a cluster conversion is scheduled, the process in the background kicks in. The goal here is not to disrupt the performance or availability of customer workloads during the cluster conversion.
This is an SRE-triggered process, but the heavy lifting is done by the Rollout Lifecycle Management (RLCM) and the Release Coordination Engine (RCE) capabilities, both part of the underlying VMware Cloud on AWS service.
RLCM is an application that enables scheduling SDDC upgrades at scale. It automates scheduling end to end: creating a rollout, segregating SDDCs into waves based on various VMware Cloud on AWS and/or customer-defined constraints, and creating upgrade deployments in RCE. RCE is a VMware Cloud service that orchestrates maintenance-related workflows for all SDDCs in the fleet, like SDDC upgrades, migrations, and conversions. It can schedule and orchestrate workflows for thousands of SDDCs at scale.
Once a cluster conversion is scheduled, the customer will be notified via e-mail. To ensure the cluster conversion succeeds, pre-checks are run in multiple iterations. The first pre-check run takes place 72 hours before the actual conversion. Various health and capacity checks are run to ensure the clusters are in good shape without any warnings or errors. The pre-checks run four times before the conversion starts. If an error or warning is raised at any point, the SRE is notified so it can be remediated. If required, the SRE may contact the customer to assist with the remediation.
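To illustrate the gating logic of a single pre-check run, here is a minimal sketch. The internal implementation is not public, so everything here is hypothetical: the CheckResult type, the status values, and the notify_sre callback are names I made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    name: str          # hypothetical check name, e.g. "capacity"
    status: str        # "ok", "warning", or "error"
    detail: str = ""


def run_prechecks(
    checks: List[Callable[[], CheckResult]],
    notify_sre: Callable[[List[CheckResult]], None],
) -> bool:
    """Run every check once; alert the SRE if anything is not 'ok'.

    Returns True only when all checks pass without warnings or errors.
    """
    results = [check() for check in checks]
    findings = [r for r in results if r.status != "ok"]
    if findings:
        notify_sre(findings)  # warnings and errors both reach the SRE
    return not findings
```

In this sketch, a warning blocks a clean result just like an error does, which matches the description above that the clusters should be free of both.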
When everything checks out, the conversion task is started by the Autoscaler logic. There are three phases: pre-conversion, the conversion itself, and post-conversion tasks.
While the Autoscaler logic works through the conversion phases, the task progression is carefully monitored. If something goes wrong, an SRE is alerted so they can troubleshoot if necessary.
All the necessary steps are taken for a successful conversion, making sure it is nondisruptive for customers. This includes disabling the vSAN de-duplication feature, which is enabled by default on i3 nodes but not on i3en and i4i instances. All vCenter Server alarms are silenced, and the customer is notified that the conversion process has begun.
The last thing to do before beginning the actual conversion is to temporarily disable the EDRS scale-in mechanism. We don't want EDRS to start scale-in operations during the conversion process, as more hosts are added as part of the conversion. Next, at least two i4i hosts are added to the cluster, and both default NSX Edge virtual appliances are live-migrated to the new hosts by vMotion. If customers have additional NSX Edge appliances, these will be migrated first.
The new hosts added to the cluster as part of the conversion process are 'non-billable'. This means that there won't be double the cost during the conversion because i4i hosts are added before the older i3(en) hosts are removed. Until the cluster conversion is complete, all hosts are billed at the original host rate. After the conversion is complete, billing switches to the target host rate.
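The billing rule can be shown with a bit of arithmetic. This is a sketch under assumptions: the function name and the rates are hypothetical, and I assume the billable host count stays at the cluster's nominal size, because the temporarily added hosts are non-billable.

```python
from decimal import Decimal


def hourly_cluster_charge(
    billable_hosts: int,
    original_rate: Decimal,
    target_rate: Decimal,
    conversion_complete: bool,
) -> Decimal:
    """Charge for one hour, excluding temporary non-billable hosts.

    Until the conversion completes, the original host rate applies;
    afterwards the target host rate applies.
    """
    if billable_hosts < 0:
        raise ValueError("billable_hosts must not be negative")
    rate = target_rate if conversion_complete else original_rate
    return billable_hosts * rate
```

For a four-host cluster with made-up rates of 10 and 12 per host-hour, the sketch returns 40 while the conversion is in progress and 48 once it is complete, regardless of how many extra hosts are temporarily present.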
Before adding and migrating to new i4i hosts, the Backup and Restore Service (BRS) API is invoked to create an asynchronous backup of the SDDC management plane. BRS is an internal VMware Cloud service that is responsible for taking regular backups of the management plane (vCenter Server, NSX, etc.) of the entire SDDC fleet and providing interfaces to restore the SDDC to a backup in the case of an unrecoverable failure.
The next step is to add new i4i hosts, migrate workloads to the new hosts using vMotion, and swap out the old i3(en) hosts. This is a continuous cycle until we reach host convergence and all old hosts are replaced by i4i hosts. When done, EDRS is enabled again.
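The add, migrate, and remove cycle described above can be sketched as a simple plan generator. This is not the Autoscaler implementation: the step names, the generated host names, and the batch size of two are assumptions for illustration only.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]


def plan_rolling_conversion(old_hosts: List[str], batch_size: int = 2) -> List[Step]:
    """Return the ordered steps that replace every old host with a new one.

    Assumption: hosts are swapped in batches of `batch_size`; the real
    batch sizes used by the service are not documented in this article.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    steps: List[Step] = [("disable_edrs_scale_in", [])]
    remaining = list(old_hosts)
    created = 0
    while remaining:  # loop until host convergence
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        new_hosts = [f"i4i-{created + i + 1}" for i in range(len(batch))]
        created += len(batch)
        steps.append(("add_hosts", new_hosts))          # new capacity first
        steps.append(("vmotion_workloads_off", batch))  # live-migrate VMs
        steps.append(("remove_hosts", batch))           # then retire old hosts
    steps.append(("enable_edrs", []))
    return steps
```

Calling plan_rolling_conversion(["i3-1", "i3-2", "i3-3"]) yields a plan that adds two new hosts before any old host is removed, which is why the cluster never runs below its original capacity during the swap.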
In the post-conversion process, the vSAN Disk Format Conversion (DFC) flag is reset, the alarms in the vCenter Server are enabled again, and the cluster instance type is updated. Once this is finished, the customer is notified that the conversion task has completed successfully.
Impact on Operation
As also listed here on the docs.vmware.com website, there is no downtime for workload virtual machines or management appliances during the conversion process. However, you are unable to perform the following operations during cluster conversion:
- Removing hosts
- Editing EDRS policy settings
During cluster conversion, do not perform the following actions on the cluster which is being converted:
- Do not perform hot or cold workload migrations to or from the cluster being converted.
- Do not perform workload provisioning (New/Clone VM).
- Do not make changes to Storage-based Policy Management settings for workload VMs.
- Avoid starting HCX migrations that might overlap with the conversion window.
- Avoid the following DRaaS activity on the cluster being converted:
  - Create or destroy site pairings
  - Execute recovery plan
  - Planned migration
  - Test failover or test cleanup
  - Real failover
  - Replication management operations, such as configuring or stopping replication
- Do not add or remove hosts from the cluster being converted.
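A quick way to avoid scheduling something like an HCX migration into the conversion window is a plain interval-overlap check. This is a generic sketch: the function name is mine, and the timestamps are whatever your own planning tooling provides. It does not query HCX or the Cloud Console.

```python
from datetime import datetime


def windows_overlap(
    start_a: datetime, end_a: datetime,
    start_b: datetime, end_b: datetime,
) -> bool:
    """Return True when the two time windows share any moment.

    Windows are treated as half-open, so one that ends exactly when
    the other starts does not count as overlapping.
    """
    if start_a > end_a or start_b > end_b:
        raise ValueError("a window must not end before it starts")
    return start_a < end_b and start_b < end_a
```

For example, a migration planned from 20:00 to 23:00 conflicts with a conversion window that starts at 22:00 on the same day, so it should be moved.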
Do note that compute policy tags are not copied over during cluster conversion. You will need to attach host policy tags after the conversion is complete.
The process is designed to be as easy and non-disruptive for the customer as the lifecycle management of the VMware Cloud on AWS service itself. If customers are logged into the vSphere Client, they will see hosts being added and removed, plus vMotion operations to move workloads to the newly added hosts.
If you have any questions about your SDDCs or want to convert i3(en) clusters to i4i clusters, reach out to your VMware representative.