VMware vSAN for Azure VMware Solution

Overview

VMware vSAN is the default storage platform for the Azure VMware Solution (AVS) private cloud. Local storage from each host within a vSphere cluster is pooled together to create a single, cluster-wide vSAN datastore. Virtual machine files are distributed across hosts to ensure availability. vSAN is provisioned automatically during private cloud deployment, or when a new cluster is added to an existing private cloud, and is managed from the vSphere Client. It integrates with core vSphere features including vMotion, HA, and DRS. vSAN delivers enterprise-class features, scale, and performance, making it the ideal storage platform for VMs.

At the time this document was published, AVS private cloud clusters were using vSAN 7 Update 3c Enterprise with on-disk format v10. vSAN Enterprise licensing is included with the service.

vSAN Clusters

Default Configuration

Microsoft manages the vSAN cluster configuration, and customers do not have the ability to modify the configuration.

Enabled by default:

  • Space efficiency services (deduplication and compression)
  • Data-at-rest encryption
    • Provider managed, encryption keys are created at the time of deployment and stored in Azure Key Vault.
    • If a host is removed from a cluster, the data on the local devices is invalidated immediately.
    • Customers have the option to manage their own keys after deployment.

Disabled by default:

  • Operations reserve
  • Host rebuild reserve
  • vSAN iSCSI Target Service
  • File Service

Capacity

AVS offers three different host types. In addition to having different CPU and RAM specs, they also have different storage footprints. vSAN datastore capacity is increased by adding hosts to the cluster.

[Figure: AVS host types and their storage footprints]

The total amount of storage per host is shown as raw capacity. For example, a 3-host AV36 cluster provides 46.2 TB of raw capacity. The total usable capacity varies with factors such as RAID level, Failures to Tolerate (FTT), compression, deduplication, thin versus thick provisioning, and slack space.
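As a rough illustration of the raw capacity arithmetic, the sketch below derives a per-host figure from the 3-host AV36 example (46.2 TB / 3 = 15.4 TB) and scales it by host count. The function and constant names are hypothetical, and the result is raw capacity only. Usable capacity still depends on RAID, FTT, space efficiency, provisioning, and slack space.

```python
# Per-host raw capacity derived from the example in the text:
# a 3-host AV36 cluster provides 46.2 TB raw, so 46.2 / 3 = 15.4 TB per host.
AV36_RAW_TB_PER_HOST = 46.2 / 3


def raw_capacity_tb(host_count: int,
                    raw_tb_per_host: float = AV36_RAW_TB_PER_HOST) -> float:
    """Return the raw vSAN datastore capacity in TB for a cluster."""
    if host_count < 3:
        # 3 hosts is the smallest cluster size discussed in this guide.
        raise ValueError("host_count must be at least 3")
    return host_count * raw_tb_per_host


if __name__ == "__main__":
    for hosts in (3, 4, 6):
        print(f"{hosts} hosts: {raw_capacity_tb(hosts):.1f} TB raw")
```

Other host types would need their own per-host value passed as `raw_tb_per_host`.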

Slack Space

Slack space is the amount of storage space recommended to keep available for vSAN operational tasks and host failures/rebuilds. Not only will vCenter alarms be triggered, but Microsoft will also provide alerts and metrics via Azure Monitor when consumption of the vSAN datastore is greater than 75%. Microsoft requires customers to maintain 25% slack space to guarantee the AVS SLA.
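The 75% threshold can be sketched as simple arithmetic. The example below is an assumption-laden illustration, not the service's alerting logic: it treats the datastore capacity as the denominator, and all function names are hypothetical. It checks whether consumption exceeds 75% and estimates the smallest host count that keeps consumption at or below that level.

```python
import math

SLACK_FRACTION = 0.25                 # slack space required for the AVS SLA (from the text)
ALERT_THRESHOLD = 1 - SLACK_FRACTION  # alerts are raised above 75% consumption


def utilization(used_tb: float, capacity_tb: float) -> float:
    """Return consumed capacity as a fraction of total datastore capacity."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return used_tb / capacity_tb


def breaches_slack(used_tb: float, capacity_tb: float) -> bool:
    """True when consumption is greater than the 75% threshold."""
    return utilization(used_tb, capacity_tb) > ALERT_THRESHOLD


def min_hosts_for_slack(used_tb: float, capacity_tb_per_host: float,
                        min_hosts: int = 3) -> int:
    """Smallest host count that keeps consumption at or below 75%."""
    needed = math.ceil(used_tb / (capacity_tb_per_host * ALERT_THRESHOLD))
    return max(min_hosts, needed)
```

For instance, with a hypothetical 15.4 TB per host, `min_hosts_for_slack(40, 15.4)` suggests that 40 TB of consumed capacity needs a fourth host to stay within the slack requirement.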

Storage Policies

VMware vSAN leverages storage policy based management (SPBM) for VM placement, availability, and performance. Storage policies can be applied to multiple VMs, single VMs, and even single VMDKs. This allows you to apply specific attributes to multiple VM objects and disks. For example, we can apply RAID-1 mirroring to the VM as a whole, but apply RAID-5 erasure coding to a specific disk (VMDK) of the VM.
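To make the VM-level versus disk-level distinction concrete, here is a minimal data-structure sketch. It does not use any vSphere API. The class, the field names, and the policy labels are all hypothetical and only model the idea that a disk uses its own policy when one is assigned and otherwise falls back to the VM's policy.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VirtualMachine:
    """Toy model of a VM with a VM-wide policy and per-VMDK overrides."""
    name: str
    vm_policy: str                                   # policy applied to the VM as a whole
    disk_policies: Dict[str, str] = field(default_factory=dict)  # per-disk overrides

    def effective_policy(self, disk: str) -> str:
        """Return the disk's own policy if set, else the VM-wide policy."""
        return self.disk_policies.get(disk, self.vm_policy)


if __name__ == "__main__":
    vm = VirtualMachine(
        name="app01",
        vm_policy="RAID-1 FTT-1",
        disk_policies={"data.vmdk": "RAID-5 FTT-1"},
    )
    for disk in ("os.vmdk", "data.vmdk"):
        print(disk, "->", vm.effective_policy(disk))
```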

VMware and Microsoft have built a number of pre-defined storage policies for customer consumption. These policies cannot be modified or deleted, but new policies can be created. The pre-defined storage policies are well thought out and should meet the needs of most use cases.

Management Storage Policy

The default policy for all of the SDDC management VMs (including the vCenter Server Appliance, NSX-T Manager, and NSX-T Edges) is labeled Microsoft vSAN Management Storage Policy and has the following configuration:

[Figure: Microsoft vSAN Management Storage Policy configuration]

Default Storage Policy

If you’re familiar with vSAN, you know there is a vSAN Default Storage Policy. This policy does exist in AVS, but it is NOT the default storage policy applied to the cluster. It exists for historical purposes only, which is where confusion may set in: it is a policy named vSAN Default Storage Policy, not the policy the cluster uses by default.

Upon closer inspection you’ll notice that the Object space reservation is set to thick provisioning.

[Figure: vSAN Default Storage Policy rules, with Object space reservation set to thick provisioning]

The actual default storage policy setting for this vSAN cluster is set to RAID-1 FTT-1, with Object space reservation set to Thin provisioning.

 

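[Figure: cluster default storage policy rules, with Object space reservation set to thin provisioning]

The only difference between the two policies is the Object space reservation: the vSAN Default Storage Policy uses thick provisioning, while the cluster's actual default storage policy uses thin provisioning.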

In a 3-host cluster, the default policy enables RAID-1 mirroring and protects VMs against a single host failure. However, it also requires double the storage per virtual machine.
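The storage cost of a policy can be estimated with a multiplier. In the sketch below, only the RAID-1 FTT-1 value (2x) comes from this guide. The other multipliers are commonly cited vSAN figures (RAID-1 keeps FTT + 1 full copies, RAID-5 uses 3 data + 1 parity, RAID-6 uses 4 data + 2 parity) and should be treated as assumptions here. The names are hypothetical, and savings from deduplication and compression are ignored.

```python
from fractions import Fraction

# Raw-capacity multipliers per (RAID level, FTT), before space efficiency.
# Only ("RAID-1", 1) is stated in this guide; the rest are assumptions
# based on commonly cited vSAN figures.
OVERHEAD = {
    ("RAID-1", 1): Fraction(2),     # two full copies
    ("RAID-5", 1): Fraction(4, 3),  # 3 data + 1 parity
    ("RAID-1", 2): Fraction(3),     # three full copies
    ("RAID-6", 2): Fraction(3, 2),  # 4 data + 2 parity
    ("RAID-1", 3): Fraction(4),     # four full copies
}


def raw_consumed_gb(vm_size_gb: float, raid: str, ftt: int) -> float:
    """Estimate raw datastore capacity consumed by a VM of the given size."""
    return float(vm_size_gb * OVERHEAD[(raid, ftt)])


if __name__ == "__main__":
    for (raid, ftt) in OVERHEAD:
        print(f"{raid} FTT-{ftt}: {raw_consumed_gb(100, raid, ftt):.1f} GB raw per 100 GB")
```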

Configuring Storage Policies

As hosts are added to the cluster, it’s recommended to change both the default storage policy and VM disk policies to ensure appropriate capacity utilization, availability, and performance for each virtual machine based on the number of hosts provisioned.

It’s important to understand that storage policies are applied during initial VM deployment, or during specific VM operations – cloning or migrating. The default storage policy cannot be changed on a VM after it’s deployed, but the storage policy can be changed per disk.

In most cases, customers should use RAID-5 or RAID-6 with FTT set to 1 or 2, depending on their cluster size. If you start off with a 3-host cluster, the policy will default to RAID-1 FTT-1. If you know that you’ll expand the cluster in the near future, I recommend deploying with 4-6 hosts from the start. The table below shows the different RAID configurations with FTT and the minimum required hosts for each.

Note: A storage policy for RAID-0 is not available out of the box, but can be created. Any storage policy created and used with no data redundancy options is not covered under the Microsoft SLA.

RAID               FTT   Minimum Hosts
RAID-0             0     3
RAID-1 (Default)   1     3
RAID-5             1     4
RAID-1             2     5
RAID-6             2     6
RAID-1             3     7
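The table above can be expressed as a small lookup. The sketch below is illustrative only, with hypothetical names. It encodes the table rows and returns the RAID and FTT combinations whose minimum host requirement is met by a given cluster size.

```python
from typing import List, Tuple

# (RAID level, FTT, minimum hosts), copied from the table above.
POLICY_MIN_HOSTS: List[Tuple[str, int, int]] = [
    ("RAID-0", 0, 3),
    ("RAID-1", 1, 3),  # default
    ("RAID-5", 1, 4),
    ("RAID-1", 2, 5),
    ("RAID-6", 2, 6),
    ("RAID-1", 3, 7),
]


def supported_policies(host_count: int) -> List[Tuple[str, int]]:
    """Return the (RAID, FTT) pairs a cluster of this size can satisfy."""
    return [(raid, ftt)
            for raid, ftt, min_hosts in POLICY_MIN_HOSTS
            if host_count >= min_hosts]


if __name__ == "__main__":
    for hosts in (3, 4, 6):
        print(hosts, "hosts:", supported_policies(hosts))
```

Keep in mind that RAID-0 appears in the table but, as noted above, policies without data redundancy are not covered under the Microsoft SLA.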

Run Commands

Azure Run commands allow admins to perform tasks via the Azure portal that they don’t have privileges to perform directly in the vSphere Client. As it relates to storage policies, there are seven Run commands.

  • Get-StoragePolicies
    • Lists all the vSAN storage policies available to apply to a VM.
  • New-AVSStoragePolicy
    • Creates a new, or overwrites an existing, storage policy.
  • Remove-AVSStoragePolicy
    • Removes a storage policy.
  • Set-ClusterDefaultStoragePolicy
    • Allows an admin to set the default storage policy for a cluster.
  • Set-LocationStoragePolicy
    • Allows an admin to modify the storage policy for all VMs in a specific cluster, resource pool, or folder.
  • Set-VMStoragePolicy
    • Allows an admin to modify the storage policy on VMs whose names match a pattern (example: “Web*”).
  • Set-vSANCompressDedupe
    • Sets compression and deduplication on vSAN cluster(s).
    • Deduplication and compression are already enabled on the vSAN cluster by default. The admin can choose to disable both, or just disable deduplication, leaving only compression enabled. If deduplication is enabled, compression is always enabled along with it.
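The valid combinations described for Set-vSANCompressDedupe can be sketched as a small validation function. This is not the Run command itself: the function and its parameter names are hypothetical, and the code only models the rule that deduplication cannot be enabled without compression.

```python
def space_efficiency_mode(deduplication: bool, compression: bool) -> str:
    """Describe the resulting mode, rejecting the one invalid combination."""
    if deduplication and not compression:
        # Per the text, enabling deduplication always enables compression too.
        raise ValueError("deduplication cannot be enabled without compression")
    if deduplication:
        return "deduplication and compression"
    if compression:
        return "compression only"
    return "disabled"


if __name__ == "__main__":
    print(space_efficiency_mode(True, True))    # the AVS default per the text
    print(space_efficiency_mode(False, True))
    print(space_efficiency_mode(False, False))
```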

To perform these tasks, log in to the Azure portal and navigate to your AVS private cloud. Select Run command > Microsoft.AVS.Management (under Packages).

[Figure: Run command packages for the AVS private cloud in the Azure portal]

Monitoring and Alerts

Shared Responsibility

Customers are responsible for applying appropriate storage policies to their VMs, and adding hosts to maintain adequate vSAN slack space.

Microsoft is responsible for:

  • Compute / Network / Storage
    • Rack, power, bare metal hosts, and devices
  • SDDC Lifecycle
    • ESXi deployment, patching and upgrades
    • vCenter Server deployment, configuration, patching, and upgrades
    • vSAN deployment, configuration, patching, and upgrades
  • SDDC Compute
    • Cluster configuration
    • Virtual networking for vMotion, Management, vSAN, etc.
  • SDDC Health
    • Monitoring and corrective actions
    • Replacement of failed hosts

For more information pertaining to responsibility, please refer to the AVS Shared Responsibility Model.

Real-time Health Status

The vSAN health service is used to monitor the health status of the cluster. Here you can check the status of cluster components, capacity, performance and more.

[Figures: vSAN health service status views]

Azure Alerts

It’s recommended that customers configure Azure health alerts to receive notifications of triggered events that administrators can define. Refer to the Microsoft documentation for manually configuring Azure Action Groups, Azure Alerts, and Azure Monitor Metrics, or you can use the Microsoft Enterprise-Scale for AVS ARM templates for Monitoring to automate much of the configuration.

In the video below, we demonstrate how to use the AVS ARM templates to set up basic monitoring of the Azure VMware Solution private cloud.

Customers are encouraged to leverage vRealize Operations for more in-depth metrics, alerting, and capacity planning. In the video below, we demonstrate creating an Azure application registration and using that to add an Azure VMware Solution private cloud as an endpoint in vRealize Operations.

Summary and Additional Resources

Additional Resources

For more information about vSAN, you can explore the following resources:

Changelog

The following updates were made to this guide:

Date         Description of Changes
2023/05/23   Added RAID-0 to Storage Policies.
2023/05/01   Guide was published.

About the Author and Contributors

  • Jeremiah Megie, Principal Cloud Solutions Architect, Cloud Services, VMware
