A successful Day 2 Operations strategy requires a feedback loop fed by metrics relating to workload, operational, and platform health. VMware and Azure each have solutions for collecting and surfacing that data.
Azure native services can be used to monitor and manage Azure VMware Solution VMs. These include:
- Log Analytics to store and query log data
- Azure Security Center to provide a unified infrastructure security management system across hybrid workloads in the cloud or on premises
- Azure Monitor to monitor guest operating system performance and discover and map application dependencies for Azure VMware Solution or on-premises VMs.
- Azure Arc to extend Azure management to any infrastructure, including Azure VMware Solution, on-premises, or other cloud platforms. A more integrated Arc for Azure VMware Solution experience is currently in preview.
- Azure Update Management to manage operating system updates for your Windows and Linux machines in a hybrid environment.
VMware Aria Operations supports monitoring and management across private, hybrid and multi-cloud environments through a unified platform. VMware Aria Operations provides continuous performance insight and optimization, efficient capacity and cost management modeling, simplified compliance workflows, and a Troubleshooting Workbench for expedited root cause analysis.
Consider your broader cloud strategy when choosing which set of tooling to adopt. If Azure is your single preferred cloud platform and you expect to refactor workloads to integrate with Azure platform services, the Azure native management tools may be the best fit. If Azure is one cloud of many that you will support, VMware Aria Operations can provide a consistent experience across clouds.
The Very Least You Should Do
If you have not yet decided on a multi-cloud monitoring strategy before deploying an Azure VMware Solution private cloud, we recommend implementing some basic Azure platform monitoring features. These give you visibility of the most critical metrics and notify you of issues and events that might impact availability. These include:
- Create action groups that can be used to send email, SMS, push notifications, or voice calls to operations teams as appropriate
- Configure Azure Service Health to send alerts to action groups for service issues, planned maintenance, and other events that could impact Azure VMware Solution and other services
- Configure alerts in Azure Monitor to provide warnings if the cluster nears critical values for disk, CPU, or RAM usage. Remember that to qualify for service level agreements, AVS requires that 25% slack space remain available on vSAN datastores.
- Send AVS logs to Log Analytics
- Create an operational dashboard to surface metrics in an easily consumable and shareable way
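The 25% slack space requirement above can be sketched as a simple check. This is a minimal illustration only: the function names and the capacity figures are hypothetical, and in practice the usage value would come from an Azure Monitor metric rather than being supplied by hand.

```python
# Hypothetical helper names; only the 25% figure comes from the text above.
REQUIRED_SLACK_PERCENT = 25.0


def vsan_slack_percent(capacity_gb: float, used_gb: float) -> float:
    """Return the percentage of datastore capacity that is still free."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    if not 0 <= used_gb <= capacity_gb:
        raise ValueError("used_gb must be between 0 and capacity_gb")
    return (capacity_gb - used_gb) / capacity_gb * 100.0


def meets_slack_requirement(
    capacity_gb: float,
    used_gb: float,
    required_percent: float = REQUIRED_SLACK_PERCENT,
) -> bool:
    """True when at least the required slack space remains available."""
    return vsan_slack_percent(capacity_gb, used_gb) >= required_percent


if __name__ == "__main__":
    # Illustrative numbers, not taken from a real private cloud.
    for used in (600.0, 750.0, 800.0):
        ok = meets_slack_requirement(1000.0, used)
        print(f"used={used:.0f} GB of 1000 GB -> slack requirement met: {ok}")
```

Keeping 25% slack is equivalent to keeping datastore usage at or below 75%, so a disk usage alert set at that level lines up with the slack requirement.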
Let’s take a closer look at what that template deploys.
Azure Core Setup
The Azure core setup tab of the deployment template asks you to select the subscription and region your private cloud is deployed into. Note that this template requires that you have contributor access to the subscription selected.
You are also asked to provide a deployment prefix. Be sure to choose a unique, identifiable value. All resources created by the template will have this value and a dash prepended to built-in names. If you provide the value “AVS” resources will be created in the form of “AVS-SDDC-Dashboard.” A resource group will be created to host deployed resources, named DeploymentPrefix-Operational.
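The naming convention described above can be sketched as follows. This is an illustration of the pattern only, not the template's actual logic; the function names are hypothetical, and "SDDC-Dashboard" and "Operational" are taken from the examples in the text.

```python
def prefixed_name(deployment_prefix: str, builtin_name: str) -> str:
    """Prepend the deployment prefix and a dash to a built-in resource name."""
    prefix = deployment_prefix.strip()
    if not prefix:
        raise ValueError("deployment_prefix must not be empty")
    return f"{prefix}-{builtin_name}"


def resource_group_name(deployment_prefix: str) -> str:
    """Name of the resource group that hosts the deployed resources."""
    return prefixed_name(deployment_prefix, "Operational")


if __name__ == "__main__":
    print(prefixed_name("AVS", "SDDC-Dashboard"))
    print(resource_group_name("AVS"))
```

Because every resource shares the prefix, a short and distinctive value makes the deployed resources easy to find and filter later.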
Monitoring and Alerts
On the Monitoring deployment tab, we can select to deploy the Monitoring module. If we opt to do so, two sections appear—AVS Workbook and AVS Dashboard and Alerts.
The AVS Workbook option will deploy a workbook that includes a Dashboard, alerts, syslog, VM metrics, and quota information. Much of this workbook’s functionality requires Azure Arc for Azure VMware Solution, which is currently in Preview and only available in East US and West Europe. For this reason, we do not recommend deploying the AVS Workbook.
The AVS Dashboards and Alerts option will allow you to opt into enabling Service Health alert rules, AVS alert rules, notifications, and a Dashboard.
An Action group will be created to allow email notification to the email address you specify. Service Health alert rules will be configured for the region selected on the Azure core setup tab, for the event types Service issues, Planned maintenance, Health advisory, and Security advisory.
A set of AVS Alert rules will be configured:
- Severity 2 Warning alerts for CPU, Memory, and Storage usage at the thresholds you specify, with a default of greater than 60%.
- Severity 0 Critical alerts for average CPU usage above 80%, average memory usage above 80%, and disk usage above 75%.
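To make the two alert tiers concrete, the sketch below classifies a single usage reading against the thresholds listed above. It is a simplified model under stated assumptions: the names are hypothetical, and real Azure Monitor alert rules evaluate an aggregated metric over a time window rather than one sample.

```python
from typing import Optional

# Thresholds from the list above; the dictionary and function names are
# hypothetical and not part of the deployment template.
CRITICAL_THRESHOLDS = {"cpu": 80.0, "memory": 80.0, "disk": 75.0}
DEFAULT_WARNING_THRESHOLD = 60.0


def classify_usage(
    metric: str,
    usage_percent: float,
    warning_threshold: float = DEFAULT_WARNING_THRESHOLD,
) -> Optional[int]:
    """Return 0 (Critical), 2 (Warning), or None when no alert would fire."""
    if metric not in CRITICAL_THRESHOLDS:
        raise ValueError(f"unknown metric: {metric}")
    if usage_percent > CRITICAL_THRESHOLDS[metric]:
        return 0
    if usage_percent > warning_threshold:
        return 2
    return None


if __name__ == "__main__":
    # Illustrative readings only.
    for metric, value in (("cpu", 55.0), ("memory", 72.0), ("disk", 76.0)):
        print(metric, value, classify_usage(metric, value))
```

Note that the same reading can be a Warning for CPU or memory and already Critical for disk, because the disk threshold is lower at 75%.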
A dashboard will be created with panels displaying CPU, memory, and disk utilization over time.
If we choose Log Analytics, we can send AVS logs and subscription activity to a new or existing Log Analytics workspace. If we choose a storage account, we can create a new storage account or select an existing one and set the number of days of logs to retain.
In this basic monitoring scenario, we recommend sending logs to a Log Analytics workspace. Log Analytics now provides several out-of-the-box queries for Azure VMware Solution logs that can be used as-is or customized for your needs.
These settings provide a base level of visibility and alert notification for an Azure VMware Solution deployment. We strongly recommend building on this foundation by integrating AVS into a broader enterprise monitoring and management platform, such as VMware Aria Operations.
Resources to Learn More
To learn more about Day 2 Operations on Azure VMware Solution: