Well-Architected Design: Cloud Migration Preparation
Introduction
Migrating IT applications and workloads to the cloud is a complex undertaking that requires diligent planning and preparation. A successful cloud migration project can be divided into multiple phases. One of the early phases involves discovery and assessment of the source environments. An organization’s source environments are typically on-premises datacenters but could also be any type of cloud. While the considerations described in this design are applicable to “any-cloud” to “any-cloud” migrations, emphasis is placed on on-premises to cloud migrations. This design describes how to successfully perform a discovery and assessment of an organization’s source environment(s) for a cloud migration.
Cloud Migration Overview
Migration of workloads can be a complex task. This section gives an overview of a cloud migration project and defines its phases. It provides context for understanding where cloud migration discovery and assessment fits into the overall cloud migration process.
Definition of “Cloud Migration”
For the overall context, it is important to understand how VMware envisions a cloud migration effort as part of the Well-Architected Design framework. Two aspects are especially important in this definition:
- The combination of source and target environments of a cloud migration is not restricted to “on-premises-to-cloud.” Essentially, each source and target environment can be “any cloud.” Cloud migrations are therefore intended to be “any-cloud” to “any-cloud,” including several VMC clouds and on-premises environments (either virtual or even physical).
- A cloud migration is not considered a one-time effort. Once an organization has relocated part of its applications or workloads to the cloud, it will face the challenge of regularly monitoring the operational efficiency of the now hybrid IT resources. For example, after migrating application X from an on-premises datacenter to cloud provider A, the driving factors behind that decision may change over time or even become invalid, resulting in a renewed decision to move the application to another cloud provider B or even back to on-premises datacenters.
As a result, when running a hybrid environment, cloud migrations become a continuous type of operation. It is critical for IT operations of an organization to constantly monitor the operational efficiency and re-evaluate from time to time whether the underlying technology and platform of an application are supporting the business in the best possible way.
Considering this definition of a “cloud migration,” the VMware Aria Suite is intended to assist an organization with the complete lifecycle of its cloud migration journey throughout all phases.
Phases of a Cloud Migration
Figure 1 – Cloud Migration Overview
As illustrated in Figure 1 – Cloud Migration Overview, a cloud migration consists of multiple phases.
Assessment and Discovery
This is the first phase of a cloud migration journey, which will be detailed in this design. Once an organization has defined its high-level migration strategy and goals, it must discover and thoroughly examine the source environments and applications. The purpose is to identify the scope of the migration project and gain an in-depth awareness of the current applications and workloads in the source environments. This should include understanding application logical and organizational dependencies, size requirements and total cost of ownership (TCO) implications. This phase concludes with a grouping of applications and initial scope of the migration effort.
Planning
The planning phase will build upon the outcomes and result of the “Assessment and Discovery” phase. The applications discovered in the earlier phase are further curated and their logical grouping is fine-tuned based on different organizational and technical criteria. Necessary line-of-business approvals from application owners are obtained, and target cloud(s) are sized and designed. One of the most important outcomes for this phase is a migration wave plan, which essentially is a timeline of the overall migration effort grouped into multiple waves.
Modernize, Optimize and Secure
This phase does not follow the “Planning” phase in a sequential way, but rather goes hand in hand with it. A cloud migration project is often seen by organizations as a unique chance to modernize their applications. For example, when moving to a cloud provider, organizations may consider replacing an existing on-premises SQL database server with a cloud-native service of a relational database. Similarly, they may consider containerizing some of their applications by leveraging tools that are found in VMware Tanzu.
An important part of this phase is also to develop a security plan for the target environment in the cloud, as security guardrails set up in on-premises implementations may have to be adapted for the specifics of the target cloud environment.
Migrate
After the assessment and planning phases, the next step is to perform the migration. If relocating workloads is a fundamental part of the overall migration plan, tools like VMware HCX can be used. Based on the migration type, the tool configuration needs to be considered, such as using VMware HCX with network extensions versus cold migrations without network extensions. When performing the actual migrations, they are typically scheduled and automated using the wave planning from the “Planning” phase. During the actual workload relocation process, thorough monitoring of the progress should be established. There should also be migration reports generated outlining the progress of the overall migration. These can be used to track adherence to the maintenance windows and planned schedules.
Continuous Operations
After workloads have been migrated, day-2 operations must now include their new platform. Patch management, data backups, health checks, capacity management and all other procedures must all occur in a hybrid fashion. This means that the tools and processes should work independently from the environment where applications and workloads are hosted. Operational teams need to consider that the new target environment may be “multi-cloud”. This means that certain parts of applications may have been transferred to the cloud, while other parts still reside in an on-premises datacenter. Also, some applications or workloads may have been modernized by refactoring measures, while others have been migrated “as is”. Operational processes and tools need to take all this into account and operational teams must be trained and prepared to efficiently operate the new hybrid environment.
Summary
Cloud migration projects are complex and need to be thoroughly planned and executed in all phases. For a successful cloud migration project, the following aspects are key to understand:
- Migration phases are not always executed sequentially.
For example, imagine that a few hundred or thousand workloads have been defined as “in-scope” for a migration. Some of them are in the assessment phase, others are in the actual migration, and yet others are already in the phase of continuous operations.
- Cloud migration is usually not a one-time effort but can occur continuously in an iterative fashion.
Once workloads have been migrated to the cloud, or even when they are retained on-premises, several events may trigger a reassessment of workloads and applications. Their current architecture, operational efficiency, or costs may no longer meet the requirements of the business they support. As a result, new migrations and adaptations to existing workloads will continue to happen, even for workloads that have already been migrated to the cloud in the past.
It is key for a cloud migration project to always ensure transparency and visibility of the status and progress of migration activities. It must always be clear in which phase of a migration an application or workload currently resides. To ensure a smooth migration experience, a consistent inventory must be established without any friction or partition across the various phases of a cloud migration.
Migration Assessment
At the beginning of a cloud migration project, organizations must get a clear view of the current IT assets that are potentially in scope of a cloud migration. As part of this assessment phase, a logical grouping of applications will be created along with a sizing analysis and cost modeling. This helps reveal TCO implications when migrating to the cloud. At the end of this phase, an organization will have a reasonably accurate roadmap for its cloud migration project, which will be further fine-tuned in subsequent phases.
Discovery of Source Environments
Typically, the very first step of a cloud migration assessment is the discovery of the source environments. As previously detailed, cloud migrations can be “any-cloud” to “any-cloud”. The source environment may exist in the cloud itself or in various on-premises datacenters. For the context of this document, we will focus on on-premises datacenters to streamline the discussion. However, the basic assessment principles described in the following sections remain valid irrespective of the type of source environment.
Depending on the size of an organization, a range of source environments could potentially be in scope for a cloud migration. They may originate from multiple datacenters in different regions or locations. Additionally, the IT resources in scope can be either virtual or physical.
While the initial scope of a migration project may be limited to a certain environment, the organization may decide not to limit the discovery to that same scope. Increasing the discovery scope may help build a more comprehensive inventory of IT environments. One of the first decisions to make before beginning the discovery of IT resources is which environments and datacenters the discovery should cover.
Design Consideration | Design Justification | Design Implication |
Define the IT environments that are in scope for a discovery of IT assets | Supports the definition of a cloud migration scope |
|
Once the scope of the discovery has been defined, the discovery of the environments can take place. Typically, the information to be gathered during a migration assessment discovery already exists across various data sources, for example:
- System administration tools (e.g. vCenter Server for VMware virtualization environments)
- Operations Management software (e.g. Aria Operations or Microsoft System Center – Operations Manager)
- Management Software for physical server hardware (e.g. HP Systems Insight Manager or Dell OpenManage)
- Other inventory utilities or software (e.g. RVTools utility for VMware environments)
The IT operations team of an organization typically already has multiple sources of data for their various IT environments.
Not all of these tools are designed to assist in scoping, planning, executing, and monitoring a cloud migration lifecycle end-to-end. They often contain the same or overlapping IT assets but collect different metadata about these assets, and they cannot be freely extended or used for the purpose of a cloud migration project. Project staff tasked with planning and executing a cloud migration often find themselves extracting and consolidating systems information from all the above-mentioned data sources into a “migration database” of their choice. This can cause friction across the multiple phases and teams during a cloud migration project lifecycle.
It is key in the discovery phase to establish a consolidated database of the servers, applications, and workloads that supports the activities within a cloud migration project end-to-end: from the initial discovery phase, through the subsequent planning phases and the migration itself, up to day-N operations.
A cloud migration database inventory should be capable of delivering the following functions:
- Host a consolidated inventory of IT systems, workloads, and applications to support the cloud migration lifecycle end-to-end.
- Connect and import from various data sources and management software.
- Manually import IT assets.
As source environments of a cloud migration may be heterogeneous, a cloud migration database should have the capability to manually import IT resources. In the case of security sensitive or isolated environments, it may not even be possible to connect to these in an automated fashion.
- Gather relevant configuration data of workloads and applications such as their network configuration, VLAN assignment and associated network security posture.
- Determine and visualize network flows and dependencies between workloads.
- For subsequent migration planning it is very important to understand interdependencies between workloads. For example, two workloads which are closely interacting with each other, may need to be migrated together into a new cloud environment.
- Flexibly define inventory metadata (“tags”).
- Inventory metadata (“tags”) will serve as a criterion for grouping and scoping workloads and applications for cloud migration. Each organization or enterprise may have different requirements towards a categorization of workloads. For example, tags could reflect categories like “Production Environment,” “Test/Dev Environment,” “Business Critical Application,” or “On-prem”.
- Grouping and scoping of applications/workloads for a later cloud migration based on flexibly definable metadata.
- Comprehensively support the migration lifecycle.
It is important to always have full visibility into which phase of the migration lifecycle a workload is in at any given point in time and what the operational status of this workload is.
A key goal of the discovery phase is to gain a solid inventory of the workloads and applications of the source environments, which supports subsequent migration planning and scoping exercises. Software like VMware Aria Migration Hub can meet these requirements.
Design Consideration | Design Justification | Design Implication |
Establish a consolidated dataset or database of IT workloads, which can store all required metadata and is able to flexibly group these assets according to the defined metadata tags. This entity becomes your “migration hub.” | Supports the definition of a cloud migration scope and provides transparency of your IT workloads across all phases of a cloud migration project | Establishing a proper migration database requires substantial upfront efforts. |
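The functions listed above can be illustrated with a minimal sketch of such a consolidated inventory. This is not how VMware Aria Migration Hub or any other product stores its data; the `Workload` fields, the merge rule (tags are unioned, the first known VLAN wins) and all asset names are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class Workload:
    """One inventory entry; all field names are illustrative assumptions."""
    name: str
    source: str                       # where the record came from
    vlan: Optional[int] = None
    tags: set = field(default_factory=set)
    phase: str = "discovered"         # current migration lifecycle phase


def merge_sources(*sources):
    """Consolidate overlapping data sources into one inventory keyed by name.

    If an asset appears in several sources, its tags are unioned and the
    first known VLAN is kept.
    """
    inventory = {}
    for source in sources:
        for wl in source:
            existing = inventory.get(wl.name)
            if existing is None:
                # Copy the record so the input lists are not modified.
                inventory[wl.name] = replace(wl, tags=set(wl.tags))
            else:
                existing.tags |= wl.tags
                if existing.vlan is None:
                    existing.vlan = wl.vlan
    return inventory


def select_by_tags(inventory, required):
    """Return the sorted names of workloads that carry every required tag."""
    return sorted(n for n, wl in inventory.items() if required <= wl.tags)


# Hypothetical records from an automated source and from a manual import.
vcenter = [
    Workload("web01", "vcenter", vlan=110, tags={"Production Environment"}),
    Workload("db01", "vcenter", vlan=120, tags={"Production Environment"}),
]
manual = [
    Workload("db01", "manual-import", tags={"Business Critical Application"}),
    Workload("legacy01", "manual-import", tags={"Test/Dev Environment"}),
]

inventory = merge_sources(vcenter, manual)
critical_prod = select_by_tags(
    inventory, {"Production Environment", "Business Critical Application"}
)
print(critical_prod)
```

In this sketch `db01` is known to both sources, so it ends up as a single entry carrying both tags. The `phase` field is where a real migration hub would track the lifecycle status of each workload.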
Logical Grouping of Applications/Workloads
After the discovery phase performed in the earlier step, the overview data of your IT workload environments is still unstructured. To support upcoming migration scoping decisions, you need to structure your inventory further by logically grouping your workloads and applications. The logical grouping can be based on a variety of criteria. Workloads can be assigned to a specific group in an automated, partly automated, or manual fashion.
The following table provides an overview of potential ways to group your applications and workloads:
Group Type | Description |
Network-based | A common way to group workloads is based on networking criteria. Usually, VLAN tags or IP subnets may be used as concrete grouping criteria. Depending on how networking is configured in your environment the assignment of workloads to networks (VLANs or IP subnets) may allow for application grouping of your workloads. For example, database applications may reside in certain subnets in your environment. However, network-based grouping is not able to fully detect application-level dependencies. In a typical 3-tier application consisting of a webserver, an application server, and a database server, each server may be configured in different subnets but still belong to the same application. |
Application-based | With application-based grouping you assign discovered workloads to applications. While this type of grouping may be partly automated by a network flow analysis tool like VMware Aria Operations for Networks, manual adjustments are typically necessary. With application-based grouping you aim to reflect the dependencies of your workloads in your later migration planning by keeping the workloads of an application in the same migration wave. |
Cluster-based | If you already have vSphere environments on-premises (or in the source cloud), cluster-based grouping may be useful, especially when you already have concrete environments in mind that are to be migrated to a target cloud. |
Location/Datacenter-based | Depending on your plans for your on-premises datacenters it may be useful to group workloads based on location/datacenter, for example if you want to evacuate whole datacenter locations by migrating to a target cloud. |
Other (Tag-based) | By assigning freely definable tags to the discovered or manually imported entities of your migration hub you can introduce any kind of grouping criteria, which your business logic or migration project requires. You may for example define tags based on the following criteria:
|
Note that the above-listed group types need not be mutually exclusive and can be combined as needed.
Design Consideration | Design Justification | Design Implication |
Logically group your workloads and applications in your source inventory based on your technical as well as business requirements. | A logical grouping of workloads is required for a solid scope definition of your cloud migration. | The logical groups defined in this step will later serve as input during the actual migration planning phase for the final migration scopes as well as for the detailed definition of migration waves. |
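As a rough illustration of application-based grouping, the sketch below treats observed network flows as edges of a graph and puts every set of connected workloads into one group. The function name, the flow format (pairs of workload names) and the sample data are assumptions; a flow analysis tool would export richer data, and manual adjustment would typically still be needed.

```python
from collections import defaultdict


def dependency_groups(workloads, flows):
    """Group workloads that are connected by observed network flows.

    `flows` is a list of (source, destination) name pairs. Workloads in the
    same connected component are candidates for the same migration wave.
    """
    neighbours = defaultdict(set)
    for src, dst in flows:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen, groups = set(), []
    for start in sorted(workloads):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical 3-tier application plus one unrelated file server.
workloads = ["web01", "app01", "db01", "file01"]
flows = [("web01", "app01"), ("app01", "db01")]

groups = dependency_groups(workloads, flows)
print(groups)
```

Note that shared infrastructure services that talk to nearly every workload would merge everything into one group under this simple rule, so such flows would have to be filtered out before grouping.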
Application Policies Profiling
With the logical grouping of workloads in the earlier step the foundation is laid for the classification of workloads to support cloud migration planning. Application Policies Profiling has a similar goal, to further detail the classification of workloads for cloud migration, but from an application-centric point of view.
The following categories are examples for assigning application policies:
- High Availability/Disaster Recovery:
- Defines special properties of an application with regards to high availability/disaster recovery requirements, for example: Does an application need to recover quickly from a site failure?
- Security Zone/Firewall Requirements:
- Defines properties, whether an application needs to adhere to a special security zone, e.g. a Demilitarized Zone (DMZ).
- System Operations/Monitoring Policy:
- Defines how a certain application is handled operationally. Are standard IT operations policies applied or is there any “special treatment” established?
- Regulatory Compliance Standards:
- In this category it is defined whether an application must adhere to specific regulatory compliance standards, e.g. GDPR in the EU or HIPAA in the US. These standards can place regional constraints on where workloads or data can be hosted. This may influence cloud migration strategies and needs to be assessed.
- Other Application Specific Criteria:
- Depending on the specifics of the source environment there may be other application-specific criteria, which can influence cloud migration decisions and need to be assessed. Some examples are provided in the following:
- Applications may be running on newer vSphere environments with most up-to-date virtual hardware versions, which are not even supported in the target cloud environment.
- Applications are known to be planned for modernization/“refactoring” or, on the other hand, “retirement,” so the effort of relocating them to a cloud-based platform is not required.
- Applications need to stay on-premises for several reasons other than those already listed above.
Design Consideration | Design Justification | Design Implication |
Define application profiles in categories that influence your cloud migration planning. Consistently assign these profiles to your applications. | Application profiles must be assessed to appropriately configure attached services like backup and system monitoring in your cloud target environment. | Gathering application profiles causes extra effort. Staff involved in cloud migration planning must consult with application owners. |
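Once profiles are recorded consistently, they can be evaluated mechanically. The following sketch checks which candidate target environments satisfy two of the example criteria above: regional compliance constraints and supported virtual hardware versions. The `AppProfile` and `Target` structures, the region labels and the version numbers are hypothetical and only serve to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    name: str
    allowed_regions: frozenset      # empty set means "no regional constraint"
    hw_version: int                 # virtual hardware version of the source VMs
    planned_retirement: bool = False


@dataclass(frozen=True)
class Target:
    name: str
    region: str
    max_hw_version: int             # highest virtual hardware version supported


def eligible_targets(profile, targets):
    """Return the names of targets that satisfy the profile's constraints."""
    if profile.planned_retirement:
        return []                   # no relocation effort is planned
    return [
        t.name
        for t in targets
        if (not profile.allowed_regions or t.region in profile.allowed_regions)
        and profile.hw_version <= t.max_hw_version
    ]


# Illustrative values only; they do not describe any real cloud offering.
targets = [Target("cloud-eu", "eu", 19), Target("cloud-us", "us", 21)]
crm = AppProfile("crm", frozenset({"eu"}), hw_version=19)
analytics = AppProfile("analytics", frozenset(), hw_version=21)
legacy = AppProfile("legacy", frozenset(), hw_version=13, planned_retirement=True)

for profile in (crm, analytics, legacy):
    print(profile.name, eligible_targets(profile, targets))
```

An empty result is useful information in itself: it flags applications that either stay where they are or need their constraints discussed with the application owner.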
Sizing Analysis
Once applications are logically grouped, you already have an initial understanding of potential migration scopes and can therefore model certain migration scenarios from a sizing perspective.
The sizing analysis completed in this step answers the question of how many compute resources are needed to host workloads in a selected target cloud environment.
The following steps are necessary to efficiently perform a sizing analysis:
- Gather or estimate CPU, Memory, Disk I/O and Storage Capacity utilization of your source environments.
- Typically, this data can be retrieved from existing performance monitoring tools already implemented in an environment. If possible, import or connect this data to the migration hub to allow for an automated sizing recommendation of your target cloud.
If you want to simplify the analysis of current performance statistics for sizing purposes, you may assume your existing workloads to be “right-sized” already and just take the static configuration parameters in terms of CPU, memory, and disk configuration. While this approach simplifies the performance analysis of your source environments, it may lead to “conservative” estimates of your required compute resources and hence result in an oversized configuration of your target cloud.
- Accommodate for future growth of your compute resources in the target cloud.
- Depending on your business strategy, your workloads and applications hosted in the target cloud may be subject to significant growth. Your target cloud platform must accommodate these requirements. Define an appropriate planning horizon for your growth estimate, e.g., plan your growth for the next 1-3 years.
- Plan for redundancy and spare capacity.
- The policies and profiles of your applications will define requirements towards redundancy and spare capacity in your target cloud. This redundancy may be required only on a single layer (e.g., Storage Capacity) or across all compute layers (CPU, Memory, Disk I/O and storage capacity). Additionally, certain forms of disaster tolerance may be required. A solution to address redundancy across all compute layers is “stretched clustering,” where CPU, Memory and Disk I/O resources are duplicated across datacenter sites or availability zones. Depending on the solution, this may have a large or small impact on your sizing requirements.
- Use sizing tools provided by your Cloud Service Provider.
- Your chosen cloud service provider may already offer sizing or assessment tools, which can provide accurate estimates regarding the required target cloud resources, taking your individual “workload mix” as input parameters for the calculation.
- Ensure the target cloud provider can meet your planned capacity and scalability requirements.
- The target cloud provider may have capacity and scalability limits, which are sometimes specific to a certain region or datacenter and may even be temporary. Check that the required resources at a selected location can be allocated at the right point in time.
Design Consideration | Design Justification | Design Implication |
Size your target cloud resources based on your source cloud workload characteristics and your planned future growth | A properly sized target cloud environment will make sure to meet the performance/availability SLAs of your workloads without wasting resources by overprovisioning. | “Right sizing” a target environment by analysis of actual performance metrics of your source environment is a complex and iterative process. Sizing a target environment with static configuration parameters of your source environment is more straightforward but may lead to some overprovisioning of target cloud resources. |
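The simplified, static sizing approach described in this section can be written down as a short calculation. The sketch below is only a rough estimate under stated assumptions: demand is the sum of configured resources, growth compounds annually, a fixed fraction of each host is kept free as spare capacity, and the most constrained resource determines the host count. The function name, parameter defaults and all capacity figures are hypothetical, and redundancy options such as stretched clustering are not modeled.

```python
import math


def hosts_required(workloads, host, annual_growth=0.10, years=3, headroom=0.20):
    """Estimate the number of target hosts from static workload configuration.

    workloads: list of dicts with "vcpu", "mem_gb" and "disk_gb"
    host:      dict with the usable capacity of one target host, same keys
    Returns (host_count, per_resource_counts).
    """
    growth = (1 + annual_growth) ** years
    counts = {}
    for resource in ("vcpu", "mem_gb", "disk_gb"):
        demand = sum(w[resource] for w in workloads) * growth
        usable = host[resource] * (1 - headroom)   # keep spare capacity free
        counts[resource] = math.ceil(demand / usable)
    # The most constrained resource decides how many hosts are needed.
    return max(counts.values()), counts


# Hypothetical scope: 40 identical workloads and one assumed host type.
workloads = [{"vcpu": 4, "mem_gb": 16, "disk_gb": 200} for _ in range(40)]
host = {"vcpu": 144, "mem_gb": 512, "disk_gb": 20000}

total, per_resource = hosts_required(workloads, host)
print(total, per_resource)
```

In this example memory is the limiting resource, which is why looking at a single dimension such as CPU would underestimate the target environment. Sizing tools from a cloud service provider would typically take actual utilization data into account instead of static configuration.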
Cost Modeling
Once you have grouped your applications and collected sizing data, you should consider the cost associated with a cloud migration. In calculating the cloud migration TCO, you should consider expenses beyond that of the destination environment. This is important as costs do not always translate well between source and destination clouds. For example, the procurement models are quite different between on-premises and public cloud. Consider the following cost-related factors for the source cloud, the destination cloud, and the migration itself. (Although the source can be any cloud, in this example the source cloud is an on-premises datacenter.)
Source Cloud Costs – Running an application on-premises.
- Servers and Storage – Including maintenance and upgrades.
- Software – Ongoing licensing fees for software and support.
- Datacenter costs – Build or lease space, and any associated maintenance fees.
- Datacenter utilities – Heating and cooling.
- Workforce – Engineers and support personnel to maintain datacenter and infrastructure.
Destination Cloud Costs – Running the application in a destination public cloud.
- Hosting platform – This may be native public cloud or VMware cloud.
- Workforce – Engineers and support personnel to maintain the platform and application.
- Cloud Cost – Network egress charges, cross region network cost, additional native public cloud services required to run applications in the cloud.
Migration Costs – Costs associated with moving applications from source to destination cloud.
- Development – Expenses related to refactoring and re-architecting applications.
- Workforce – Project management, engineers and support personnel completing migration activities. These include planning, testing and execution of migration.
- Cloud Cost – Expenses from the cloud provider associated with migration. These include Infrastructure deployment, data transfer and egress charges.
- Migration Software – Costs related to dedicated tools to help with execution of migration.
Design Consideration | Design Justification | Design Implication |
Structure your migration costs into categories based on where the expense is incurred (source, destination, and migration costs). | Structuring your costs by category will allow you to accurately compare costs across source and destination clouds. | Cloud costs do not always translate between clouds. This includes on-premises to public cloud as well as public cloud to public cloud. Accurately calculating the TCO of your cloud migration is complex and requires substantial effort. |
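Structuring costs by category makes a simple comparison possible. The sketch below sums the line items per category and derives how many months of running-cost savings are needed to repay the one-time migration cost. All amounts are invented placeholders, the item names merely mirror the lists above, and treating source and destination costs as flat monthly figures is a simplifying assumption.

```python
import math


def category_totals(costs):
    """Sum the line items of each cost category."""
    return {category: sum(items.values()) for category, items in costs.items()}


def break_even_months(monthly_source, monthly_destination, one_time_migration):
    """Months until monthly savings repay the one-time migration cost.

    Returns None when the destination is not cheaper to run than the source.
    """
    saving = monthly_source - monthly_destination
    if saving <= 0:
        return None
    return math.ceil(one_time_migration / saving)


# Placeholder figures; source and destination are per month, migration is one-time.
costs = {
    "source": {"servers_storage": 20000, "software": 8000, "datacenter": 6000,
               "utilities": 2000, "workforce": 14000},
    "destination": {"hosting_platform": 30000, "workforce": 9000, "cloud_cost": 3000},
    "migration": {"development": 40000, "workforce": 60000,
                  "cloud_cost": 10000, "migration_software": 10000},
}

totals = category_totals(costs)
months = break_even_months(totals["source"], totals["destination"], totals["migration"])
print(totals, months)
```

A result of `None` indicates that, on running costs alone, the migration would not pay for itself, so other drivers such as datacenter evacuation or modernization would have to justify it.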
Technologies in the Source Environment
The previous sections of this document discussed the assessment of your applications. It is also important to assess the additional technologies running in the source environment. When migrating an application to the cloud, the focus is typically on moving the workload and services from source to destination. In addition to moving those workloads, you should consider the surrounding technologies running at the source. This allows you to map them to technologies at the target cloud during migration planning. These technologies may not be critical to bringing the application online, but they may be critical to operating your infrastructure and meeting application SLAs. Two categories, Feature and Product, can be used to classify these technologies.
Feature – This is defined as the capability of an existing core product in the source cloud.
Product – This is defined as a 2nd or 3rd party standalone tool that is used at the source.
Your source environment will likely have many elements from each category. Here we will provide a concrete example of each and considerations in your assessment.
Technology | Considerations |
VMware Stretched Cluster (Feature) | Stretched Clusters is a feature of VMware vSAN that extends the cluster from a single site to two sites for higher levels of availability and inter-site load balancing. Stretched clusters are typically deployed in environments where the distance between data centers is limited, such as metropolitan or campus environments. If you are using Stretched Clusters in your source cloud for HA capabilities, you should assess the impact and how you can meet the requirements in the destination environment before planning your migration.
If this feature is not supported in your destination cloud, consider ways to meet the same HA requirements with different technologies. |
Backup Solution (Product) | Backup software is used to create copies of files, databases, and systems. These backups are used to recover from system failures, human errors and even cyberattacks. Your source environment is undoubtedly protected by backup technology with schedules, retention policies and possibly offsite copies. When assessing your cloud migration, you should consider the impacts on backups of moving workloads.
|
Your source environment may contain many technologies. The following list outlines some of the possible technologies to consider in your assessment.
Features | Products |
|
|
Design Decision | Design Justification | Design Implication |
Define Source environment technology dependencies which influence your cloud migration planning. | Source Environment dependencies must be assessed to ensure your destination cloud and products can meet your infrastructure requirements and SLAs. This assessment will allow you to map appropriate technologies at the destination cloud. | Modern IT infrastructure can have many dependencies on 2nd and 3rd party products. Assessing these source technologies will require knowledge across many platforms and collaboration with business owners. |