Deploy Tanzu Kubernetes Grid on VMC on AWS
Overview
The Tanzu Kubernetes Grid (TKG) solution enables you to create and manage Kubernetes clusters across multiple infrastructure providers such as VMware vSphere, AWS, and Microsoft Azure using the Kubernetes Cluster API.
TKG functions through the creation of a management Kubernetes cluster that houses the Cluster API. The Cluster API then interacts with the infrastructure provider to service workload Kubernetes cluster lifecycle requests.
Scope
This document describes the steps to deploy a TKG Management and Workload cluster and how to host a sample Kubernetes application in the workload cluster. The document also describes the communication that happens between the TKG components and the NSX Advanced Load Balancer to create Kubernetes objects.
Prerequisites and Considerations
The table below lists the deployment requirements and considerations for a TKGm implementation in VMC.
Deployment Requirements
Network requirements: Create two DHCP-enabled logical segments in your SDDC, one for the TKG management cluster and one for the TKG workload cluster. Make sure that the new subnet CIDRs do not overlap with existing segments.
Port and protocol requirements: To understand the firewall requirements for a successful TKG implementation in VMC, refer to the Firewall rules for TKG on VMC on AWS article.
Licensing requirements: An NSX ALB Basic license is the minimum required for a TKG deployment; an Enterprise license is also supported.
Deploy TKG Management Cluster
You can deploy management clusters in two ways:
- Run the Tanzu Kubernetes Grid installer, a wizard interface that guides you through the process of deploying a management cluster.
- Create a deployment YAML configuration file and use it to deploy the management cluster with the Tanzu CLI commands.
The UI installer is the easier way to deploy the cluster; the following steps describe the process.
To launch the UI installer wizard, run the following command on the bootstrapper machine:
# tanzu management-cluster create --ui --bind <bootstrapper-ip>:8080 --browser none
You can then access the UI wizard by opening a browser and entering http://<bootstrapper-ip>:8080/
Note: If you see a “connection refused” error, make sure that you have allowed port 8080 in the firewall that is running on your bootstrapper machine.
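For example, on a Linux bootstrapper that uses firewalld (adjust for your distribution's firewall tooling), the port can be opened as follows:
# firewall-cmd --permanent --add-port=8080/tcp
# firewall-cmd --reload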
From the TKG Installation user interface, you can see that it is possible to install TKG on vSphere (including VMware Cloud on AWS), AWS EC2, and Microsoft Azure.
Figure 1 - TKG installer user interface
Step 1 - To deploy the TKG Management Cluster in VMC on AWS, click the deploy button under "VMware vSphere".
Step 2 - On the "IaaS Provider" page, enter the IP/FQDN and credentials of the vCenter server where the TKG management cluster will be deployed.
Figure 2 - IaaS Provider
Step 3 - Click Connect and accept the vCenter server SSL thumbprint.
Figure 3 - vCenter SSL Thumbprint
Step 4 - If you are running a vSphere 7.x environment, the TKG installer detects it and gives you the choice to deploy either vSphere with Tanzu (TKGS) or a TKG management cluster.
Select the "Deploy TKG Management Cluster" option.
Figure 4 - vSphere Environment Detection
Step 5 - Select the Virtual Datacenter and enter the SSH public key that you generated earlier.
Figure 5 - vCenter Server Details
Step 6 - On the Management cluster settings page, select the instance types for the control plane and worker nodes and provide the following information:
- Management Cluster Name: Name for your management cluster.
- Control Plane Endpoint: A free IP from the network created for TKG management. Ensure that the IP address that you provide is not part of the DHCP range configured on the network.
Figure 6 - Management Cluster Settings
Step 7 - On the NSX Advanced Load Balancer page, provide the following:
- NSX ALB Controller IP address (ALB Controller cluster IP if the controller cluster is configured)
- Controller credentials.
- Controller certificate.
Step 8 - Click Verify and select the following:
- Cloud Name
- SE Group name
- VIP Network
- VIP Network CIDR
Optionally provide labels for your deployment.
Figure 7 - NSX ALB Details
Step 9 (optional) - On the Metadata page, you can specify location and labels.
Figure 8 - Metadata Details
Step 10 - On the Resources page, specify the vSphere resources (such as the VM folder, datastore, and cluster or resource pool) to use for the TKG management cluster deployment.
Figure 9 - vSphere Resources Detail
Step 11 - On the Kubernetes Network page, select the Network where the control plane and worker nodes will be placed during management cluster deployment.
Optionally, you can change the Service and Pod CIDRs to custom values.
Figure 10 - Kubernetes Network Settings
If you have LDAP configured in your environment, refer to the VMware Documentation for instructions on how to integrate an identity management system with TKG.
In this example, Identity management integration has been disabled.
Figure 11- Identity Management Details
Step 12 - Select the OS image that will be used for the management cluster deployment.
Note: This list will appear empty if you don't have a compatible template present in your environment. After you import the correct template, return to this page and click the refresh button; the installer will detect the image automatically.
Figure 12 - K8 Image Selection
Step 13 - If you have a subscription to Tanzu Mission Control and want to register your management cluster with the TMC, enter the registration URL here.
In this example, this step is skipped.
Figure 13 - TKG-TMC Integration
Step 14 (optional) – Select the "Participate in the Customer Experience Improvement Program" checkbox if you want to participate.
Figure 14 - CEIP Agreement
Step 15 - Click the Review Configuration button to verify your configuration settings.
Figure 15 - Configuration Review page
Step 16 - After you have verified the configuration settings, click Deploy Management Cluster to start the deployment. To change any deployment parameters, click Edit Configuration.
Note: Deployment of the management cluster can also be triggered from the CLI by using the command that the installer has generated for you.
Figure 16 - Deploy Management Cluster
When the deployment is triggered from the UI, the installer wizard displays the deployment logs on the screen.
Figure 17 - Management Cluster Setup Progress
Deployment of the management cluster takes about 20-30 minutes to complete. Close the installer wizard after the deployment is complete.
The installer automatically sets the context to the management cluster so that you can log in to it and perform additional tasks such as verifying the management cluster health and deploying workload clusters.
Figure 18 - Management Cluster Setup Completed
Verify Management Cluster Health
After the management cluster deployment, run the following commands to verify the health status of the cluster:
# tanzu management-cluster get
# kubectl get nodes
# kubectl get pods -A
The following sample screenshots show what a healthy cluster looks like.
Figure 19 - Management Cluster Health Status
Figure 20 - Management Cluster Nodes & Pods
You are now ready to deploy a Tanzu Kubernetes cluster, also known as a workload cluster.
Deploy Tanzu Kubernetes Cluster (Workload Cluster)
The process of creating a Tanzu Kubernetes cluster is similar to creating the management cluster. Follow the steps below to create a new workload cluster for your applications.
Step 1: Set the context to the management cluster
# kubectl config use-context <mgmt_cluster_name>-admin@<mgmt_cluster_name>
Step 2: Create a namespace for the workload cluster.
# kubectl create ns wld01
Step 3: Prepare the YAML file for workload cluster deployment
A sample YAML file is shown below for workload cluster deployment.
workload-cluster.yaml
CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: mj-wld01
NAMESPACE: wld01
CLUSTER_PLAN: prod
ENABLE_CEIP_PARTICIPATION: "false"
OS_NAME: photon
ENABLE_MHC: "true"
IDENTITY_MANAGEMENT_TYPE: none
INFRASTRUCTURE_PROVIDER: vsphere
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: "false"
DEPLOY_TKG_ON_VSPHERE7: true
ENABLE_TKGS_ON_VSPHERE7: false
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.18.110
VSPHERE_CONTROL_PLANE_DISK_GIB: "40"
VSPHERE_CONTROL_PLANE_MEM_MIB: "16384"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "4"
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_NUM_CPUS: "4"
VSPHERE_DATACENTER: /SDDC-Datacenter
VSPHERE_DATASTORE: /SDDC-Datacenter/datastore/WorkloadDatastore
VSPHERE_FOLDER: /SDDC-Datacenter/vm/TKG-Workload-VM's
VSPHERE_NETWORK: K8-Backend
VSPHERE_USERNAME: cloudadmin@vmc.local
VSPHERE_PASSWORD: <Fill-me-in>
VSPHERE_RESOURCE_POOL: /SDDC-Datacenter/host/Cluster-1/Resources/TKG-Workload
VSPHERE_SERVER: <Fill-me-in>
VSPHERE_INSECURE: true
VSPHERE_SSH_AUTHORIZED_KEY: <Fill-me-in>
You can change the deployment parameters in the YAML file to match your infrastructure.
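Optionally, you can generate the cluster manifest without creating any resources as a quick sanity check of the configuration file; the --dry-run flag is available in recent Tanzu CLI versions, and the output file name below is arbitrary:
# tanzu cluster create tkg-wld01 --file=workload-cluster.yaml --dry-run > tkg-wld01-manifest.yaml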
Step 4: Modify the NSX ALB Service Engine VMs
The Service Engines need Layer 2 connectivity to the workload cluster network so that the SEs can communicate with the applications deployed in the workload cluster. This is achieved by editing the SE VMs and attaching the first available free NIC to the logical segment where the TKG workload cluster will be deployed.
Figure 21 - Service Engine NICs
Since DHCP is already enabled on this segment, the SE VMs will be assigned an IP address from the DHCP pool. You can verify this by logging in to the Controller UI, navigating to Infrastructure > Service Engine, editing the Service Engine settings, and locating the MAC address that corresponds to the NIC attached to the workload cluster network.
Figure 22 - Verify Service Engine IP Address
Step 5: Initiate TKG Workload Cluster Deployment
Run the following command to start creating your first workload cluster:
# tanzu cluster create tkg-wld01 --file=workload-cluster.yaml -v 6
After a successful deployment, the following log entries are displayed on your screen.
Deployment Log
Using namespace from config:
Validating configuration...
Waiting for resource pinniped-info of type *v1.ConfigMap to be up and running
configmaps "pinniped-info" not found, retrying
cluster control plane is still being initialized, retrying
Getting secret for cluster
Waiting for resource tkg-wld01-kubeconfig of type *v1.Secret to be up and running
Waiting for cluster nodes to be available...
Waiting for resource tkg-wld01 of type *v1alpha3.Cluster to be up and running
Waiting for resources type *v1alpha3.MachineDeploymentList to be up and running
Waiting for resources type *v1alpha3.MachineList to be up and running
Waiting for addons installation...
Waiting for resources type *v1alpha3.ClusterResourceSetList to be up and running
Waiting for resource antrea-controller of type *v1.Deployment to be up and running
Workload cluster 'tkg-wld01' created
Verify Workload Cluster Health
Once the cluster is deployed, you can run the following commands to verify its health.
# tanzu cluster list
# tanzu cluster get <clustername> -n <namespace>
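For example, for the workload cluster created above:
# tanzu cluster get tkg-wld01 -n wld01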
Figure 23 - Workload Cluster Health Status
Export the Workload Cluster Kubeconfig
The cluster kubeconfig is needed to perform any operations against the workload cluster. Once a workload cluster has been deployed, the corresponding kubeconfig file can be exported using the following command:
# tanzu cluster kubeconfig get <workload-cluster-name> -n <workloadcluster-namespace> --admin --export-file <path-to-file>
Switch to the workload cluster context to start using it.
# kubectl config use-context <workload-cluster-name>-admin@<workload-cluster-name> --kubeconfig=<path-to-kubeconfig-file>
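For example, for the tkg-wld01 cluster in the wld01 namespace (the export path below is an arbitrary choice and is reused in the commands later in this document):
# tanzu cluster kubeconfig get tkg-wld01 -n wld01 --admin --export-file /root/wld01-kubeconfig.yaml
# kubectl config use-context tkg-wld01-admin@tkg-wld01 --kubeconfig=/root/wld01-kubeconfig.yaml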
Install Avi Kubernetes Operator (AKO) in Workload Cluster
The Avi Kubernetes Operator (AKO) is a Kubernetes operator which works as an Ingress controller and performs Avi-specific functions in a Kubernetes environment with the Avi Controller. It runs as a pod in the cluster and translates the required Kubernetes objects to Avi objects and automates the implementation of ingresses/routes/services on the Service Engines (SE) via the Avi Controller. To know more about AKO, please refer to the NSX ALB official documentation.
Every workload cluster that you deploy should have an AKO pod running in order to leverage NSX ALB for creating virtual services and the VIPs for those virtual services. The AKO pod is not present by default on the workload cluster, even if you passed NSX ALB details in the workload cluster deployment file.
AKO deployment is controlled via an AKO configuration (AKODeploymentConfig) that needs to be created manually for every workload cluster that you deploy. Multiple workload clusters can share the same AKO configuration, or each can have a dedicated configuration if you need isolation between the workload clusters.
To deploy AKO for a workload cluster, follow the steps below:
Step 1: Prepare the YAML file for the AKO configuration. A sample is shown below.
deploy-ako.yaml
apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  finalizers:
  - ako-operator.networking.tkg.tanzu.vmware.com
  generation: 2
  name: ako-tkc
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: <cloud-name-in-ALB>
  clusterSelector:
    matchLabels:
      <key>: <value>
  controller: <ALB Controller IP>
  dataNetwork:
    cidr: <VIP Network CIDR>
    name: <VIP Network name as configured in ALB>
  extraConfigs:
    image:
      pullPolicy: IfNotPresent
      repository: projects.registry.vmware.com/ako/ako
      version: 1.3.1
    ingress:
      defaultIngressController: true
      disableIngressClass: true
  serviceEngineGroup: Default-Group
Step 2: Apply the AKO configuration on the management cluster by running the command:
# kubectl create -f deploy-ako.yaml
The above configuration is consumed by the AKO Operator pod running in the management cluster. The AKO Operator watches newly created workload clusters and their labels; if a workload cluster's labels match the clusterSelector labels specified in the AKO configuration file, an AKO pod is created in that workload cluster automatically.
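To confirm that the configuration has been accepted, you can list the AKODeploymentConfig objects and check the AKO Operator pod on the management cluster (the tkg-system-networking namespace below is the one referenced in the sample configuration and is where the AKO Operator typically runs):
# kubectl get akodeploymentconfig
# kubectl get pods -n tkg-system-networking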
Step 3: Create the avi-system namespace in the workload cluster and label the workload cluster for automated AKO installation.
Ako Pod List
# kubectl create ns avi-system --kubeconfig=/root/wld01-kubeconfig.yaml
# kubectl label cluster <workload-cluster-name> -n <namespace> <key>=<value>
Example: kubectl label cluster tkg-wld01 -n wld01 location=haas-lab
As soon as a matching label is provided to a workload cluster, you will see the creation of an AKO pod in the avi-system namespace.
# kubectl get all -n avi-system --kubeconfig=/root/wld01-kubeconfig.yaml
NAME READY STATUS RESTARTS AGE
pod/ako-0 1/1 Running 0 29h
NAME READY AGE
statefulset.apps/ako 1/1 29h
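If the AKO pod does not reach the Running state, its logs usually point to the cause (for example, incorrect controller credentials or an unreachable VIP network). A minimal check, assuming the same kubeconfig path as above:
# kubectl logs ako-0 -n avi-system --kubeconfig=/root/wld01-kubeconfig.yaml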
Deploy Sample Application
Now that the AKO pod is deployed for the workload cluster, it's time to deploy a sample application of type LoadBalancer and verify that the corresponding objects (virtual service, VIP, pool, etc.) are created in NSX ALB and that the application is accessible.
A sample YAML manifest for deploying the 'Yelb' application is shown below.
yelb.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-server
  labels:
    app: redis-server
    tier: cache
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 6379
  selector:
    app: redis-server
    tier: cache
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-db
  labels:
    app: yelb-db
    tier: backenddb
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 5432
  selector:
    app: yelb-db
    tier: backenddb
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-appserver
  labels:
    app: yelb-appserver
    tier: middletier
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 4567
  selector:
    app: yelb-appserver
    tier: middletier
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-ui
  labels:
    app: yelb-ui
    tier: frontend
  namespace: yelb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: yelb-ui
    tier: frontend
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-ui
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-ui
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-ui
        tier: frontend
    spec:
      containers:
      - name: yelb-ui
        image: docker.io/yelb/yelb-ui:v1
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-server
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: redis-server
  replicas: 1
  template:
    metadata:
      labels:
        app: redis-server
        tier: cache
    spec:
      containers:
      - name: redis-server
        image: docker.io/yelb/yelb-redis:v1
        ports:
        - containerPort: 6379
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-db
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-db
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-db
        tier: backenddb
    spec:
      containers:
      - name: yelb-db
        image: docker.io/yelb/yelb-db:v1
        ports:
        - containerPort: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-appserver
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-appserver
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-appserver
        tier: middletier
    spec:
      containers:
      - name: yelb-appserver
        image: docker.io/yelb/yelb-appserver:v1
        ports:
        - containerPort: 4567
To deploy the application, run the command: # kubectl create -f yelb.yaml
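Note that the manifest places all objects in the yelb namespace. If that namespace does not already exist in the workload cluster, create it first (this example reuses the kubeconfig exported earlier):
# kubectl create ns yelb --kubeconfig=/root/wld01-kubeconfig.yaml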
Listing the pods in the yelb namespace returns the status of pods that have been created as part of the yelb application deployment.
Yelb Pod List
# kubectl get pods -n yelb --kubeconfig=/root/wld01-kubeconfig.yaml
NAME READY STATUS RESTARTS AGE
redis-server-576b9667ff-52btx 1/1 Running 0 8h
yelb-appserver-7f784ccd64-vtxlx 1/1 Running 0 8h
yelb-db-7cdddcff5-km67v 1/1 Running 0 8h
yelb-ui-f6b557d47-v772q 1/1 Running 0 8h
Verify NSX ALB Objects
Log in to NSX ALB and verify that a virtual service and VIP have been created for the yelb application.
Virtual Service
VIP
Server Pool
Browse to the VIP created for the yelb application and verify that you can see the Yelb dashboard.
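You can also retrieve the VIP assigned to the yelb-ui service directly from the workload cluster; the EXTERNAL-IP column shows the VIP allocated by NSX ALB, and a simple curl against it (from a machine that can reach the VIP network) should return the Yelb UI:
# kubectl get svc yelb-ui -n yelb --kubeconfig=/root/wld01-kubeconfig.yaml
# curl http://<EXTERNAL-IP>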