Deploy Tanzu Kubernetes Grid on VMC on AWS

Overview

The Tanzu Kubernetes Grid (TKG) solution enables you to create and manage Kubernetes clusters across multiple infrastructure providers such as VMware vSphere, AWS, and Microsoft Azure using the Kubernetes Cluster API.

TKG functions through the creation of a management Kubernetes cluster that houses the Cluster API. The Cluster API then interacts with the infrastructure provider to service workload Kubernetes cluster lifecycle requests.

Scope

This document describes the steps to deploy a TKG Management and Workload cluster and how to host a sample Kubernetes application in the workload cluster. The document also describes the communication that happens between the TKG components and the NSX Advanced Load Balancer to create Kubernetes objects.

Prerequisites and Considerations

The following sections list the prerequisites and deployment considerations for a TKGm implementation in VMC.

Prerequisites

  • SDDC is deployed in VMC and outbound access to vCenter is configured.
  • Segments for NSX ALB (Mgmt & VIP) are created.
  • NSX ALB Controllers and Service Engines are deployed and controllers’ initial configuration is completed. Please refer to this article to understand how NSX ALB is deployed and configured in VMC on AWS.
  • The bootstrap environment is ready. Please refer to the TKG intro guide for instructions to set up a bootstrapper machine.

General Considerations/Recommendations

  • Deploy the TKG management cluster and workload clusters on separate logical segments.
  • For network isolation, it is recommended to create a new segment for each TKG workload cluster.
  • Ensure that the IP address you will use as the cluster endpoint when deploying the management/workload cluster is excluded from the DHCP range configured on the network.
  • Ensure that the network where the TKG bootstrapper VM is connected can reach the TKG-Management and TKG-Workload networks.
  • Create dedicated folders and resource pools for the TKG management VMs and TKG workload VMs for logical separation.
  • Deploy Service Engine VMs in single-arm mode.
  • Create a Service Engine group per workload cluster and deploy the corresponding Service Engines in that group.
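The DHCP-exclusion consideration above can be checked with a small sketch. The values below are examples only (they mirror the TKG-Management segment used elsewhere in this guide); substitute your own endpoint IP and pool boundaries:

```shell
# Minimal sketch: verify that a planned cluster endpoint IP falls
# outside the DHCP pool configured on the segment. Example values;
# adjust ENDPOINT_IP, POOL_START, and POOL_END for your SDDC.
ENDPOINT_IP=192.168.17.100
POOL_START=101   # first host octet of the DHCP pool
POOL_END=250     # last host octet of the DHCP pool
last_octet=${ENDPOINT_IP##*.}
if [ "$last_octet" -lt "$POOL_START" ] || [ "$last_octet" -gt "$POOL_END" ]; then
  echo "OK: ${ENDPOINT_IP} is outside the DHCP pool"
else
  echo "WARNING: ${ENDPOINT_IP} falls inside the DHCP pool"
fi
```

This only inspects the last octet, which is sufficient for a pool defined within a single /24 segment as in this guide.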

Deployment Requirements

Network requirements

Create two DHCP-enabled logical segments (one for the TKG management cluster and one for the TKG workload cluster) in your SDDC. Make sure that the new subnet CIDRs do not overlap with existing segments.
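For /24 segments like the examples in this guide, a quick way to sanity-check that a new CIDR does not overlap an existing one is to compare the network prefixes (a sketch valid for /24 masks only; other mask lengths need a proper CIDR comparison):

```shell
# Sketch: two /24 subnets overlap only when their first three octets
# are identical. Example values; substitute your own CIDRs.
NEW_CIDR=192.168.18.0/24
EXISTING_CIDR=192.168.17.0/24
new_prefix=${NEW_CIDR%.*}            # -> 192.168.18
existing_prefix=${EXISTING_CIDR%.*}  # -> 192.168.17
if [ "$new_prefix" = "$existing_prefix" ]; then
  echo "overlap: choose a different CIDR"
else
  echo "no overlap between ${NEW_CIDR} and ${EXISTING_CIDR}"
fi
```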

An example is shown below:

Network Type     Segment Name     Type     CIDR              DHCP Pool
TKG Management   TKG-Management   Routed   192.168.17.0/24   192.168.17.101-192.168.17.250
TKG Workload     K8-Backend       Routed   192.168.18.0/24   192.168.18.2-192.168.18.99

Port & Protocols Requirement

To understand the firewall requirements for a successful TKG implementation in VMC, refer to the Firewall Rules for TKG on VMC on AWS article.

Licensing Requirement

TKG deployment requires, at a minimum, an NSX ALB Basic license; an Enterprise license is also supported.

Deploy TKG Management Cluster

You can deploy management clusters in two ways:

  • Run the Tanzu Kubernetes Grid installer, a wizard interface that guides you through the process of deploying a management cluster.
  • Create a deployment YAML configuration file and use it to deploy the management cluster with the Tanzu CLI commands.

The UI installer is the easiest way to deploy the cluster; the following steps describe the process.

To launch the UI installer wizard, run the following command on the bootstrapper machine:

# tanzu management-cluster create --ui --bind <bootstrapper-ip>:8080 --browser none

You can then access the UI wizard by opening a browser and navigating to http://<bootstrapper-ip>:8080/

Note: If you see a “connection refused” error, make sure that you have allowed port 8080 in the firewall that is running on your bootstrapper machine.
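Before troubleshooting the firewall, it can help to confirm the installer is actually listening. A quick check on the bootstrapper (a sketch; the `ss` utility is assumed available, as on most Linux distributions):

```shell
# Sketch: check whether anything is listening on the installer port
# on the bootstrapper machine.
PORT=8080
if ss -ltn 2>/dev/null | grep -q ":${PORT} "; then
  echo "port ${PORT} is listening"
else
  echo "nothing is listening on port ${PORT}; check the installer process and firewall"
fi
```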

From the TKG Installation user interface, you can see that it is possible to install TKG on vSphere (including VMware Cloud on AWS), AWS EC2, and Microsoft Azure.

Figure 1 - TKG installer user interface

Step 1 - To deploy the TKG Management Cluster in VMC on AWS, click the deploy button under "VMware vSphere".

Step 2 - On the "IaaS Provider" page, enter the IP/FQDN and credentials of the vCenter server where the TKG management cluster will be deployed.

 

Figure 2 - IaaS Provider

Step 3 - Click Connect and accept the vCenter server SSL thumbprint.

 

Figure 3 - vCenter SSL Thumbprint

Step 4 - If you are running a vSphere 7.x environment, the TKG installer detects it and offers a choice to deploy either vSphere with Tanzu (TKGS) or the TKG management cluster.

Select the "Deploy TKG Management Cluster" option.

Figure 4 - vSphere Environment Detection

Step 5 - Select the Virtual Datacenter and enter the SSH public key that you generated earlier.

 

Figure 5 - vCenter Server Details

Step 6 - On the Management cluster settings page, select the instance types for the control plane and worker nodes, and provide the following information:

  • Management Cluster Name: Name for your management cluster.
  • Control Plane Endpoint: A free IP from the network created for TKG management. Ensure that the IP address that you provide is not part of the DHCP range configured on the network.

Figure 6 - Management Cluster Settings

Step 7 - On the NSX Advanced Load Balancer page, provide the following:

  • NSX ALB Controller IP address (or the controller cluster IP, if a controller cluster is configured)
  • Controller credentials.
  • Controller certificate.

Step 8 - Click Verify and select the following:

  • Cloud Name
  • SE Group name
  • VIP Network
  • VIP Network CIDR

Optionally provide labels for your deployment.

Figure 7 - NSX ALB Details

Step 9 (optional) - On the Metadata page, you can specify location and labels.

Figure 8 - Metadata Details

Step 10 - On the Resources page, specify the vSphere resources (resource pool, VM folder, and datastore) for the TKG management cluster deployment.

Figure 9 - vSphere Resources Detail

Step 11 - On the Kubernetes Network page, select the Network where the control plane and worker nodes will be placed during management cluster deployment.

Optionally, you can change the Service and Pod CIDRs to custom values.

Figure 10 - Kubernetes Network Settings

If you have LDAP configured in your environment, refer to the VMware Documentation for instructions on how to integrate an identity management system with TKG.

In this example, Identity management integration has been disabled.

Figure 11- Identity Management Details

Step 12 - Select the OS image that will be used for the management cluster deployment.

Note: This list appears empty if no compatible template is present in your environment. After importing the correct template, return to this page and click the refresh button; the installer will detect the image automatically.

Figure 12 - K8 Image Selection

Step 13 - If you have a subscription to Tanzu Mission Control (TMC) and want to register your management cluster with TMC, enter the registration URL here.

In this example, this step is skipped.

Figure 13 - TKG-TMC Integration

Step 14 (optional) – Check the “Participate in the Customer Experience Improvement Program” box if you wish to participate.

Figure 14 - CEIP Agreement

Step 15 - Click the Review Configuration button to verify your configuration settings.

Figure 15 - Configuration Review page

Step 16 - After you have verified the configuration settings, click Deploy Management Cluster to deploy the management cluster. Click EDIT CONFIGURATION to change the deployment parameters.

Note: Deployment of the management cluster can also be triggered from the CLI by using the command that the installer has generated for you.

Figure 16 - Deploy Management Cluster

When the deployment is triggered from the UI, the installer wizard displays the deployment logs on the screen.

Figure 17 - Management Cluster Setup Progress

Deployment of the management cluster takes about 20-30 minutes to complete. Close the installer wizard after the deployment is complete.

The installer automatically sets the kubectl context to the management cluster so that you can log in to it and perform additional tasks, such as verifying the management cluster health and deploying workload clusters.

Figure 18 - Management Cluster Setup Completed

Verify Management Cluster Health

After the management cluster is deployed, run the following commands to verify the health of the cluster:

# tanzu management-cluster get

# kubectl get nodes

# kubectl get pods -A

The following sample screenshots show what a healthy cluster looks like.

Figure 19 - Management Cluster Health Status

Figure 20 - Management Cluster Nodes & Pods

You are now ready to deploy a Tanzu Kubernetes cluster, also known as a workload cluster.

Deploy Tanzu Kubernetes Cluster (Workload Cluster)

The process of creating a Tanzu Kubernetes cluster is similar to creating the management cluster. Run the following commands to create a new workload cluster for your applications.

Step 1: Set the context to the management cluster

# kubectl config use-context <mgmt_cluster_name>-admin@<mgmt_cluster_name>

Step 2: Create a namespace for the workload cluster.

# kubectl create ns wld01

Step 3: Prepare the YAML file for workload cluster deployment

A sample YAML file for a workload cluster deployment is shown below.

workload-cluster.yaml

CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: mj-wld01
NAMESPACE: wld01
CLUSTER_PLAN: prod
ENABLE_CEIP_PARTICIPATION: "false"
OS_NAME: photon
ENABLE_MHC: "true"
IDENTITY_MANAGEMENT_TYPE: none
INFRASTRUCTURE_PROVIDER: vsphere
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: "false"
DEPLOY_TKG_ON_VSPHERE7: true
ENABLE_TKGS_ON_VSPHERE7: false
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.18.110
VSPHERE_CONTROL_PLANE_DISK_GIB: "40"
VSPHERE_CONTROL_PLANE_MEM_MIB: "16384"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "4"
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_NUM_CPUS: "4"
VSPHERE_DATACENTER: /SDDC-Datacenter
VSPHERE_DATASTORE: /SDDC-Datacenter/datastore/WorkloadDatastore
VSPHERE_FOLDER: /SDDC-Datacenter/vm/TKG-Workload-VM's
VSPHERE_NETWORK: K8-Backend
VSPHERE_USERNAME: cloudadmin@vmc.local
VSPHERE_PASSWORD: <Fill-me-in>
VSPHERE_RESOURCE_POOL: /SDDC-Datacenter/host/Cluster-1/Resources/TKG-Workload
VSPHERE_SERVER: <Fill-me-in>
VSPHERE_INSECURE: true
VSPHERE_SSH_AUTHORIZED_KEY: <Fill-me-in>

Adjust the deployment parameters in the YAML to match your infrastructure.
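Before initiating the deployment, a quick sanity check catches any <Fill-me-in> placeholders that were left unedited (a sketch; the file name matches the sample above and is assumed to be in the current directory):

```shell
# Sketch: count unfilled <Fill-me-in> placeholders in the cluster
# config file before deploying. Assumes the file exists locally.
CONFIG_FILE=workload-cluster.yaml
leftover=$(grep -c '<Fill-me-in>' "$CONFIG_FILE" 2>/dev/null || true)
if [ "${leftover:-0}" -gt 0 ]; then
  echo "${leftover} placeholder(s) still need values in ${CONFIG_FILE}"
else
  echo "no placeholders found in ${CONFIG_FILE}"
fi
```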

Step 4: Modify the NSX ALB Service Engine VMs

The Service Engines need layer-2 connectivity to the workload cluster network so that they can reach the applications deployed in the workload cluster. To achieve this, edit the SE VMs and attach the first available free NIC to the logical segment where the TKG workload cluster will be deployed.

Figure 21 - Service Engine NICs

Since DHCP is already enabled on this segment, the SE VMs are assigned IP addresses from the DHCP pool. To verify, log in to the Controller UI, navigate to Infrastructure > Service Engine, edit the Service Engine settings, and locate the MAC address that corresponds to the NIC attached to the workload cluster network.

 

Figure 22 - Verify Service Engine IP Address

Step 5: Initiate TKG Workload Cluster Deployment

Run the following command to create your first workload cluster:

# tanzu cluster create tkg-wld01 --file=workload-cluster.yaml -v 6

During the deployment, log entries similar to the following are displayed on your screen.

Deployment Log

Using namespace from config:
Validating configuration...
Waiting for resource pinniped-info of type *v1.ConfigMap to be up and running
configmaps "pinniped-info" not found, retrying
cluster control plane is still being initialized, retrying
Getting secret for cluster
Waiting for resource tkg-wld01-kubeconfig of type *v1.Secret to be up and running
Waiting for cluster nodes to be available...
Waiting for resource tkg-wld01 of type *v1alpha3.Cluster to be up and running
Waiting for resources type *v1alpha3.MachineDeploymentList to be up and running
Waiting for resources type *v1alpha3.MachineList to be up and running
Waiting for addons installation...
Waiting for resources type *v1alpha3.ClusterResourceSetList to be up and running
Waiting for resource antrea-controller of type *v1.Deployment to be up and running

 
Workload cluster 'tkg-wld01' created

Verify Workload Cluster Health

Once the cluster is deployed, run the following commands to verify the health of the cluster:

# tanzu cluster list

# tanzu cluster get <clustername> -n <namespace>

Figure 23 - Workload Cluster Health Status

Export the Workload Cluster Kubeconfig

The cluster kubeconfig is needed to perform any operations against the workload cluster. Once a workload cluster has been deployed, export its kubeconfig file using the following command:

# tanzu cluster kubeconfig get <workload-cluster-name> -n <workloadcluster-namespace> --admin --export-file <path-to-file>

Switch to the workload cluster context to start using it.

# kubectl config use-context <workload-cluster-name>-admin@<workload-cluster-name> --kubeconfig=<path-to-kubeconfig-file>
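Before switching, you can confirm the exported file actually contains the expected admin context. A sketch, using the example cluster name (tkg-wld01) and kubeconfig path used elsewhere in this guide:

```shell
# Sketch: check that the exported kubeconfig contains the expected
# admin context. File path and cluster name are example values.
KUBECONFIG_FILE=/root/wld01-kubeconfig.yaml
CONTEXT=tkg-wld01-admin@tkg-wld01
if grep -q "name: ${CONTEXT}" "$KUBECONFIG_FILE" 2>/dev/null; then
  echo "context ${CONTEXT} is present"
else
  echo "context not found; re-run the kubeconfig export"
fi
```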

Install Avi Kubernetes Operator (AKO) in Workload Cluster

The Avi Kubernetes Operator (AKO) is a Kubernetes operator that works as an ingress controller and performs Avi-specific functions in a Kubernetes environment with the Avi Controller. It runs as a pod in the cluster, translates the required Kubernetes objects into Avi objects, and automates the implementation of Ingresses, Routes, and Services on the Service Engines (SEs) via the Avi Controller. To learn more about AKO, refer to the official NSX ALB documentation.

Every workload cluster that you deploy should have an AKO pod running in order to leverage NSX ALB for creating virtual services and their VIPs. The AKO pod is not present by default on the workload cluster, even if you passed NSX ALB details in the workload cluster deployment file.

AKO deployment is controlled via an AKO configuration that must be applied manually for every workload cluster you deploy. Multiple workload clusters can share the same AKO configuration, or each can have a dedicated configuration if you need isolation between workload clusters.

To deploy AKO for a workload cluster, follow the steps below:

Step 1: Prepare the YAML for the AKO configuration. A sample is shown below.

deploy-ako.yaml

apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  finalizers:
    - ako-operator.networking.tkg.tanzu.vmware.com
  generation: 2
  name: ako-tkc
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: <cloud-name-in-ALB>
  clusterSelector:
    matchLabels:
      <key>: <value>
  controller: <ALB Controller IP>
  dataNetwork:
    cidr: <VIP Network CIDR>
    name: <VIP Network name as configured in ALB>
  extraConfigs:
    image:
      pullPolicy: IfNotPresent
      repository: projects.registry.vmware.com/ako/ako
      version: 1.3.1
    ingress:
      defaultIngressController: true
      disableIngressClass: true
  serviceEngineGroup: Default-Group

Step 2: Apply the AKO configuration by running the command:

# kubectl create -f deploy-ako.yaml

The above configuration updates the AKO Operator pod running in the management cluster. The AKO Operator watches newly created workload clusters and their labels. If a workload cluster's labels match the labels specified in the AKO config file, an AKO pod is created in that workload cluster automatically.

Step 3: Create Avi namespace and label workload cluster for automated AKO installation.


# kubectl create ns avi-system --kubeconfig=/root/wld01-kubeconfig.yaml

# kubectl label cluster <workload-cluster-name> -n <namespace> <key>=<value>

Example: # kubectl label cluster tkg13-wld01 -n wld01 location=haas-lab

As soon as a matching label is applied to a workload cluster, an AKO pod is created in the avi-system namespace.

# kubectl get all -n avi-system --kubeconfig=/root/wld01-kubeconfig.yaml

NAME        READY   STATUS    RESTARTS   AGE
pod/ako-0   1/1     Running   0          29h

NAME                   READY   AGE
statefulset.apps/ako   1/1     29h

Deploy Sample Application

Now that the AKO pod is deployed for the workload cluster, it’s time to deploy a sample app of type LoadBalancer, verify that it creates objects (VS, VIP, pool, etc.) in NSX ALB, and confirm that you can access the application.

A sample YAML for deploying the 'Yelb' application is shown below.

yelb.yaml

apiVersion: v1
kind: Service
metadata:
  name: redis-server
  labels:
    app: redis-server
    tier: cache
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 6379
  selector:
    app: redis-server
    tier: cache
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-db
  labels:
    app: yelb-db
    tier: backenddb
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 5432
  selector:
    app: yelb-db
    tier: backenddb
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-appserver
  labels:
    app: yelb-appserver
    tier: middletier
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 4567
  selector:
    app: yelb-appserver
    tier: middletier
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-ui
  labels:
    app: yelb-ui
    tier: frontend
  namespace: yelb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: yelb-ui
    tier: frontend
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-ui
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-ui
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-ui
        tier: frontend
    spec:
      containers:
      - name: yelb-ui
        image: docker.io/yelb/yelb-ui:v1
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-server
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: redis-server
  replicas: 1
  template:
    metadata:
      labels:
        app: redis-server
        tier: cache
    spec:
      containers:
      - name: redis-server
        image: docker.io/yelb/yelb-redis:v1
        ports:
        - containerPort: 6379
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-db
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-db
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-db
        tier: backenddb
    spec:
      containers:
      - name: yelb-db
        image: docker.io/yelb/yelb-db:v1
        ports:
        - containerPort: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-appserver
  namespace: yelb
spec:
  selector:
    matchLabels:
      app: yelb-appserver
  replicas: 1
  template:
    metadata:
      labels:
        app: yelb-appserver
        tier: middletier
    spec:
      containers:
      - name: yelb-appserver
        image: docker.io/yelb/yelb-appserver:v1
        ports:
        - containerPort: 4567

To deploy the application, first create the yelb namespace (the manifests above reference it), then apply the file:

# kubectl create ns yelb

# kubectl create -f yelb.yaml

Listing the pods in the yelb namespace returns the status of pods that have been created as part of the yelb application deployment.

Yelb Pod List

# kubectl get pods -n yelb --kubeconfig=/root/wld01-kubeconfig.yaml
NAME                              READY   STATUS    RESTARTS   AGE
redis-server-576b9667ff-52btx     1/1     Running   0          8h
yelb-appserver-7f784ccd64-vtxlx   1/1     Running   0          8h
yelb-db-7cdddcff5-km67v           1/1     Running   0          8h
yelb-ui-f6b557d47-v772q           1/1     Running   0          8h
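All four application pods should reach the Running state. The check can be scripted as a small sketch; the listing above is inlined here for illustration, and in practice you would pipe the live `kubectl get pods -n yelb` output instead:

```shell
# Sketch: count Running pods in a `kubectl get pods -n yelb` listing.
pod_listing='redis-server-576b9667ff-52btx     1/1     Running   0          8h
yelb-appserver-7f784ccd64-vtxlx   1/1     Running   0          8h
yelb-db-7cdddcff5-km67v           1/1     Running   0          8h
yelb-ui-f6b557d47-v772q           1/1     Running   0          8h'
running=$(printf '%s\n' "$pod_listing" | grep -c ' Running ')
echo "running pods: ${running}"   # expect 4 for the yelb application
```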

 

Verify NSX ALB Objects

Log in to NSX ALB and verify that the following objects have been created for the yelb application:

  • Virtual Service
  • VIP
  • Server Pool
Browse to the VIP created for the yelb application and verify that the Yelb dashboard is displayed.

 

Author and Contributors

Manish Jha has authored this article.

 

 

 

 
