VMware Cloud on AWS: Network Security
Understanding the network security model of an SDDC is a critical part of designing and managing a VMware Cloud on AWS solution. While this model isn’t especially complicated, it includes a few concepts which may be unfamiliar to administrators who are accustomed to managing only traditional hardware firewalls.
The following sections are designed to provide the fundamental knowledge required to successfully design and deploy a network security policy for an SDDC.
The Network Security Model
Network security within an SDDC is enforced by 2 types of firewalls: the NSX gateway firewall and the NSX distributed firewall (DFW).
These firewalls are designed to address network security policy enforcement in 2 different ways, with the gateway firewalls enforcing policy at the network border and DFW enforcing policy within the compute network of the SDDC. Keep the following points in mind with the NSX firewalls:
- As with most firewalls, NSX firewall rules are evaluated top-to-bottom with the first matching rule being applied to a new connection.
- Firewall policy is applied bidirectionally. In other words, new network connections will be evaluated against the ruleset both ingress to the SDDC and egress from the SDDC.
- The firewalls are stateful. This means that if a request is permitted through the firewall policy, then the response is automatically permitted through.
- NSX uses group and service definitions as part of firewall rule creation.
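The points above can be sketched as a toy evaluation model. This is an illustrative simplification, not NSX’s actual implementation; the rule fields and helper names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    sources: set          # source addresses, or {"ANY"}
    destinations: set     # destination addresses, or {"ANY"}
    service: str          # e.g. "TCP/443", or "ANY"
    action: str           # "ALLOW" or "DROP"

def matches(rule, src, dst, service):
    return ((src in rule.sources or "ANY" in rule.sources)
            and (dst in rule.destinations or "ANY" in rule.destinations)
            and rule.service in (service, "ANY"))

def evaluate(ruleset, src, dst, service, default="DROP"):
    """Rules are evaluated top-to-bottom; the first matching rule wins."""
    for rule in ruleset:
        if matches(rule, src, dst, service):
            return rule.action
    return default  # the gateway firewalls are "default deny"

rules = [
    Rule("allow-web-in", {"ANY"}, {"10.0.1.10"}, "TCP/443", "ALLOW"),
    Rule("allow-dns-out", {"10.0.1.10"}, {"ANY"}, "UDP/53", "ALLOW"),
]

# The same ruleset is consulted for new connections in either direction;
# because the firewall is stateful, the response to a permitted connection
# needs no additional rule.
print(evaluate(rules, "198.51.100.7", "10.0.1.10", "TCP/443"))  # ALLOW
print(evaluate(rules, "10.0.1.10", "198.51.100.7", "TCP/22"))   # DROP
```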
Network security within an SDDC is configured from the Network & Security tab of the SDDC view from within the VMC console. The following sections will provide more details on the concepts used within each of the firewalls.
A service definition may be thought of as a collection of 1 or more protocols (IP, ICMP, UDP, TCP, etc.) along with their associated ports. Although many of the standard service definitions have been pre-created within the SDDC, it is sometimes necessary or convenient to create custom definitions.
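As a sketch, a custom service definition covering both TCP ports of a hypothetical two-port application might look like the following. The field names follow the general shape of NSX-T Policy API Service objects, but should be verified against the API reference for your SDDC version; the names and ports are illustrative.

```python
import json

# A custom service definition for a hypothetical application that listens
# on two TCP ports. One service encapsulates the whole function.
custom_service = {
    "display_name": "my-app-frontend",
    "service_entries": [
        {
            "resource_type": "L4PortSetServiceEntry",
            "display_name": "my-app-tcp",
            "l4_protocol": "TCP",
            "destination_ports": ["8080", "8443"],
        }
    ],
}

print(json.dumps(custom_service, indent=2))
```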
Groups represent 2 classes of resources:
- VMs within a given network of the SDDC
- IP addresses that are external to a given network within the SDDC
Unlike services, which are defined globally within the SDDC, groups are scoped separately to the management and compute networks.
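The two classes of group can be sketched as follows. The shapes loosely follow NSX-T Policy API Group objects (an IP-set expression for external resources, a tag condition for native VMs); field names should be checked against the current API reference, and the group names and addresses here are illustrative.

```python
import json

# An IP-based group for resources external to a given SDDC network:
external_group = {
    "display_name": "on-prem-admins",
    "expression": [
        {"resource_type": "IPAddressExpression",
         "ip_addresses": ["192.168.10.0/24"]},
    ],
}

# A tag-based group for VMs native to the compute network:
tagged_group = {
    "display_name": "web-tier",
    "expression": [
        {"resource_type": "Condition", "member_type": "VirtualMachine",
         "key": "Tag", "operator": "EQUALS", "value": "web"},
    ],
}

print(json.dumps([external_group, tagged_group], indent=2))
```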
The Gateway Firewalls
Since the gateway firewalls are enforced on the NSX edge routers, they may be thought of as behaving like traditional centralized firewalls. Within the SDDC, they are responsible for enforcing network security policy at the border of their respective networks. It is important to understand the following points regarding the gateway firewalls:
- The default security policy of the gateway firewalls is “default deny”.
- There are 2 points of enforcement for the gateway firewalls: on the tier-0 edge router and on the MGW.
Beginning with the first point, the “default deny” policy requires the security administrator to explicitly define traffic which should be permitted through the firewalls. Keeping in mind that the firewall rule set applies bidirectionally, the security administrator must define permitted traffic both ingress and egress. Since the firewalls are stateful, responses for permitted traffic will always be permitted.
Regarding the second point, it is important to understand how and why the gateway firewalls are applied as they are.
The entry point for the entire SDDC is the tier-0 edge router. This is the first point of enforcement for the gateway firewall, and as a result this edge device has the potential to protect all networks within the SDDC. However, a design decision was made to exclude the management network of the SDDC from the gateway firewall of the tier-0 edge. This means that the gateway firewall of the tier-0 edge only protects the compute networks of the SDDC. This design decision makes sense when you consider that the MGW router (which borders the management network) has its own gateway firewall. Since the MGW is already protecting the management network, protection of the management network at the tier-0 edge would be redundant and confusing, and was thus removed.
This raises the question of why the gateway firewall is enabled on the MGW at all. Since it is desirable to protect the management network not only from the external world but also from the compute network, enforcement of security policy at the border of the management network is necessary. By enabling the gateway firewall on the MGW, the security administrator has the ability to protect the management network from both external and internal (compute) networks.
Although it is good to understand the how and why behind the design of the gateway firewalls, these details aren’t specifically necessary in order to effectively manage security policy. From the perspective of the end-user, the VMC console abstracts these details away and presents a simplified view which displays a pair of rulesets: one labeled Management Gateway and another labeled Compute Gateway. This UI layout is designed to focus the security administrator on the policies themselves rather than drawing attention to where and how they are enforced.
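As an illustration of what a Management Gateway rule might contain, the sketch below permits HTTPS from an administrative group to vCenter. The shape loosely follows NSX-T Policy API gateway rules; the paths, group identifiers, and service reference are all illustrative assumptions, not a definitive VMC payload.

```python
# A hypothetical Management Gateway rule: allow HTTPS from an on-premises
# admin group to vCenter. All identifiers below are illustrative.
mgw_rule = {
    "display_name": "admins-to-vcenter",
    "source_groups": ["/infra/domains/mgw/groups/on-prem-admins"],
    "destination_groups": ["/infra/domains/mgw/groups/VCENTER"],
    "services": ["/infra/services/HTTPS"],
    "action": "ALLOW",
}

print(mgw_rule["display_name"], "->", mgw_rule["action"])
```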
The Distributed Firewall
The distributed firewall (DFW) represents a break from traditional, centralized firewalls in that it is not enforced in a central location (as with the gateway firewalls) but is enforced at the vNIC of each VM in the compute network. By enforcing security policy at the absolute edge of the network, it becomes possible to manage network security policy in ways which are difficult to replicate in a traditional data center network.
The distributed firewall is available from the Network & Security tab of the SDDC view within the VMC console. Keep the following points in mind regarding DFW:
- The ruleset is organized into 4 pre-created sections.
- DFW defaults to Blacklist behavior, meaning it is effectively disabled by default.
- DFW is applied only to the compute network.
- DFW may filter network traffic both north-south and east-west.
The ruleset for DFW is organized around the concept of sections. As highlighted in the first point above, there are 4 pre-created sections for DFW. These are conveniences that have been added to the UI in order to direct the security administrator toward the good practice of organizing the rules of the security policy. The key thing to remember is that sections are an organizational tool only. Rules within NSX firewalls are evaluated top-to-bottom independently of sections. This means that rules in the Emergency Rules section will be evaluated before rules in the sections below it (and so on). Keep this in mind particularly when creating “deny” or “reject” rules.
The second point above calls out the default security policy of DFW. There are 2 possible behaviors:
- Blacklist (default option) – This option applies a “default permit” policy to the firewall. This means that traffic is permitted unless specifically blocked (or blacklisted) by a “deny” rule.
- Whitelist – This option applies a “default deny” policy to the firewall. This means that all traffic is denied unless specifically allowed (or whitelisted) by a “permit” rule.
In order to prevent confusion with the dual layers of gateway and distributed firewalls, the default security policy of DFW is set such that it is effectively disabled. To implement DFW, the security administrator must specifically construct “drop” or “reject” rules.
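The interaction between sections and the default action can be illustrated with a toy evaluator. Sections are flattened in display order and the first matching rule wins, with the default action modeling Blacklist (“ALLOW”) versus Whitelist (“DROP”) behavior. All names and fields here are illustrative, not the NSX implementation.

```python
def evaluate_dfw(sections, packet, default_action="ALLOW"):
    """sections: ordered list of (section_name, [(match_fn, action), ...])."""
    for _name, rules in sections:        # sections in display order
        for match, action in rules:      # rules top-to-bottom within a section
            if match(packet):
                return action            # first match wins, regardless of section
    return default_action                # Blacklist default: permit

sections = [
    ("Emergency Rules",
     [(lambda p: p["src"] == "10.0.9.9", "DROP")]),
    ("Application Rules",
     [(lambda p: p["dst_port"] == 3306 and p["src_tier"] != "app", "DROP")]),
]

# An Emergency rule blocks the quarantined source before any later section:
print(evaluate_dfw(sections, {"src": "10.0.9.9", "dst_port": 443, "src_tier": "web"}))   # DROP
# Unmatched traffic falls through to the Blacklist default and is permitted:
print(evaluate_dfw(sections, {"src": "10.0.1.5", "dst_port": 3306, "src_tier": "app"}))  # ALLOW
```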
The purpose of DFW is to enable the security administrator to construct security policy that may be applied within the compute network itself. As an example, a typical use of DFW is to provide security between the tiers of a multi-tiered application, or to provide security between separate tenants within the SDDC. DFW is unique in that it is applied at the absolute edge of the network. This feature effectively decouples network security from network architecture.
To illustrate this point, imagine a traditional data center network. Typically, if security was needed between tenants or between tiers of an application, then network architecture would be designed to reflect this separation (i.e. a VLAN per tenant or application tier). In this model, if security requirements changed then the network infrastructure would need to be altered and workloads migrated and re-IPed. With DFW, security policy is agnostic of network architecture and may be enforced regardless of VM placement within the SDDC. Due to this decoupling, it is possible to provide security between tenants or application tiers even when the workloads reside within the same subnet. If security policy changes, then workload migrations and IP changes are not necessarily required.
The Cross-Linked VPC
Since network security between the SDDC and the cross-linked customer VPC is managed in multiple places, it is worth specifically calling it out as a stand-alone topic. Specifically, the security administrator must consider all of the points where security policy may be enforced for traffic between the SDDC and that VPC:
- DFW - Security policy defined by DFW would be enforced at the vNIC level of all VMs within the compute network.
- Gateway Firewall - Management Gateway policies would affect connectivity to/from the management network, and Compute Gateway policies would affect connectivity to/from the compute network.
- AWS Security Groups - The security groups of the VPC itself will impact connectivity to/from the VPC.
The policies of the gateway firewalls and DFW have already been discussed, so this section will focus on Security Groups within the cross-linked VPC itself. As discussed in the SDDC network architecture document, the cross-linking to the customer-owned VPC is enabled via a series of cross-account ENIs that are created within a subnet of that VPC and are for use by the hosts of the SDDC. As part of this setup, these ENIs are configured to utilize the default Security Group of the VPC. It is important to keep the following points in mind with Security Groups within the cross-linked VPC:
- Direction - As mentioned, the cross-account ENIs used by the SDDC have the default Security Group applied to them. It is important to visualize this setup and remember that “inbound” rules of the Security Group apply to traffic from the VPC toward the SDDC, and that “outbound” rules apply to traffic from the SDDC toward the VPC.
- Default Rules - By default, Security Groups are configured to permit all traffic outbound. Per the previous callout, this means that the SDDC can initiate connections to workloads within the VPC. Inbound, there is also a default rule which permits members of the same Security Group to communicate with each other. This means that any workloads which have the default Security Group applied to them may initiate connections to the SDDC.
- Non-Default Security Groups - Oftentimes, other services within the VPC utilize custom Security Groups. In these cases, the security administrator may need to modify both the custom Security Groups and the default Security Group in order to ensure that the required connectivity is permitted.
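Because the default Security Group sits on the cross-account ENIs, rule direction is relative to the SDDC side of the link, which is easy to get backwards. A small helper makes the mapping explicit; this is a mnemonic sketch, not an AWS API.

```python
def sg_rule_direction(initiator):
    """Which Security Group rule set governs a new connection across the ENIs."""
    if initiator == "vpc":   # VPC workload initiates a connection toward the SDDC
        return "inbound"
    if initiator == "sddc":  # SDDC workload initiates a connection toward the VPC
        return "outbound"
    raise ValueError(f"unknown initiator: {initiator}")

print(sg_rule_direction("vpc"))   # inbound
print(sg_rule_direction("sddc"))  # outbound
```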
Network Address Translation
Network Address Translation (NAT) services are provided to the SDDC by the tier-0 edge. For outbound requests, by default, the workloads of the compute network will utilize a dedicated NAT IP which exists on the internet uplink of the tier-0 edge. This IP is visible from the Overview section of the Network & Security tab of the SDDC view in the VMC console.
Recommendations
The following are some basic recommendations for working with network security within an SDDC.
There are a great many pre-created service definitions within the SDDC. However, it sometimes makes sense to create custom definitions for custom applications. Consider creating a single service definition which encapsulates a given function of a custom application. For example, if an application utilizes a pair of TCP ports then define both ports as part of the service definition.
There are a few different options available for group definitions (with additional new ones on the roadmap) within the SDDC. When creating groups, keep the following in mind:
- Anything which is external to a given network within the SDDC (management or compute) may only be referenced by IP. Utilize summary addresses as much as possible when defining IP-based groups.
- Anything native to a given network within the SDDC may be referenced by higher-level constructs such as VM name or security tag. Utilize these constructs as much as possible. Doing so will make your security policies more resilient to network changes within the SDDC.
- Security tags provide an excellent tool for defining security policy. Put some serious thought into standardizing your tagging scheme. A common approach is to assume a “Lego brick” model for tags: small, atomic tags which may be combined to effectively classify a workload. However, keep the maximum tags-per-VM limit in mind when designing your scheme.
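The “Lego brick” tagging model described above can be sketched as follows: small atomic tags are attached to a VM, and group membership is derived from combinations of them. The tag names and group definitions here are illustrative assumptions.

```python
def groups_for(vm_tags, group_defs):
    """Return the groups whose required tags are all present on the VM."""
    return [name for name, required in group_defs.items()
            if required <= vm_tags]  # subset test: every required tag is attached

# Atomic tags combined into group definitions (illustrative scheme):
group_defs = {
    "prod-web":  {"env:prod", "tier:web"},
    "prod-db":   {"env:prod", "tier:db"},
    "pci-scope": {"compliance:pci"},
}

# One workload classified by three small tags:
vm_tags = {"env:prod", "tier:web", "compliance:pci"}
print(sorted(groups_for(vm_tags, group_defs)))  # ['pci-scope', 'prod-web']
```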
DFW utilizes sections as a means of organizing firewall rules. Consider organizing sections around your specific business requirements. For example, create a section per application or per business unit. Always remember that sections are only a means of organizing rules; rules are evaluated top-to-bottom independently of their parent section.
Porting Existing Rulesets
If your organization is like most, then you have accumulated several years’ worth of “cruft” in your existing security policies. This tends to happen with IP-based rulesets. The rules quickly become complex, and security administrators are afraid to remove anything for fear of breaking something.
Avoid the temptation to port existing rulesets to the SDDC. You have a unique opportunity to rework your policies based on higher-level grouping constructs which will very likely simplify your security policy. Don’t miss this opportunity.
Authors and Contributors
Author: Dustin Spinhirne