Well-Architected Design: Approaches to Distributed Firewalls

Introduction

When planning and deploying a VMware Cloud solution leveraging the built-in security capabilities, such as the Distributed Firewall (DFW), there are many considerations to keep in mind. While the DFW provides an extensive set of capabilities to implement zero trust and granular micro-segmentation, its adoption, especially for an existing or migrated environment, can prove challenging.

A new security approach, and the specific features that support it, should be integrated into an organization in phases. To avoid a complete overhaul, it is imperative to make sure the existing security policies are maintained before new capabilities are adopted.

Infrastructure Overview

The first step required to properly implement the distributed firewall is to understand where it functions in the inspection flow:

Figure 1 – Firewall Locations

As shown in Figure 1 – Firewall Locations, the Gateway Firewall inspects all North-South traffic going in and out of the T1 Gateway uplink interface. The distributed firewall inspects all East-West traffic for the segments connected to the T1 Gateway.

North-South traffic can be outbound internet traffic or traffic coming from other T1 gateways in the environment. East-West traffic is any traffic within a single segment or between different segments connected to the same T1 Gateway.

As depicted in Figure 2 - East West Traffic Flow, traffic between two VMs is not inspected by the gateway firewall, even if the VMs are in different segments. To secure and isolate the connection between the two segments, the DFW should be used.

Figure 2 - East West Traffic Flow
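The inspection split described above can be sketched as a small lookup. This is an illustrative Python model only, not an NSX API: the mapping `SEGMENT_TO_T1`, the function `gateway_firewall_inspects`, and the segment and gateway names are all hypothetical.

```python
from typing import Optional

# Hypothetical topology: the T1 gateway each segment is attached to.
SEGMENT_TO_T1 = {
    "web-seg": "t1-prod",
    "app-seg": "t1-prod",
    "dev-seg": "t1-dev",
}


def gateway_firewall_inspects(src_segment: str, dst_segment: Optional[str]) -> bool:
    """Return True if the flow crosses a T1 uplink (North-South).

    dst_segment=None stands for a destination outside the SDDC,
    such as the internet.
    """
    if dst_segment is None:
        return True  # internet-bound traffic leaves through the T1 uplink
    # Segments on different T1 gateways: the flow crosses an uplink.
    return SEGMENT_TO_T1[src_segment] != SEGMENT_TO_T1[dst_segment]


# Same segment, or different segments on the same T1: East-West only.
print(gateway_firewall_inspects("web-seg", "web-seg"))  # False
print(gateway_firewall_inspects("web-seg", "app-seg"))  # False
# Different T1 gateways or an external destination: North-South.
print(gateway_firewall_inspects("web-seg", "dev-seg"))  # True
print(gateway_firewall_inspects("web-seg", None))       # True
```

Any flow for which this sketch returns False never reaches the gateway firewall, so it can only be controlled with DFW rules.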

Understanding the role of each firewall can help map the requirements of specific organizational policies to the appropriate firewall. In most cases, rules created at the gateway firewall level should be valid for all the workloads and segments connected to it. Rules created at the distributed firewall can be more granular and only apply to a certain group of segments or workloads.

Operations Overview

Once the role of the distributed firewall in the SDDC security stack is understood, the next step is to decide how to operationalize it. Planning how the different security controls will be implemented is essential to a successful and optimized adoption of the distributed firewall.

Identifying Objects and Workloads

To make sure security policies are applied to the correct workloads, it is important to identify the individual objects connected to the NSX infrastructure, such as VMs, Groups, and Segments. In an SDDC environment, NSX Tags are used for this identification.

An NSX tag consists of a Tag and a Scope. Each is fully customizable and can be aligned to your organization's structure. For example, a segment can be tagged with “Scope=Environment, Tag=Prod”. An object can have multiple tags, which makes it easy to add further identifiers. The segment from the previous example can also carry a tag of “Scope=BU, Tag=Marketing” to indicate that it contains production workloads used by the marketing department.
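As a minimal sketch of the Scope and Tag pairing, the segment from the example can be modeled as follows. The classes `Tag` and `Segment` and the segment name are hypothetical and exist only for illustration; they are not NSX API objects.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Tag:
    """A Scope/Tag pair; frozen so it is hashable and can live in a set."""
    scope: str
    tag: str


@dataclass
class Segment:
    name: str
    tags: Set[Tag] = field(default_factory=set)

    def has_tag(self, scope: str, tag: str) -> bool:
        return Tag(scope, tag) in self.tags


# One segment carrying two independent identifiers.
seg = Segment("marketing-prod-seg")
seg.tags.add(Tag("Environment", "Prod"))
seg.tags.add(Tag("BU", "Marketing"))

print(seg.has_tag("Environment", "Prod"))  # True
print(seg.has_tag("BU", "Sales"))          # False
```

Treating the scope as part of the identity keeps a “Prod” tag under “Environment” distinct from a “Prod” tag under any other scope.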

In any environment, the tagging strategy should align with the requirements of the implementation. If the current phase is to define DFW rules based on segments, spending time and resources identifying and tagging all the VMs might not be valuable.

Grouping Objects and Workloads

Each DFW rule has Source and Destination fields. These fields can contain IP addresses (ranges, CIDRs, or individual IPs), groups, or a combination of both. Keep in mind that IP addresses are difficult to manage at scale. NSX Groups provide a simpler and more scalable method for creating and maintaining rules.

Each group can contain a combination of objects such as VMs, segments, IP addresses, and even other groups. For example, a group called “Manufacturing-Prod-App” can contain all the VMs that are part of that specific application. DFW rules that reference that group are applied to each individual member VM. If the application scales, the new VM simply needs to be added to the group and the same rules will apply. That group can then be nested within “Tier1-Prod-Apps” along with other application groups; the parent group carries its own set of less specific rules, which are propagated to the nested groups accordingly.
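The nesting behavior can be illustrated with a short sketch. The `Group` class and the VM names below are hypothetical, and the model assumes groups are never nested in a cycle; it only shows how a rule set on a parent group reaches every VM in its nested groups.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Group:
    name: str
    vms: Set[str] = field(default_factory=set)
    children: List["Group"] = field(default_factory=list)

    def effective_vms(self) -> Set[str]:
        """Direct members plus the members of every nested group."""
        members = set(self.vms)
        for child in self.children:
            members |= child.effective_vms()
        return members


app = Group("Manufacturing-Prod-App", vms={"mfg-web-01", "mfg-db-01"})
tier = Group("Tier1-Prod-Apps", children=[app])

# The application scales: adding the VM to the application group is enough
# for the parent group to pick it up as well.
app.vms.add("mfg-web-02")

print(sorted(tier.effective_vms()))
# ['mfg-db-01', 'mfg-web-01', 'mfg-web-02']
```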

Group membership can be assigned statically, dynamically, or through a combination of both. Static membership requires an object to be added by an administrator or by external automation. Dynamic membership is configured using a set of conditions based on object attributes, such as a name or tag. For example, a group can contain all the segments with the Tag “Prod” and Scope “Environment”.
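A hedged sketch of how static and dynamic membership combine is shown here. The `Segment` class, the `resolve_members` function, and the segment names are hypothetical stand-ins, not NSX API calls; the condition is the one from the example, Tag “Prod” in Scope “Environment”.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Segment:
    name: str
    tags: FrozenSet[Tuple[str, str]]  # (scope, tag) pairs


def resolve_members(
    segments: Iterable[Segment],
    scope: str,
    tag: str,
    static: Iterable[str] = (),
) -> Set[str]:
    """Union of tag-matched (dynamic) and explicitly listed (static) members."""
    dynamic = {s.name for s in segments if (scope, tag) in s.tags}
    return dynamic | set(static)


inventory = [
    Segment("web-prod", frozenset({("Environment", "Prod")})),
    Segment("db-prod", frozenset({("Environment", "Prod"), ("BU", "Marketing")})),
    Segment("web-dev", frozenset({("Environment", "Dev")})),
]

# Dynamic condition plus one statically added segment.
members = resolve_members(inventory, "Environment", "Prod", static=["legacy-seg"])
print(sorted(members))  # ['db-prod', 'legacy-seg', 'web-prod']
```

With the dynamic condition in place, tagging a new segment is all that is needed for it to join the group; no change to the group itself is required.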

As stated previously, object grouping should also align with the current implementation phase. Start by creating groups that contain the objects you are currently planning to segment. In addition, leverage group nesting to simplify rule creation in the future. Start with the larger groups, such as “Dev” and “Prod”, and then carve them into smaller groups based on similar characteristics if a more granular approach is required.

Implementing Rules in the Correct Scope

Because the distributed firewall is a software-defined firewall running across all the configured hosts in the SDDC, its rules are propagated to each host and vNIC. As more rules are added, due to growth or additional layers of security, the number of propagated rules can produce a large rule table, which can impact performance. In addition, a misconfigured rule can propagate to the entire environment and accidentally open or close access to workloads.

A simple way to avoid these issues is to leverage the Applied-To feature within a rule. This feature defines the scope to which the rule is published. The options are either the entire DFW or one or more groups. Selecting a group ensures that the rule is propagated only to that group, which both optimizes rule table distribution and limits the scope the rule can affect.
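The effect of Applied-To on rule distribution can be sketched as follows. This is a simplified model under stated assumptions, not the NSX implementation: the `Rule` class, the `rules_published_to` function, and the rule, group, and VM names are hypothetical, and `applied_to=None` stands in for publishing to the entire DFW.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Rule:
    name: str
    applied_to: Optional[FrozenSet[str]] = None  # None = entire DFW


def rules_published_to(
    vm: str, rules: List[Rule], group_members: Dict[str, Set[str]]
) -> List[str]:
    """Names of the rules that land in the rule table for this VM."""
    published = []
    for rule in rules:
        if rule.applied_to is None:
            published.append(rule.name)  # DFW-wide rule reaches every VM
        elif any(vm in group_members.get(g, set()) for g in rule.applied_to):
            published.append(rule.name)  # VM belongs to a scoped group
    return published


groups = {"Prod": {"prod-web-01"}, "Test": {"test-web-01"}}
rules = [
    Rule("allow-dns"),                                  # entire DFW
    Rule("prod-web-to-db", frozenset({"Prod"})),
    Rule("test-allow-debug", frozenset({"Test"})),
]

print(rules_published_to("prod-web-01", rules, groups))
# ['allow-dns', 'prod-web-to-db']
print(rules_published_to("test-web-01", rules, groups))
# ['allow-dns', 'test-allow-debug']
```

In this model a change to `test-allow-debug` never reaches the production VM, and each VM holds two rules instead of three.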

Configuring the Applied-To functionality can simulate a zone-based firewall in the network. For example, rules for environments such as test and production can be changed without one impacting the other. Also, the security rules of one application or small set of VMs can be tightened as a proof-of-concept or early adoption of a zero-trust model while still maintaining the same security model across the rest of the network, allowing for gradual adoption.

Another use for this feature is DMZ Anywhere. Creating a DMZ ruleset that is propagated only to the DMZ workloads allows DMZ and internal workloads to be deployed on the same compute resources without the risk of exposing the internal workloads.

Figure 3 - Applied-To Rules

Applied-To should be used whenever possible to avoid future rework. Propagate rules to the entire DFW only in an emergency, while troubleshooting, or for an all-encompassing rule that must be placed on the entire network as part of a security policy.