Thursday, July 11, 2019

Palo Alto Firewall VM Series - Integration and Evaluation - Part IV

Okay this is probably a final post on the Palo Alto Firewall VM Series Integration project.

This project was aimed at doing a proof of concept reference implementation of a fully integrated and provisioned SD-WAN, using a Palo-Alto Virtual Firewall to protect the edges of two campus networks (e.g. East and West).

It uses a Service Orchestration software platform (Service Orchestrator) to model and specify services (networks) that the Orchestrator will deploy, out to the field, to the CPE, at the push of a button.

What are we deploying?

We are deploying overlay networks (SD-WAN) to Customer Premise Edge (CPE) devices. These devices are typically deployed smack on the edge of a premise network, and for that reason have optical SFP termination ports, and Gigabit (or 10 Gig in some cases) ports.

Now to make these overlay networks work in these modern times, we rely on Virtualization. Virtual Machines, Containers (i.e. Docker), OpenStack, et al. (depending on the CPE and software chosen).

Network Function Virtualization (NFV) - meaning non-iron-based network functions like Routing, Switching, Network Stacks in Namespaces, and Overlay networks (i.e. secure overlay) - ties all of this together. NFV is a fairly recent technology and has made networking (as if it weren't complicated enough already) a bit MORE complicated. But virtualizing applications and processes without also virtualizing the networking the rely on to intercommunicate doesn't achieve the objectives that everyone is after by using virtualization in the first place. It is like going halfway and stopping.

So - we have been deploying this new VM-Series Palo Alto Firewall in a network "topology" in an automated fashion, using the Palo Alto API to provision all of the "things that need to be provisioned", a few of which are mentioned below.

  • Interfaces
  • Zones
  • Security Policies
  • Static Routes
  • Policy-Based Forwarding Rules

This is done, by using the Palo-Alto XML API.

So to do this, at least in this proof of concept, we provision:
  1. A Palo-Alto Firewall
  2. A virtual machine (temporary) that provisions the Palo-Alto Firewall
  3. A SD-WAN Gateway behind the Palo-Alto Firewall
  4. A client or server behind the Gateway.
Topology Provisioned on a CPE device - Site West
screenshot: ADVA Ensemble Orchestrator

Sounds easy, right? It's not. Here are just a few of the challenges you face:
  • Timing and Order
    • The Firewall needs to be deployed first, naturally. Otherwise you cannot provision it.
    • The Firewall Provisioner configures all of the interfaces, routes, zones, policies and rules. This is done on a management network - separate from the service plane, or traffic plane as it is often referred to.
    • With a properly configured firewall in place, the Gateway can make outbound calls to the Cloud.
    • Once the Gateway is up and cloud-connected, the server can begin listening for and responding to requests.
  • Routing
    • Dynamic routing stacks are difficult to understand, set up and provision; especially BGP. Trying to do all of this "auto-magically" at the push of a button is even more challenging.
    • In our case, we used static routes, and it is important to know how these routes need to be established. 
  • Rules
    • Indeed, it is sometimes difficult to understand why traffic is not flowing properly, or at all. Often is due to routing. It can also be due to not having proper rules configured.
  • Programmatic APIs
    • Making API calls is software development. So the code has to be written. It has to be tested. And re-tested. And re-tested.
When you deploy a network push-button, the visual ease at which it all happens starts to make people take it for granted, and NOT fully appreciate all of the complexity and under-the-hood inner workings that make it all come to light.

Most of the network troubleshooting (or all of it perhaps) had to do with missing or incorrect routes, or missing rules. It can be a chicken or egg problem trying to figure out which one is a culprit.

In this particular CPE, it runs:
  • an OS (Linux)
  • two docker containers - that contain OpenStack nodes (Controller node and Compute node)
  • a FastPath Data Switch - that map physical ports to virtual ports 

The "magic" is in the automation and integration. The data switch is pre-provisioned and custom developed for performance. The integration with OpenStack allows virtual ports and virtual switch services to be created / removed as new Virtualized Network Functions (VNFs) are initialized or terminated on demand.

Then, on top of this, you have the VNFs themselves; in this case here, the Palo Alto Firewall, the Edge Gateway behind this firewall, and a Server on a trusted network behind the gateway.

Traffic from the server will run through the firewall for inspection before it gets to the Gateway so that the firewall can inspect that traffic before the gateway encrypts it for outbound transport.

Then, the Gateway will encrypt and send the traffic back through the firewall, and out the door, on a virtual circuit (or multiple virtual circuits if multi-pathing is used), to the complementary gateway on the other side (think East to West across two corporate campus networks here).


No comments:

Fixing Clustering and Disk Issues on an N+1 Morpheus CMP Cluster

I had performed an upgrade on Morpheus which I thought was fairly successful. I had some issues doing this upgrade on CentOS 7 because it wa...