AWS VM Series GWLB with Overlay routing - outbound and inbound

patoil · ‎05-07-2024

Hi,

I’m using 2 active VM series firewalls for outbound sessions with overlay routing, with GWLB and Transit Gateway (TGW) between the Application VPC and the Security (firewall) VPC. This is working as expected.
However, inbound connections fail to establish.

The need for overlay routing is for managing NAT and VPNs on the firewall.

An external (3rd party SaaS) load balancer is used for inbound sessions.

Just one GWLBe is being used inside the Security VPC, for GENEVE encapsulation into the firewall.
I can verify that the inbound session attempt is mapped to the trusted application subinterface. This is the first (SYN) packet.
I also have pcaps on the application server showing the SYN-ACK (reply) going towards the firewall’s trusted IP (the FW is using SNAT for inbound sessions).
The firewall receives the SYN-ACK and drops it with a “no matching session” reason.

Did anyone have luck with this setup?

Does Overlay routing support inbound and outbound sessions?

fbee-pan · ‎05-07-2024

AFAIK, Overlay Routing is not needed for inbound inspection.

You just have to place another GWLBe after the inbound LB (no overlay routing needed).

Overlay Routing is there to save the cost of a NAT GW (outbound).

patoil · ‎05-08-2024

Thanks for your reply.
I understand that’s a good benefit, and I also need to have a public IP on each firewall for inbound NAT and VPNs (L2L and GP). This would save the cost and management of a couple of additional firewalls.
In addition, my security team would like to have inter-zone instead of intra-zone policies, which is why we are testing this scenario.
Does anyone know if it’s possible?

glynn · ‎05-08-2024

If I understand what you are attempting to accomplish, it should be doable although I have not tested it. From what you describe, the desired traffic flow would be:

client -> IGW -> GWLBe -> GWLB -> FW (SNAT) -> server -> (SNAT) FW -> GWLB -> GWLBe -> IGW -> client

Since you also have outbound traffic, routing could be interesting, as the response traffic needs to go back to the FW interface and not the GWLBe (the default for outbound traffic).

Another option would be to configure a couple of additional subnets and use dedicated FW interfaces (non-GWLB) and just handle the inbound traffic completely separately (including separate VRs) from the outbound traffic. I have tested this and it does work but takes a bit of effort to set up.

It might also be worth contacting your sales team and having them engage a SME to help you understand your options and the best path forward.

patoil · ‎05-08-2024

Yes Glynn. That's what I'm trying to do! Your second option: dedicated FW interface (E1/2) non GWLB.

I'm attaching a detailed diagram.

I'm very familiar with multiple VRs, but I don't see the need for them here. I have a single VR with default + private routes and routing looks good. In fact, I have 2-way traffic for outbound connections, so I don't think I need to move from a single VR to two VRs.

I think the issue might be related to Inbound connections routing out of the Geneve Subinterface and returning from the server without encapsulation (only for Inbound sessions).

For outbound I route 0.0.0.0/0 towards a GWLBe to get packets encapsulated and delivered to the firewall.
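
As a rough illustration of that default-route behaviour, here is a minimal Python sketch of a longest-prefix-match lookup, which is how a VPC route table picks a target. The CIDRs and target names (`local`, `gwlbe-outbound`) are placeholders invented for the example, not values from this setup.

```python
import ipaddress

# Hypothetical route table; CIDRs and target names are placeholders,
# not taken from the thread.
ROUTES = {
    "10.1.0.0/16": "local",          # the VPC's own CIDR
    "0.0.0.0/0": "gwlbe-outbound",   # everything else goes to the GWLB endpoint
}

def next_hop(dst_ip, routes=ROUTES):
    """Return the target of the most specific route that contains dst_ip."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if dst in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # Longest prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("8.8.8.8"))    # gwlbe-outbound
print(next_hop("10.1.2.3"))   # local
```

Only traffic that falls through to the default route is handed to the endpoint and GENEVE-encapsulated; anything matching a more specific route never reaches it.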

For inbound, I verify the firewall's private interface encapsulates using the right subinterface (that's what I need) and I think the response towards the firewall's private (snat) IP might arrive without encapsulation, so I have "no matching session" drops in the global counters.

I tried adding a route to the firewall's private subnet towards the GWLBe, but it didn't make any difference. That's why I'm not sure if this is a valid scenario, but I'm so close to having the desired architecture running.

I really appreciate all interactions. All my assumptions might be wrong!

glynn · ‎05-08-2024

Ah, ok. If that is the inbound path, then it will not work. The traffic has to hit the GWLB before going to the FW for the sessions to be set up and tracked correctly. What you need to do is take the inbound traffic coming off of the IGW and route it to the GWLBe and then through the FW out e1/2 to the server. I would guess you are seeing asymmetric traffic due to one side being GENEVE encapsulated whilst the other is not. Hence the FW is not happy.
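
To make that guess concrete, here is a toy Python model. It assumes, purely for illustration, that the session lookup key includes the encapsulation context (which GENEVE endpoint the packet arrived through, or none). This is not how PAN-OS is documented to work internally; the class, field names and addresses are all hypothetical.

```python
from typing import NamedTuple, Optional

class Packet(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int
    geneve_endpoint: Optional[str]  # None means the packet arrived unencapsulated

class ToySessionTable:
    """Toy model only: sessions keyed by flow tuple plus encapsulation context."""

    def __init__(self):
        self.sessions = set()

    def open(self, pkt, snat_ip, snat_port):
        # Store the reply we expect: server -> SNAT address, in the same
        # encapsulation context as the packet that created the session.
        self.sessions.add(
            (pkt.dst, snat_ip, pkt.dport, snat_port, pkt.geneve_endpoint)
        )

    def match_reply(self, pkt):
        key = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.geneve_endpoint)
        return "forward" if key in self.sessions else "drop: no matching session"

table = ToySessionTable()
# The SYN is mapped to a GENEVE subinterface and source-NATed by the firewall.
syn = Packet("203.0.113.10", "10.1.0.20", 40000, 443, "gwlbe-app")
table.open(syn, snat_ip="10.0.1.5", snat_port=50000)

# The SYN-ACK comes back to the SNAT address without encapsulation.
plain_reply = Packet("10.1.0.20", "10.0.1.5", 443, 50000, None)
# The same reply, had it come back through the endpoint.
encap_reply = Packet("10.1.0.20", "10.0.1.5", 443, 50000, "gwlbe-app")

print(table.match_reply(plain_reply))  # drop: no matching session
print(table.match_reply(encap_reply))  # forward
```

Under that assumption the addresses and ports of the reply are correct, yet the lookup still misses because the context differs, which would match the "no matching session" counter described above.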

For the multi-interface scenario I mentioned previously, I recall that the routing was easier with separate VRs as I could ensure better separation of the traffic.

patoil · ‎05-08-2024

If I do that I’d have to invert the firewall’s routes or at least move the private route to the untrusted interface, E1/2.

Currently, the route to private networks is via E1/1.

I think I’ll also have problems setting up VPNs (GP and L2L) on E1/2, don’t you think?

glynn · ‎05-08-2024

Yes, the tunnels terminating on the FW will have the same problems that the inbound traffic from the internet does.

Since you are using the TGW, I would suggest terminating the VPNs on that and just treat it as a spoke (or spokes).

If interzone policies are a hard requirement, setting up an additional pair of FW interfaces outside of the GWLB is probably going to be easier all the way around.

patoil · ‎05-08-2024

Thanks again Glynn, but terminating the VPNs through the TGW would be a non Overlay routing scenario.

What I'm trying to achieve is interzone policies and inbound VPN & NAT using the public interface.

Do you think this is possible?

glynn · ‎05-08-2024

What you are looking to achieve is possible; however, I do not think you can do it with the GWLB due to the interzone requirement unless you do some unusual things. I think using dedicated interfaces, separate from the existing GWLB+FW construct, for the inbound traffic and VPN termination would be your best bet; however, it would be worth having your PANW sales team engage a SME to review the goals and see what other options might be available.

patoil · ‎05-08-2024

I think we (you) are making good progress!

What do you think about simulating the two firewall model for inbound and outbound by using 2 VRs?

VR1 for Outbound is already working.

New VR2 with 2 new non-GWLB interfaces for inbound (GP + public services access), with SNAT so that traffic returns through the same firewall.
I’d rather use 2 vsys if possible, but I think VRs might do the trick.

Now I’m thinking this is what you meant when you first brought VRs into the conversation, is it?

I’d have to give it a try.

glynn · ‎05-08-2024

Precisely. Not long after the GWLB went GA, I encountered a customer that was attempting to maximize their investment in VM-Series by using it for inbound connectivity in addition to the GWLB. I set it up using 2 VRs (one for GWLB, one for inbound). On the inbound side, we used a load balancer but it is not a requirement. The use of SNAT should ensure symmetric return. 2 VRs made life easier (routing, troubleshooting, etc.). It took a bit to set up and the diagram looked a little funny but it worked.
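
A small Python sketch of why two VRs can simplify the routing: each ingress interface belongs to one VR, and the lookup only consults that VR's routes, so the same destination can resolve differently for GWLB and inbound traffic. The interface-to-VR layout, VR names and CIDRs below are assumptions made up for the example, not the configuration from that deployment.

```python
import ipaddress

# Hypothetical layout: VR names, interface roles and CIDRs are illustrative only.
VR_OF_INTERFACE = {
    "ethernet1/1": "vr-gwlb",      # GENEVE-facing interface behind the GWLB
    "ethernet1/2": "vr-inbound",   # public interface for inbound traffic
    "ethernet1/3": "vr-inbound",   # private, non-GWLB interface towards servers
}

VR_ROUTES = {
    "vr-gwlb": {"0.0.0.0/0": "ethernet1/1"},
    "vr-inbound": {"0.0.0.0/0": "ethernet1/2", "10.0.0.0/8": "ethernet1/3"},
}

def egress_interface(ingress, dst_ip):
    """Look up dst_ip only in the VR that owns the ingress interface."""
    routes = VR_ROUTES[VR_OF_INTERFACE[ingress]]
    dst = ipaddress.ip_address(dst_ip)
    candidates = [
        ipaddress.ip_network(cidr)
        for cidr in routes
        if dst in ipaddress.ip_network(cidr)
    ]
    # Every VR here has a default route, so candidates is never empty.
    best = max(candidates, key=lambda net: net.prefixlen)
    return routes[str(best)]

# Inbound request from the internet towards a server in the private range.
print(egress_interface("ethernet1/2", "10.1.0.20"))     # ethernet1/3
# Server reply to the client, arriving on the inbound private interface.
print(egress_interface("ethernet1/3", "203.0.113.10"))  # ethernet1/2
# GWLB traffic stays inside its own VR.
print(egress_interface("ethernet1/1", "203.0.113.10"))  # ethernet1/1
```

With SNAT on the inbound side, the server replies to the address of the non-GWLB private interface, so the reply re-enters through the inbound VR rather than following the default route towards the GWLBe.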

As far as I am aware, the VM-Series does not presently support multiple VSYS.

patoil · ‎05-11-2024

Update!

I found a working setup using a separate interface for inbound connections. The incoming interface (public) can be shared with the outbound sessions. A separate internal interface is required to avoid Geneve encapsulation for inbound sessions.

Summary of AWS NICs:

nic 0: Private for outbound sessions from multiple VPCs (Geneve Endpoints mapped to subinterfaces)

nic 1: management (interface swap). Remember that Geneve can only be mapped to nic 0.

nic 2: public with associated ENI

nic 3: private for inbound connections. Internal routes go here

This setup requires a VM supporting 4 NICs.
Since m5.large (vm100) doesn’t allow 4 NICs, at least m5.xlarge (vm300) is needed, but this image requires a “bigger” credits license from Palo Alto BYOL, and more expensive VM resources running in AWS. Cost is x2 for AWS opex and PAN credits.
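
For anyone sizing this, a quick Python sketch of the interface-count check. The NIC layout is the one listed above. The ENI limits are the AWS-published maximums for these m5 sizes as far as I know, so verify them against the current EC2 documentation; the function name is made up.

```python
# NIC layout from the summary above (index -> role).
NIC_LAYOUT = {
    0: "private, outbound sessions (Geneve endpoints mapped to subinterfaces)",
    1: "management (interface swap)",
    2: "public with associated ENI",
    3: "private, inbound connections",
}

# Maximum network interfaces per instance type. These values should be
# checked against the current AWS documentation before relying on them.
MAX_ENIS = {
    "m5.large": 3,
    "m5.xlarge": 4,
    "m5.2xlarge": 4,
    "m5.4xlarge": 8,
}

def fits(instance_type, layout=NIC_LAYOUT):
    """True if the instance type can attach every NIC in the layout."""
    return len(layout) <= MAX_ENIS[instance_type]

for itype in MAX_ENIS:
    print(itype, "ok" if fits(itype) else "too few interfaces")
```

The check only looks at the instance type, not at the firewall license tier, since the interface limit is a property of the EC2 instance.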

The PAN Deployment guide recommends using one pair of firewalls for inbound and another pair for outbound.
It looks like there’s no workaround that allows a pair of vm100, or its equivalent 2 CPU credits-based VM, to handle both inbound and outbound sessions with overlay routing, due to AWS GWLB characteristics.

glynn · ‎05-13-2024

Awesome. Did you have to use a separate VR?

Also, you might try changing the instance type and leaving the model at VM-100. IIRC, the instance size is what determines the max number of interfaces, not the VM-Series model so you should be able to have a VM-100 with 8 interfaces. You would not be able to take advantage of the capacity increase but if all you need is interface density, you should be fine.

patoil · ‎05-13-2024

Separate VRs are not needed. You can just use a single VR, which lets you share the public interface. Otherwise you'll need an additional interface.

Using a 2 CPU firewall license on a 4 CPU VM might serve as a workaround to enable the use of additional interfaces and address the architectural challenges with GWLB.

It's frustrating when AWS platform constraints result in increased operational costs.
