Active/Active L3 problem with asymmetric routing and NAT


L2 Linker

I'm stumped.  I've looked through as many pieces of documentation and discussions as I can find and I think I have everything set up correctly, but it's only half working.

 

What I have is two PA-5050s in Active/Active.  I have two routers on the outside, each with an L3 connection to both firewalls.  I have two routers on the inside, each with an L3 connection to both firewalls.  On the inside, both firewalls run OSPF to both routers and have full route tables containing all internal routes.  On the outside, right now I have static routes set up to point the NAT addresses back at the firewalls.  All of my static NAT address ranges are pointed to both firewalls on both outside routers using ECMP.  The dynamic NAT ranges are pointed only to the firewall that owns them.  Internal to each firewall I have BGP running between the outside zone and the inside zone for route sharing.  Looking at the routing tables on both outside routers, both inside routers, and both PA firewalls, everything is 100% complete, so I don't think routing is the problem.

 

For dynamic source NAT, I split my pools in half and gave half to each firewall.  Any internal host that receives a dynamic NAT when accessing the internet works 100% fine.

 

The problem comes when I want to have a static NAT for a public server behind the firewall.  Right now I have 2 servers for testing.  One of them works fine, the other does not.  I've used Wireshark on the outside routers, and for the working server I see traffic leave and re-enter the same firewall.  For the one that does not work, I see the traffic leave one firewall and enter through the other.  I see the SYN packet leave one firewall destined for the internet, then the return SYN-ACK gets sent to the other firewall and dies there.  The internal server never sees it.
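One plausible reading of why one test server works and the other does not: the outside routers choose between the two equal-cost static routes per flow, so some return flows land on the firewall the SYN left from and others land on its peer. The sketch below is only a toy model of that behavior. The firewall names, the SHA-256 hash, and the `ecmp_next_hop` and `is_asymmetric` helpers are all assumptions for illustration; real routers use vendor-specific hashing.

```python
import hashlib

# Hypothetical next hops on an outside router (names are illustrative only).
FIREWALLS = ["fw-node0", "fw-node1"]

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Pick a next hop from a hash of the flow 5-tuple.

    Assumption: per-flow hashing. The hash itself is a stand-in,
    not any vendor's actual ECMP algorithm.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return FIREWALLS[digest[0] % len(FIREWALLS)]

def is_asymmetric(egress_fw, return_flow):
    """True if the return flow is hashed to a firewall other than the one the SYN left from."""
    return ecmp_next_hop(*return_flow) != egress_fw

# Return SYN-ACK: internet host -> the server's public (NAT) address.
return_flow = ("203.0.113.5", "100.101.102.103", 443, 51000)
print(ecmp_next_hop(*return_flow))
```

The same flow always hashes to the same next hop, which would match a symptom where one server consistently works and another consistently fails rather than both failing intermittently.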

 

I've checked session tables on both firewalls and both contain the session information, although one says "active" and the other says "init". Route tables look good.  I thought that with active/active returning traffic could enter the other firewall and since it knows about the session it would go ahead and just route it, or even send it over the HA3 link to the session owner, but best I can tell neither of those is happening.

 

Does anyone have any idea where I've gone wrong?  Am I misunderstanding something somewhere?  Does any of this make sense? 😉

10 REPLIES

L4 Transporter

Hello howardtopher,

 

Does this server have a public address or are you using destination NAT? I ask because you said static NAT, which makes me think of static IP source NAT.

 

If you have a destination NAT rule configured, have you made sure that the Active/Active HA binding is set to both?

 

thanks,

Ben

 

 

 

The server has an internal IP (e.g. 10.11.12.13) and on the external side of the firewall it has a publicly routable IP (e.g. 100.101.102.103).  I have set up a source NAT, set it to bi-directional, and have the binding set to both.  I also tried removing the bi-directional part and setting up my own destination NAT rule for it with the binding set to both, and it still didn't work.  I'm assuming that bi-directional just means the firewall automatically creates the destination NAT rule for you.
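The assumption about bi-directional static NAT can be pictured as one rule that translates in both directions. This is a minimal sketch of that mental model only, using the example addresses from the post; the `StaticNat` class and its methods are hypothetical and do not represent how the firewall implements NAT.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StaticNat:
    """Toy 1:1 NAT rule (hypothetical model, not a firewall API)."""
    internal: str
    public: str

    def outbound(self, src, dst):
        # Source NAT: rewrite the internal source to the public address.
        return (self.public if src == self.internal else src, dst)

    def inbound(self, src, dst):
        # Implied destination NAT: rewrite the public destination
        # back to the internal address.
        return (src, self.internal if dst == self.public else dst)

rule = StaticNat(internal="10.11.12.13", public="100.101.102.103")

# Outbound SYN from the server, then the returning SYN-ACK.
print(rule.outbound("10.11.12.13", "203.0.113.5"))
print(rule.inbound("203.0.113.5", "100.101.102.103"))
```

In this model the return packet can only be delivered if whichever firewall receives it applies the inbound half of the rule, which is the step that appears to fail when the SYN-ACK arrives on the peer.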

L7 Applicator

You seem to have all the concepts correct.  Yes, Active/Active is supposed to allow this asymmetric traffic.

 

I don't have access to one now, but my recollection is that in A/A, NAT pools get tied to a node when you create them.  So you might need to create those NAT pools and matching rules on the second node as well.  I seem to remember needing this duplication for NAT to work correctly in these scenarios.

Steve Puluka BSEET - IP Architect - DQE Communications (Metro Ethernet/ISP)
ACE PanOS 6; ACE PanOS 7; ASE 3.0; PSE 7.0 Foundations & Associate in Platform; Cyber Security; Data Center

From what I've seen in trying things and read so far, if you are using pools then you have to split the pool and give half to each node.  I've done that and it seems to work great so far.  The only caveat is that you have to make sure you route the correct half to the correct node.  Basically, if I have a /24 pool, I split that into two /25s and give the lower half to node 0 and the upper half to node 1.  Then on my external routers I make sure that the lower /25 is routed only to node 0 and not node 1, and vice versa.  If I could put the whole pool on each node so that traffic could come back in asymmetrically that would be great, but I haven't been able to get that to work.  However, I can live with that split.
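The /24-into-two-/25s split described above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the 198.51.100.0/24 pool is a documentation range chosen for illustration (the thread does not give the real pool), and the `routes` mapping and `owner_of` helper are hypothetical names.

```python
import ipaddress

# Hypothetical pool; 198.51.100.0/24 is a documentation range, not from the thread.
pool = ipaddress.ip_network("198.51.100.0/24")

# Split the /24 into its two /25 halves.
lower, upper = pool.subnets(prefixlen_diff=1)

# Each half is routed to exactly one node on the outside routers.
routes = {"node0": lower, "node1": upper}

def owner_of(address: str) -> str:
    """Return the node whose half of the pool contains the address."""
    ip = ipaddress.ip_address(address)
    for node, net in routes.items():
        if ip in net:
            return node
    raise ValueError(f"{address} is outside the pool")

print(lower, upper)                # 198.51.100.0/25 198.51.100.128/25
print(owner_of("198.51.100.200"))  # node1
```

A lookup like `owner_of` is a quick way to confirm that a given translated address should be arriving at the node that owns it.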

 

The problem I'm having is for addresses outside of the pool that get used in a 1-1 type fashion.  Internal host x always has address y on the outside and I can allow external hosts access to services hosted on that internal host.  The asymmetrical traffic seems to be broken somewhere.  I've dug through logs and have not seen anything show up.  I have the nat rule binding set to both.

When we had this issue, we duplicated the pools, creating one on each node so that the NAT would work.  If I remember correctly, you cannot use the "both" binding in A/A.  So you essentially double the number of pools and rules created in this scenario.

Steve Puluka BSEET - IP Architect - DQE Communications (Metro Ethernet/ISP)
ACE PanOS 6; ACE PanOS 7; ASE 3.0; PSE 7.0 Foundations & Associate in Platform; Cyber Security; Data Center

I tried duplicating pools for each firewall and I'm still getting the same results.  If a SYN leaves one firewall and the SYN-ACK comes back into the other firewall, the SYN-ACK never makes it back to the originating host.  However, if the SYN-ACK comes back to the same firewall the SYN left from, it's all good.

This is a new install, so I started out with PAN-OS 7.0.1.  At this point I'm curious whether I need to go back to PAN-OS 6, try the same config, and see if the problem persists there.  I've opened a ticket with support, so maybe they can figure out what's going on.

My experience was with PAN-OS 5, later upgraded to 6.  It would be interesting to see if the issue is specific to 7.

 

Let us know what support says.

Steve Puluka BSEET - IP Architect - DQE Communications (Metro Ethernet/ISP)
ACE PanOS 6; ACE PanOS 7; ASE 3.0; PSE 7.0 Foundations & Associate in Platform; Cyber Security; Data Center

It took a while, but support eventually agreed there was a problem.  While they take a look at all the files and packet captures they took, I tried a downgrade to 6.1.6.  My configuration works perfectly there.  So this issue is definitely v7 related.

Looks like something for 7.0.2... err, not likely, that's too close to release.  7.0.3, here we come.
