We are doing link monitoring and failing over the firewall if the inside interface goes down. The tentative firewall sends unencrypted traffic such as HTTP, ping, and DNS over HA3 to the peer, and that traffic is processed fine. However, it does not send encrypted traffic such as SSL and SSH to the peer over HA3; instead it tries to process those packets itself and then drops them because the inside interface is down. Has anyone seen this behavior?
Your description of the issue is not clear. I am assuming this is an active/active setup because you mentioned the HA3 link. The HA3 link would only be used if the session setup load sharing option decides that the peer device is responsible for session setup. If the load sharing option decides that session setup has to be done on the local firewall, then no data is transmitted through the HA3 link to the peer device. Important pieces of information are missing from your description of the issue. I would suggest opening a ticket with Support to have the behavior investigated.
Sorry for posting a question without much detail.
So here are the details:
We have an active/active HA cluster.
We are monitoring the inside interface via the link-monitoring feature and asking the firewall to fail over when the interface goes down.
When the inside interface goes down on either node, that firewall goes into the tentative state, which is what we want.
We are only concerned about new sessions, not existing sessions. All new sessions for traffic from the inside zone to the outside zone start at the active firewall, so the active firewall is both the session owner and the session setup firewall. We are also forcing asymmetric return traffic so that it comes back in over the outside interface of the tentative firewall.
HTTP, ping, Telnet, and all other non-encrypted traffic returning on the tentative firewall goes over HA3 to the active firewall and then to the inside, so that works.
We have issues with SSL and SSH (encrypted traffic). When this traffic arrives at the tentative firewall, it does not go over the HA3 link to the active firewall. The tentative firewall offloads the traffic. It handles the traffic itself and drops it: the inside interface is down, so the only route left in the routing table is the default 0.0.0.0/0 route; the firewall tries to send the traffic out the outside interface, and because of the zone change it drops it. We are not decrypting traffic, because the customer has another device that handles decryption and does not want the Palo Alto firewall to do it.
So my question to the community is whether anyone has seen this and, more importantly, whether it is unreasonable to expect that a tentative firewall should not forward packets over any interface other than the HA interfaces.
I do have an open case with Palo Alto Support, and I am told that this is by design and is expected behavior! I would agree that this is expected behavior when the firewall is not in the tentative state, to allow efficient packet forwarding in Layer 3 mode, but it should not be the expected behavior when the firewall is in the tentative state! What do you guys think?
The explanation given by Palo Alto Support is that it is Layer 7 complete and therefore the firewall does not send it to the session owner. My contention is that the reason we are failing over and putting the firewall in the tentative state is that we do not want this firewall to forward any packets over any interface other than the HA links.
Just FYI, we are only doing static routing on the firewalls, since that is what works for the customer.
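To make the difference concrete, here is a toy Python model of what I am describing. It is only a sketch of the behavior observed in this thread for return traffic arriving on the tentative firewall, not a description of PAN-OS internals; the names `Packet`, `observed_action`, and `expected_action` are made up for illustration.

```python
from dataclasses import dataclass

# Possible outcomes for a return packet arriving on the tentative firewall.
FORWARD_HA3 = "forward over HA3 to the active peer"
DROP_LOCAL = "process locally, then drop (inside interface is down)"


@dataclass
class Packet:
    app: str         # e.g. "http", "ping", "ssl", "ssh"
    encrypted: bool  # True for ssl/ssh in the tests described above


def observed_action(pkt: Packet) -> str:
    """What we see today on the tentative firewall (as reported in this thread)."""
    # Encrypted flows are handled locally and end up dropped;
    # everything else is sent to the peer over HA3.
    return DROP_LOCAL if pkt.encrypted else FORWARD_HA3


def expected_action(pkt: Packet) -> str:
    """What I would expect from a tentative firewall: HA3 only, for all traffic."""
    return FORWARD_HA3


if __name__ == "__main__":
    for pkt in (Packet("http", False), Packet("ping", False),
                Packet("ssl", True), Packet("ssh", True)):
        print(f"{pkt.app}: observed={observed_action(pkt)!r}, "
              f"expected={expected_action(pkt)!r}")
```

The only rows where the two functions disagree are the encrypted ones, which is exactly the gap I am asking about.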
I understand what you are trying to accomplish, but I don't think A/A is the best configuration for this. For A/A to work properly, both firewalls should be able to route traffic to the inside network independent of the other firewall. In your case, you want the secondary firewall to forward all traffic back to the primary firewall via HA3 for processing and routing, and are basically controlling the HA failover via the state of the inside interface. What you are explaining is sort of in-between Active/Active and Active/Passive if that makes any sense... An Active/Inactive, so to speak.
Out of curiosity, what is the reasoning behind the A/A configuration, and the asymmetric return? Why not set up a simple Active/Passive?
Well let me give some more details:
This is currently in a test lab, set up to mimic a scenario that we saw in production when we did failure testing with the inside interface down.
The two firewalls sit in two different data centers, and there is asymmetric routing. Everything works great until a switch fails on the inside that is connected by an aggregate interface to the inside interface of one of the firewalls. The problem appears only in this failure scenario.
In the lab we have mimicked the failure scenario:
We have enabled interface monitoring so that the firewall goes to the tentative state when the inside interface is down.
We are seeing an issue with encrypted traffic in production when we shut the firewall and when there is asymmetric traffic.
To mimic this scenario in the lab, we are forcing the return traffic to the outside interface of the tentative firewall.
So the simple point is that in this failure scenario the tentative firewall should be forwarding all traffic over HA3 to the active firewall.
My question to the public is: how can anyone tell me that a tentative firewall should be forwarding any traffic on an interface other than the HA link?
You fail a firewall over because you have issues on it.
Again, this is only in the failed-inside-interface situation. We see the firewall happily sending HTTP, ping, and other non-encrypted traffic over HA3, so that is fine, but why does it not send the encrypted traffic over HA3 as well? Why does it try to process it?
I agree that this would be fine when the interface is up and the firewall is active. We have collected packet captures showing that it does this, and the encrypted traffic leaves over the inside interface of the firewall that is not the session owner.
However, this should not be the case when the inside interface is down and the firewall is in the tentative state. Nobody can tell me that this is acceptable behavior.