Im involved in a project to migrate away from old asa firewalls to a palo solution.
The process has gone well but myself and peers are stumped with an odd issue and looking for troubleshooting advice.
We have a number of https hosts in a dmz, nat'ed to be available to the public internet. systems from all over the world can access these https hosts fine. That is.... except for a small group of individuals all using a particular local ISP. The trouble is, this issue did not exist with the previous ASA firewalls.
The security policy is simple, allow tcp 443 in. People with this issue will intermittently have difficultly fully loading an https site behind our palos. Pages will partially load or time out. Chrome debug logs from the end users perspective show timeouts on connection attempts.
If a user with the issue retries enough they will eventually establish a session and have no troubles. The https hosts in our DMZ show connection attempts from users on this ISP and from their perspective traffic seems to just stop.
No drops or blocks appear in threat or security logs on the palo and we dont see this behavior with any other people from other ISPs.
The ISP has insisted they are not mangling traffic and from what Ive seen I agree but the problem persists. Ordinarily Id write this off as an ISP doing something odd but the problem didnt exist when our prior ASA firewalls were in place.
The issue feels like a content inspection or ssl decryption issue but we've confirmed thats not in play. We've also confirmed any outside IDS systems are not mangling or dropping traffic. We dont see any asyc routing in play so it does not appear the palo is loosing state and dropping traffic. Everything from what we see appears the same as it was but still we continue to have issues with people using this one particular ISP
Myself and peers are looking for further troubleshooting insights or advice.
Any advice would be greatly appreciated.
On your NAT policy, do you have it setup as bi-directional? That could be the asymmetric routing issue you are referring to. Also check this article out for Direct Server Retrun:
Bi Directional NAT:
Just some thoughts.
Secondly, you may not be seeing dropped logs if you haven't clicked on the interzone-default and "overriden" it to log. What is the allow rule traffic is touching on the palo? If you have the service set to application-default, clone the rule, and underneath set to any port just to see if any traffic is switching ports in the lifecycle of the session.
Thanks for the reply. Fair question and suggestions. Yes the NAT is bidirectional and we arent load balancing between multiple upstream ISPs.... Theres only one return route back to problem clients. The issue does resemble that however we are looking to find some kind of proof. We're struggling to understand why clients on this one particular local ISP is having difficulties when no one else is.
Logging intrazone-default is something we hadn't considered. Its worth a shot to try. Thanks.
As for the allow rule; The current rule is:
allow any external ip , tcp 443 to the nat'ed dmz address, application type ssl
That works for every external client except for clients using this one particular ISPs service.
Ive gone so far as to push an Allow any protocol, port, zone from a given source IP (including any application) to the top of the rule base and made no impact to the problem.
Click Accept as Solution to acknowledge that the answer to your question has been provided.
The button appears next to the replies on topics you’ve started. The member who gave the solution and all future visitors to this topic will appreciate it!
These simple actions take just seconds of your time, but go a long way in showing appreciation for community members and the LIVEcommunity as a whole!
The LIVEcommunity thanks you for your participation!