I have an IPSec tunnel with source address NATting to a partner. 443 web traffic from site A triggers the IKE and the IPSEC-SA session. In PAN monitoring the application is correctly identified as SSL and in my browser I pull up the site from the partner without an issue.
443 web traffic from site B triggers IKE and IPSEC-SA so that tunnel is ostensibly up. BUT the clients get a message that the site can't be reached in their browser "took too long to respond". And in monitoring on the PAN the Application status is "incomplete". However the record shows the src address was properly NATted and allowed just like site A. Site B and site A have different address blocks but get NATted and are permitted with the same policy.
Can anyone recommend the right tools (pcap) for me to see exactly where the conversation from site B is failing? Any other thoughts appreciated.
So you have 2 sites- SiteA and SiteB and partner seems to be third - SiteC.
Application is identified as SSL only after TCP 3way handshake is done and SSL negotiation starts.
So most likely SiteB sends a SYN towards SiteC but no SYN-ACK comes back to continue the conversation.
Either the security policy at SiteC does not permit traffic from SiteB, or SiteC is missing a return route towards the SiteB local subnet (in case SiteC runs a route-based VPN solution like Palo).
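To make the reasoning above concrete, here is a minimal Python sketch that classifies how far a TCP handshake got from a list of observed packets. The (direction, flags) tuple layout and the function name are assumptions made up for illustration; this is not a PAN-OS or Wireshark export format.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical packet summary: direction is "c2s" (client to server) or
# "s2c" (server to client); flags is the set of TCP flags on that packet.
Packet = Tuple[str, FrozenSet[str]]


def handshake_state(packets: Iterable[Packet]) -> str:
    """Report how far the TCP 3-way handshake progressed."""
    seen_syn = seen_synack = seen_ack = False
    for direction, flags in packets:
        if direction == "c2s" and flags == {"SYN"}:
            seen_syn = True
        elif direction == "s2c" and flags == {"SYN", "ACK"} and seen_syn:
            seen_synack = True
        elif (direction == "c2s" and "ACK" in flags
              and "SYN" not in flags and seen_synack):
            seen_ack = True
    if not seen_syn:
        return "no SYN seen"
    if not seen_synack:
        return "SYN sent, no SYN-ACK returned"
    if not seen_ack:
        return "SYN-ACK returned, final ACK missing"
    return "handshake complete"


# Example: only the client's SYN shows up in the capture.
print(handshake_state([("c2s", frozenset({"SYN"}))]))
```

A result of "SYN sent, no SYN-ACK returned" points at policy or return routing on the far side, while "handshake complete" means the problem sits further up the stack.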
As @Raido has already mentioned, "incomplete" in the application column means, 99% of the time, that the TCP 3-way handshake did not complete.
You can do PCAP following the guide below:
Do you know if your server knows how to get back to site B when responding to the SYN request? Do you know if SYN packets from site B are actually hitting the server?
Monitor > Traffic
Add Egress interface, Packets sent and Packets received columns.
If you see packets sent 1, packets received 0, and the egress interface is the correct tunnel interface towards SiteC (partner), then the issue is on the other side.
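The check above can be written down as a small decision function. This is only a sketch of the rule of thumb from this post; the function name, the return strings, and the interface name tunnel.1 are placeholders, and the values would be read by hand from the Monitor > Traffic columns.

```python
def likely_fault(egress_if: str, expected_if: str,
                 pkts_sent: int, pkts_received: int) -> str:
    """Apply the rule of thumb: right tunnel, packets out, nothing back."""
    if egress_if != expected_if:
        return "local: traffic is leaving the wrong interface"
    if pkts_sent > 0 and pkts_received == 0:
        return "remote: packets enter the tunnel but nothing returns"
    return "inconclusive: replies are arriving, take a packet capture"


# Example with placeholder values copied from a traffic log entry.
print(likely_fault("tunnel.1", "tunnel.1", pkts_sent=1, pkts_received=0))
```

If replies are arriving, the counters alone cannot tell you more and a packet capture is the next step.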
I added those columns. Everything is the same except that the byte and packet counts are far smaller for the failed conversations than for the successful ones, e.g. my first successful conversation is 5k while the user from the other site shows 540 bytes.
The ingress and egress (tunnel.x) interfaces are the same. We NAT to the same source address, so the server at the far end would see traffic coming from the same source. A difference is that the failed conversations only end with a tcp-rst-from-server, whereas my good conversations have many tcp-rst-from-server.
I ran a packet capture and it looks to me like the bad and the good both have a SYN, SYN ACK, ACK. But in my case the next packet is a SYN for TLS Hello and then a whole TLS exchange occurs. Based on that I made sure the user had TLS enabled, tried new browser, tried a new operating system (win 10, win 7, explorer, chrome, ff) - they all have the same result. Application incomplete and TLS Hello is never sent to the server. I've looked through syslogs for rejected traffic. Still no go.
Another general difference when I look at Wireshark is that the pcap from the failed situation is mostly gray with some black/red, whereas my good one is light blue for the most part.
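When comparing the good and bad captures, the key question is whether the client ever sends a TLS ClientHello after the handshake. A minimal Python sketch of that check on raw TCP payload bytes follows. It relies on the TLS record layout (content type 0x16 for handshake, major version byte 0x03, handshake type 0x01 for ClientHello); the function names are made up for illustration and extracting the payloads from the pcap is left out.

```python
from typing import Iterable


def looks_like_client_hello(payload: bytes) -> bool:
    """True if the payload starts with a TLS record carrying a ClientHello."""
    if len(payload) < 6:
        return False
    content_type = payload[0]    # 0x16 = handshake record
    major_version = payload[1]   # 0x03 for SSL 3.0 and all TLS versions
    handshake_type = payload[5]  # 0x01 = ClientHello
    return (content_type == 0x16 and major_version == 0x03
            and handshake_type == 0x01)


def client_sent_hello(client_payloads: Iterable[bytes]) -> bool:
    """True if any client-to-server payload looks like a ClientHello."""
    return any(looks_like_client_hello(p) for p in client_payloads)


# Example: a truncated, hand-built record header followed by a ClientHello type.
sample = bytes([0x16, 0x03, 0x01, 0x00, 0x05, 0x01, 0x00, 0x00, 0x01, 0x00])
print(client_sent_hello([b"", sample]))
```

If this returns False for the capture on the firewall but True for a capture on the workstation, the ClientHello is being lost somewhere between the two.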
That is weird.
3way handshake happens and client does not initiate SSL session.
This packet capture is taken on firewall right?
Can you install Wireshark and take a packet capture on the problematic workstation?
Do you see same result?
Sorry, I was imagining this connection differently. So clients at both site A and site B use the same tunnel to get to the https website. If the 3-way handshake is OK, then it is better to run a pcap from the client side, as @Raido has suggested, and compare it if need be with the one you took from the Palo.
I think I have the issue resolved. It looks like before it hits the tunnel firewall some egress traffic goes through another firewall. But that portion does not return through that firewall. So the three way handshake succeeds as the synack makes it back to the client but TLS fails for the integrity of the conversation as a whole. Waiting for approvals of some PBR changes to verify my theory.
So that solved it. The reason it was more difficult to troubleshoot is that our logging is at the end of a conversation, so there was no evidence that traffic I was sending to the remote site was passing through this one PAN before the tunneling PAN. Once I modified PBR on an adjacent router to avoid the 1st PAN, the conversation was able to maintain integrity to the SSL server.
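The asymmetric path described above can be checked on paper by listing the hops in each direction and looking for stateful devices that see only one of them. A small Python sketch, assuming you have worked out the two hop lists by hand (for example from traceroutes and the router's PBR config); all device names here are hypothetical.

```python
from typing import Iterable, List, Sequence


def one_way_devices(forward_path: Sequence[str],
                    return_path: Sequence[str],
                    stateful: Iterable[str]) -> List[str]:
    """Return stateful devices that sit on only one direction of the flow."""
    fwd, ret = set(forward_path), set(return_path)
    return sorted(d for d in stateful if (d in fwd) != (d in ret))


# Hypothetical hop lists: PBR steers outbound traffic through pan-1,
# but the return traffic follows the normal route table and skips it.
forward = ["client", "router", "pan-1", "pan-tunnel", "partner"]
back = ["partner", "pan-tunnel", "router", "client"]
print(one_way_devices(forward, back, {"pan-1", "pan-tunnel"}))
```

Any device this returns only ever sees half of the conversation, which is the situation that broke the TLS session here.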
Lessons for me to remember: Log end of session means you have no visibility on one way traffic through your FW. (Tho perhaps I could have seen it in a packet capture)
"show route" on a Cisco router does not inform you that PBR may be happening and over-riding what you see in your route table.