Having a weird issue with a remote client connecting over VPN to multiple internal MSSQL servers. A particular SQL EXEC query packet is getting dropped in the middle of a SQL session. The security ruleset allows the communication under a VPN to TRUST mssql-db-unencrypted rule (I also made a separate test rule with an explicit any/any allow and no filtering). There is no decryption between source and destination. The session connects, passes multiple queries/responses, and then times out/resets on the client side when the packet drop happens.
Security logs show the traffic allowed through the expected rules with no problems, no alerts, and not decrypted. No relevant threat logs. Packet dumps on the PA show the client making multiple SQL queries and the server responding; then the client sends an EXEC command and repeats it multiple times before timing out/resetting. Packet dumps from the core router behind the PA show the multiple queries/responses, but not the final EXEC query. The PA has dropped the packet without indicating why. Anyone seen this? Anywhere else to look for errors?
--> client login
<-- server response
--> SET LOCK_TIMEOUT 10000
--> select SERVERPROPERTY(N'servername')
[ multiple declaration and select statements back and forth to server getting ready for scripted job ]
--> exec msdb.sp_help_job @job_id='<GUID>'
--> EXEC msdb.sp_start_job @job_id=N'<GUID>'
[ no response from server - internal packet dump shows this packet never passed by the PA ]
--> [ multiple retransmissions ]
[ no response from server - internal packet dump shows these packets never passed by the PA ]
--> TCP RST
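For anyone wanting to repeat the comparison between the PA capture and the internal router capture, here is a rough sketch of the idea in Python. It is not what I actually ran: the `Segment` type, the `missing_downstream` helper, and the sample numbers are all made up for illustration, and it assumes you have already exported (sequence number, payload length) pairs for the client-to-server direction from each capture and that nothing in the path rewrites TCP sequence numbers.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    seq: int     # TCP sequence number of the segment (hypothetical export)
    length: int  # TCP payload length in bytes


def missing_downstream(upstream, downstream):
    """Return distinct payload segments seen upstream but never downstream.

    Retransmissions share the same (seq, length), so each lost segment is
    reported once, in the order it first appeared upstream.
    """
    seen_downstream = set(downstream)
    missing = []
    for seg in upstream:
        if seg.length == 0:
            continue  # ignore pure ACKs, which carry no payload
        if seg not in seen_downstream and seg not in missing:
            missing.append(seg)
    return missing


# Made-up sample data: the last segment is retransmitted on the PA side
# and never shows up on the router side.
pa_capture = [
    Segment(1000, 120),
    Segment(1120, 80),
    Segment(1200, 210),
    Segment(1200, 210),
    Segment(1200, 210),
]
router_capture = [
    Segment(1000, 120),
    Segment(1120, 80),
]

print(missing_downstream(pa_capture, router_capture))
```

With real captures, the one segment this reports should line up with the sp_start_job packet in the trace above.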
Just to rule out L7 processing, I'd make an application-override to a custom app-id just to kill off any and all inspection through the firewall. Then have them run it again with all profiles disabled on a test rule for the traffic and see if that doesn't allow the traffic to pass without issue.
Created a custom app override and applied to security rule, no security profiles (already had tried this as well). Sessions show up in CLI during test in custom app. Traffic logs show session start and end in expected rule with application identified as the override. Packet dumps still show missing exec start job MSSQL/TDS packet from client to server behind the PA...
I'm not quite sure I understand what you mean by checking global counters. Do you mean the traffic log packet counters vs actual packets in the traffic dumps?
From the traffic logs:
PA packet capture, from port 54840 to server:
- 41 packets from client (including packet that goes missing, 6 retransmissions, and TCP RST)
- 27 packets from server.
Internal router packet capture, from port 54840 to server:
- 33 packets from client (missing final exec job start, 6 retransmissions, TCP RST)
- 38 packets from server (10 TCP keepalives because the server never got the ACK that should have come in the missing packet, and a TCP RST due to no ACK responses)
PA packet capture, from port 54843 to server:
- 16 packets from client (including packet that goes missing, 6 retransmissions, and TCP RST)
- 6 packets from server.
Internal router packet capture, from port 54843 to server:
- 8 packets from client (missing final exec job start packet, 6 retransmissions, TCP RST)
- 17 packets from server (10 TCP keepalives and a TCP RST)
PA seems to drop the exec job start packet and all subsequent traffic in the session (though the session still seems to be active in the CLI).
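As a sanity check on the numbers above, the counts can be reconciled with a few lines of Python. This is only an arithmetic sketch using the figures from this post; the `check_counts` helper and the dictionary layout are my own naming. It assumes each session loses exactly 8 client packets at the PA (the exec packet, 6 retransmissions, and the RST) and that the router sees 11 extra server packets (10 keepalives plus the RST).

```python
# Packets the router never sees from the client:
# the exec packet itself, its 6 retransmissions, and the final TCP RST.
CLIENT_DROPPED = 1 + 6 + 1

# Extra server packets seen only on the router side:
# 10 TCP keepalives plus the server's TCP RST.
SERVER_EXTRA = 10 + 1

# Counts taken from the captures listed above, keyed by client source port.
sessions = {
    54840: {"pa_client": 41, "router_client": 33, "pa_server": 27, "router_server": 38},
    54843: {"pa_client": 16, "router_client": 8, "pa_server": 6, "router_server": 17},
}


def check_counts(sessions):
    """Return {port: (client_ok, server_ok)} for each captured session."""
    results = {}
    for port, c in sessions.items():
        client_ok = c["pa_client"] - CLIENT_DROPPED == c["router_client"]
        server_ok = c["pa_server"] + SERVER_EXTRA == c["router_server"]
        results[port] = (client_ok, server_ok)
    return results


for port, (client_ok, server_ok) in check_counts(sessions).items():
    print(port, "client counts reconcile:", client_ok, "server counts reconcile:", server_ok)
```

Both sessions reconcile in both directions, so every packet difference between the two captures is accounted for by the dropped exec packet and its fallout. The server-side figures also suggest the keepalives and the server's RST did not appear in the PA capture at all.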