PA dropping certain MSSQL EXEC statements for no apparent reason



L6 Presenter

Having a weird issue with a remote client connecting over VPN to multiple internal MSSQL servers. A particular SQL EXEC query packet is getting dropped in the middle of an SQL session. The security ruleset allows the communication under a VPN-to-TRUST mssql-db-unencrypted rule (I also made a separate test rule with an explicit any/any allow and no filtering). There is no decryption between source and destination. The session connects, passes multiple queries/responses, and then times out/resets on the client side when the packet drop happens.

 

Security logs show the traffic allowed thru the expected rules with no problems, no alerts, and not decrypted. No relevant threat logs. Packet dumps on the PA show the client making multiple SQL queries and the server responding; then the client sends an EXEC command and repeats it multiple times before timing out/resetting. Packet dumps from the core router behind the PA show the multiple queries/responses but not the final EXEC query, so the PA has dropped the packet without indicating why. Has anyone seen this? Is there anywhere else to look for errors?

 

  --> client login

  <-- server response

  --> SET LOCK_TIMEOUT 10000

  <-- done

  --> select SERVERPROPERTY(N'servername')

  <-- done

[ multiple declaration and select statements back and forth to server getting ready for scripted job ]

  --> exec msdb.sp_help_job @job_id='<GUID>'

  <-- done

  --> EXEC msdb.sp_start_job @job_id=N'<GUID>'

[ no response from server - internal packet dump shows this packet never passed by the PA ]

  --> [ multiple retransmissions ]

[ no response from server - internal packet dump shows these packets never passed by the PA ]

  --> TCP RST

 


Cyber Elite

@Adrian_Jensen,

Just to rule out L7 processing, I'd make an application-override to a custom app-id just to kill off any and all inspection through the firewall. Then have them run it again with all profiles disabled on a test rule for the traffic and see if that doesn't allow the traffic to pass without issue. 

Hi @Adrian_Jensen ,

Have you checked global counters with packet filters matching only this specific session?

 

Created a custom app override and applied it to the security rule, with no security profiles (I had already tried this as well). Sessions show up in the CLI during the test under the custom app. Traffic logs show session start and end in the expected rule with the application identified as the override. Packet dumps still show the exec start job MSSQL/TDS packet from client to server missing behind the PA...

I'm not quite sure I understand what you mean by checking global counters. Do you mean the traffic log packet counters vs actual packets in the traffic dumps?

 

From the traffic logs:

[Screenshot: 2022-06-07_101819.png]

 

PA packet capture, from port 54840 to server:

    - 41 packets from client (including packet that goes missing, 6 retransmissions, and TCP RST)

    - 27 packets from server.

 

Internal router packet capture, from port 54840 to server: 

    - 33 packets from client (missing final exec job start, 6 retransmissions, TCP RST)

    - 38 packets from server (10 TCP keepalives because the server never got the ACK that should have come in the missing packet, and a TCP RST due to no ACK responses)

 

PA packet capture, from port 54843 to server:

    - 16 packets from client (including packet that goes missing, 6 retransmissions, and TCP RST)

    - 6 packets from server.

 

Internal router packet capture, from port 54843 to server:

    - 8 packets from client (missing final exec job start packet, 6 retransmissions, TCP RST)

    - 17 packets from server (10 TCP keepalives and a TCP RST)

 

PA seems to drop the exec job start packet and all subsequent traffic in the session (though the session still seems to be active in the CLI).
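The packet counts above can be cross-checked with a little arithmetic. The following is only a sketch: the numbers are copied from the two captures, the dictionary keys are made-up labels, and it assumes the "8 packets" line for port 54843 refers to client packets. If the only difference between the two capture points is the dropped EXEC packet, its 6 retransmissions and the client RST on one side, and the 10 keepalives plus server RST on the other, every remainder should come out to zero.

```python
# Packet counts copied from the two test sessions described above.
# Seen on the PA but not on the router: EXEC + 6 retransmissions + client RST.
CLIENT_ONLY_ON_PA = 1 + 6 + 1
# Seen on the router but not on the PA: 10 keepalives + server RST.
SERVER_ONLY_ON_ROUTER = 10 + 1

# Keys are hypothetical labels; "rtr" is the internal router capture.
sessions = {
    54840: {"pa_client": 41, "pa_server": 27, "rtr_client": 33, "rtr_server": 38},
    54843: {"pa_client": 16, "pa_server": 6, "rtr_client": 8, "rtr_server": 17},
}


def unexplained(counts):
    """Return (client, server) packet differences not explained by the drop."""
    client = counts["pa_client"] - CLIENT_ONLY_ON_PA - counts["rtr_client"]
    server = counts["rtr_server"] - SERVER_ONLY_ON_ROUTER - counts["pa_server"]
    return client, server


results = {port: unexplained(c) for port, c in sessions.items()}
for port, (client, server) in results.items():
    print(f"port {port}: unexplained client={client}, server={server}")
```

Both sessions balance exactly, which is consistent with everything before the EXEC packet passing cleanly and nothing after it getting thru in either direction.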

Community Team Member

Hi @Adrian_Jensen ,

 

I believe @aleksandar.astardzhiev means to check the global counters as explained here to help you isolate the issue.

 

With the command "show counter global" you will see ALL the counters for ALL the traffic, so it's best to set up a filter to isolate the counters that can help you troubleshoot an issue.

 


 

Cheers,

-Kiwi.

 

Hey @Adrian_Jensen ,

Apologies for not being very clear. Please try to follow the steps from the link provided by @kiwi , I am interested to see the output.

When you created the custom application for the override, did you select a parent application, or leave it as "None"?

Ah... interesting... I had no idea you could use the packet monitor filter settings to filter global counters to a specific source/destination.

 

Did another test and packet capture (using the receive stage with a filter for source/destination on any ingress interface) - capture shows same details as previous tests. This time I also added a packet monitor for the drop stage - no resulting captured packets.

 

adrian.admin@PA-3020-Dr> show counter global filter packet-filter yes delta yes

Global counters:
Elapsed time since last sampling: 69.130 seconds

name                            value rate severity category aspect   description
--------------------------------------------------------------------------------
pkt_outstanding                   416    6 info     packet   pktproc  Outstanding packet to be transmitted
pkt_alloc                         490    7 info     packet   resource Packets allocated
pkt_inconsist                     129    1 info     packet   pktproc  Packet buffer pointer inconsistent
session_allocated                   4    0 info     session  resource Sessions allocated
session_installed                   4    0 info     session  resource Sessions installed
session_unverified_rst              2    0 info     session  pktproc  Session aging timer modified by unverified RST
flow_fwd_mtu_exceeded              55    0 info     flow     forward  Packets lengths exceeded MTU
flow_dos_rule_nomatch               4    0 info     flow     dos      Packets not matched DoS policy
flow_ipfrag_frag                  110    1 info     flow     ipfrag   IP fragments transmitted
flow_host_pkt_xmt                 300    4 info     flow     mgmt     Packets transmitted to control plane
flow_host_vardata_rate_limit_ok   245    3 info     flow     mgmt     Host vardata not sent: rate limit ok
flow_tunnel_ipsec_esp_encap       129    1 info     flow     tunnel   Packet encapped: IPSec ESP
flow_tunnel_encap_resolve         129    1 info     flow     tunnel   tunnel structure lookup resolve
flow_tcp_cksm_sw_validation       116    1 info     flow     pktproc  Packets for which TCP checksum validation was done in software
appid_override                      2    0 info     appid    pktproc  Application identified by override rule
appid_proc                          2    0 info     appid    pktproc  The number of packets processed by Application identification
dfa_sw                             12    0 info     dfa      pktproc  The total number of dfa match using software
ctd_sml_exit_detector_i             2    0 info     ctd      pktproc  The number of sessions with sml exit in detector i
appid_bypass_no_ctd                 2    0 info     appid    pktproc  appid bypass due to no ctd
ctd_handle_reset_and_url_exit       2    0 info     ctd      pktproc  Handle reset and url exit
ctd_run_detector_i                  2    0 info     ctd      pktproc  run detector_i
ctd_fwd_err_tcp_state               2    0 info     ctd      pktproc  Forward to varrcvr error: TCP in establishment when session went away
aho_sw_offload                     15    0 info     aho      pktproc  The total number of software aho offload
ctd_pscan_sw                       18    0 info     ctd      pktproc  The total usage of software for pscan
ctd_appid_reassign                  2    0 info     ctd      pktproc  appid was changed
ctd_process                         2    0 info     ctd      pktproc  session processed by ctd
ctd_pkt_slowpath                   12    0 info     ctd      pktproc  Packets processed by slowpath
--------------------------------------------------------------------------------
Total counters shown: 27
--------------------------------------------------------------------------------

 

No drop severity shown in the delta over the test period. Also ran deltas shortly before and after the test with no drops reported. When I created the custom application I set the categories to the same as the normal mssql-db-unencrypted, but I left ParentApp as "none" and Risk as "1".
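For anyone repeating this across several test runs, scanning the counter output by eye gets tedious. Below is a minimal sketch, not an official tool, that parses saved "show counter global" text and lists any counter whose severity is not "info". The sample rows are copied from the output above; the column layout (name, value, rate, severity, category, aspect, description) is assumed from that output, and the function names are made up for illustration.

```python
import re

# A few rows copied from the filtered counter output above.
SAMPLE = """\
name value rate severity category aspect description
flow_fwd_mtu_exceeded 55 0 info flow forward Packets lengths exceeded MTU
flow_ipfrag_frag 110 1 info flow ipfrag IP fragments transmitted
ctd_fwd_err_tcp_state 2 0 info ctd pktproc Forward to varrcvr error: TCP in establishment when session went away
"""

# name, value, rate, severity, category, aspect, free-text description
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")


def parse_counters(text):
    """Parse counter rows; header, separator and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, value, rate, severity, category, aspect, desc = match.groups()
            rows.append({
                "name": name,
                "value": int(value),
                "rate": int(rate),
                "severity": severity,
                "category": category,
                "aspect": aspect,
                "description": desc,
            })
    return rows


def non_info(rows):
    """Return counters whose severity is anything other than 'info'."""
    return [row for row in rows if row["severity"] != "info"]


rows = parse_counters(SAMPLE)
flagged = non_info(rows)
print(f"{len(rows)} counters parsed, {len(flagged)} with non-info severity")
```

On the rows above nothing is flagged, which matches the observation that no drop-severity counters incremented during the test.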

Hey @Adrian_Jensen ,

I cannot say I fully understand the output, so the only thing I noticed is the "Packets lengths exceeded MTU" counter, but I would assume this counter reflects the packets to the GP client, while the EXEC command should be from the GP user, right? This shouldn't be a problem unless the DF flag is set, but nevertheless, can you check what MTU is used by your GP clients? - https://docs.paloaltonetworks.com/globalprotect/5-2/globalprotect-app-new-features/new-features-rele... (near the bottom of the page it explains how to confirm the MTU value)

 

I assume it will be difficult to organize, but is there a way to test with a client that is not connected with GP but whose traffic still passes through the firewall? Connect a laptop directly to the FW and test the EXEC command again, or have the user connect to a different network that still routes through the firewall to reach the server.

 

Going back to the packet capture - Which stages are you configuring for the packet capture on the firewall? - https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClTJCA0

As mentioned in the link, the receive stage will show the packets ingressing the firewall, while the transmit stage will show the same packets egressing the firewall. Try capturing the two stages to separate files. Can you confirm that you see the EXEC command packet in the "receive" file but not in the "transmit" file?

L6 Presenter

GP MTU is set to the default 1400. Most of the packets are smaller than that, with a few at 1414 bytes (1400 plus Ethernet header). The packet being dropped is only 232 bytes total (218 plus Ethernet header). There are multiple 200-300 and 1414 byte packets immediately before it that go thru.
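The size arithmetic can be written out as a quick check. This is just a sketch using the figures quoted above, and it assumes a plain 14-byte Ethernet II header with no VLAN tag on the captured frames.

```python
ETHERNET_HEADER = 14  # bytes; assumes untagged Ethernet II framing
GP_MTU = 1400         # GlobalProtect MTU quoted above


def ip_length(frame_length):
    """IP-layer size of a captured frame, excluding the Ethernet header."""
    return frame_length - ETHERNET_HEADER


# Frame sizes as seen in the packet capture.
frames = {"largest packets": 1414, "dropped EXEC packet": 232}

for label, frame_len in frames.items():
    ip_len = ip_length(frame_len)
    print(f"{label}: {ip_len} bytes at IP layer, fits MTU: {ip_len <= GP_MTU}")
```

The dropped packet is far below the MTU, so packet size alone does not look like the cause.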

 

I was capturing in the receive and drop stage on the PA, and another capture on my core router to/from the destination server. Weirdly... the dropped packet does show up in a transmit stage capture on the PA... Not sure what to make of that when it doesn't show up at the server.

 

Unfortunately, I don't really have a user segment (other than the VPN clients) that go thru the PA to get to the internal DB servers to test the dropped packets. I will have to think a bit if that is possible. The only thing I can think of is from the DMZ, but that isn't really setup for users to do SQL queries from, just DMZ servers to specific internal services.

 

 

 

Hi @Adrian_Jensen 

My idea was to eliminate possible issues with traffic from GP, but if the EXEC command can be seen in the firewall's transmit packet capture, it is probably better to confirm this as the next step.

In your original post you mentioned that you were capturing on the core switch after the firewall. I would assume a span/mirror port, right? Which port did you monitor? Can you monitor the port on the switch where the traffic egresses the firewall and check whether the EXEC command visible in the transmit capture on the FW is also visible ingressing the switch?

 

What version are you running?

 

L6 Presenter

Yes, my capture on the primary site core switch/router was on a VLAN spanning multiple ports that delivers traffic to our internal ACI/server hosting segment. The PA is at our backup site, connected to our primary site by multiple PtP circuits between the backup site core and the primary site core. I am waiting for the user to test again connected to our primary site PA VPN just to make sure it's not dropping across the PtP... but I don't see what could possibly be affecting it there. Then I am going to set up more packet captures at the ingress/egress of every device in the path.

 

Currently running 9.1.13-h3 at both primary and backup sites.

 

L6 Presenter

Had a chance to run more tests and packet captures, and also tested thru an alternate VPN gateway. Same result: the PA shows the SQL EXEC packet but doesn't pass it to the internal router:

  [internet] <-VPN-> [PaloAlto (*1*2)] <--> [Core Router (*3*4)] <--> [SQL Server]

 

*1 - PaloAlto RX capture shows SQL client packets to and from SQL server, last client packet with "EXEC msdb.dbo.sp_start_job ..." command followed by multiple client retransmissions, no further response from server.

 

*2 - PaloAlto TX capture shows SQL client packets to SQL server, including last EXEC packet and retransmissions. Does not show server response packets.

 

*3 - Cisco core router capture on interface connected to PaloAlto shows SQL client packets to and from SQL server. Client packets end at ACK to previous SQL command/response before last client EXEC packet. No retransmissions seen. Server sending keep-alive packets to client.

 

*4 - Cisco core router capture on interface connected to SQL server network. Same packets as *3.

 

No packets in a PaloAlto drop capture. PA seems to be ending the session when it hits the final EXEC statement.
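The four capture points can be treated as an ordered path to narrow down where the EXEC packet disappears. The sketch below only encodes the observations from the *1 to *4 notes above; the function name and data layout are hypothetical.

```python
# Capture points in path order (client side first) and whether the final
# EXEC packet was seen there, per the *1..*4 notes above.
observations = [
    ("*1 PaloAlto RX", True),
    ("*2 PaloAlto TX", True),
    ("*3 Core router, PaloAlto-facing interface", False),
    ("*4 Core router, server-facing interface", False),
]


def locate_loss(points):
    """Return (last point that saw the packet, first point that did not)."""
    last_seen = None
    for name, seen in points:
        if not seen:
            return last_seen, name
        last_seen = name
    return last_seen, None  # packet was seen at every capture point


last_seen, first_missing = locate_loss(observations)
print(f"Last seen at: {last_seen}")
print(f"First missing at: {first_missing}")
```

This narrows the gap to somewhere between the PaloAlto transmit-stage capture and the router's PaloAlto-facing interface. It does not say which side of that gap is responsible.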
