Post 7.0x upgrade intermittent SSL traffic hangs when being decrypted


L1 Bithead

Hi

 

We have noticed this with two customers and on our own PAs. All of these are PA-3020s in an HA a/s setup.

SSL decrypted outbound traffic hangs intermittently for a few minutes and then it starts to pass through again.

 

This happens both with 7.0.1 and 7.0.2 

 

Has anyone else seen this issue?

It's kind of hard to work with support on this since it's intermittent.

 

regards

Gudmundur

59 REPLIES

L2 Linker

Yes, we have a ticket open regarding this right now. What is happening is the FPTCP buffer is filling up and not releasing like it should. Once this happens, the SSL Proxy engine drops packets until the buffer finally clears. We don't yet know what the true cause of the buffer filling up is.

 

You can verify you are experiencing the same bug by SSHing into the firewall and issuing the command "debug dataplane pool statistics". There you will see the line "FPTCP segs" (###): ###/###

The 3rd number is the max amount of buffer and the 2nd number is how much is left. When the issue occurs, you will notice the 2nd number stuck at 1. Eventually it will release on its own and traffic will flow again. Alternatively if you have an HA pair, you can fail over and it will immediately resolve.
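If you save the command output to a file or capture it over SSH, the check above can be scripted. This is only a rough sketch: the regular expression assumes the line looks like the "FPTCP segs (###): ###/###" layout described here, the sample numbers are made up, and the names `parse_fptcp` and `looks_stuck` are hypothetical helpers, not anything from PAN-OS. Adjust the pattern if your output is laid out differently.

```python
import re
from typing import NamedTuple, Optional


class PoolUsage(NamedTuple):
    free: int   # 2nd number on the line: how much buffer is left
    total: int  # 3rd number on the line: max amount of buffer


# Assumed layout, taken only from the description in this thread:
#   FPTCP segs (###): ###/###
_FPTCP_RE = re.compile(r"FPTCP segs\s*\(\s*\d+\s*\)\s*:\s*(\d+)\s*/\s*(\d+)")


def parse_fptcp(output: str) -> Optional[PoolUsage]:
    """Return the FPTCP free/total counts, or None if the line is absent."""
    match = _FPTCP_RE.search(output)
    if match is None:
        return None
    return PoolUsage(free=int(match.group(1)), total=int(match.group(2)))


def looks_stuck(usage: PoolUsage, threshold: int = 1) -> bool:
    """True when the free count is at or below the threshold (stuck at 1)."""
    return usage.free <= threshold


if __name__ == "__main__":
    # Made-up sample line for illustration, not real firewall output.
    sample = "FPTCP segs (192): 1/16384"
    usage = parse_fptcp(sample)
    if usage is None:
        print("FPTCP segs line not found")
    else:
        print(usage, "stuck" if looks_stuck(usage) else "ok")
```

Running it a few times during a hang and again once traffic recovers should show whether the free count is sitting at 1 while SSL traffic is stalled.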

edit: our ticket is 00379855 and the bug id is 84781, if you'd like to reference those with support.

I'm seeing the same thing here on 7.0.1. I was hoping 7.0.2 would've fixed it, but I guess not. I just opened a ticket with PA and referenced the ticket @ITCMPHC has. Hopefully it'll be fixed soon; it's quite annoying when it happens.

I've been told this will be fixed in 7.0.3 which is tentatively scheduled for Oct 19. 

Thanks for the update, guys.

 

I hope that 7.0.3 will fix this.

 

regards

Gudmundur

We took the plunge and upgraded to 7.0.3, and it doesn't look like this was fixed, at least in our case. I reopened our ticket with PA; we'll see what they say.

I upgraded last night and am still having the issue. I've reported it to support as well.

We are also having the same issue.

 

I've got an open TAC case with log files and information. Hopefully this will be fixed ASAP. We rely heavily on SSL decryption.

Yeah, this is really disappointing, especially after they confirmed multiple times that it was fixed in 7.0.3. I don't really want to wait another month+ for the next software release for yet another fix. Sadly, there are some other fixes we needed in 7.0.3 that stop us from going back.

I'm still having this same issue as well.

Just as an addition, I had a "dataplane under severe load" happen again this morning. The System Resources applet on the Dashboard didn't register much, but the logs showed the CPUs were at 100% for well over 20 mins. This is also with 7.0.3, which should have fixed it. Spent 45 mins on the phone with support and they gathered the log files to look at causes, but did say that our box should not behave this way given the size of our organization.

 

As we just upgraded from a 3020 to a 5050 to fix this issue, we are NOT happy about this. Palo has really slipped with the 7.0.x code. If you spend $100,000, you should NOT expect these kinds of issues.

 

Rant over.

Dannon

 

 

 

Support says this is now being "tracked" as part of 7.0.4 which is slated for a December release.

Palo's QA has really taken a back seat recently...

L3 Networker

Yeah, I am disappointed by this bug.  I installed 7.0.0 initially and ran for a month or 2 before seeing 7.0.1.  I also noticed at that time that 7.0.0 had been pulled.  Installed 7.0.1 and ran into issues with the SSL.  Saw 7.0.2 but read about lots of bugs, some new with SSL and dataplane issues too.  Was told by support to NOT go to 7.0.2 and wait for 7.0.3.  Finally went to that and still have the SSL issues.

 

We are very disappointed here with what's been going on.  SSL decryption is a big thing, and to have it borked up is poor QA for sure.

 

On a slightly better note, support got back to me with a temporary workaround for our issue. Basically, they had me create a new DoS protection profile and policy which only applies to in/out traffic on service-https. They said something about it fixing the issue in testing and for other affected customers. Note this is a temp fix until they get a proper update rolled out.

 

 

Dannon, would you please post more details on the temp fix?
