Has anyone had a PA-3050 stop processing traffic? Our PA-3050 started dropping all traffic today (internet access, DMZ, etc.); we failed over to the standby unit and were able to restore service.
We currently have a support ticket open, but wanted to know if anyone here has had a similar experience. Thanks!
We are having a similar issue. Three times in two weeks, our primary 3050 running 5.0.8 has stopped passing all traffic in v-wire mode and would not fail over automatically. We forced a failover to the passive box, rebooted fw1, and failed back to it, only to have it happen again a few weeks later. We are currently pushing all traffic through the backup 3050 until PAN comes up with a recommendation or fix.
I've had this happen several times as well, most recently today. We also have an HA cluster, and the primary (active) firewall suddenly stops routing traffic while the secondary (passive) does not try to take over and acts like everything is OK.
I had to reboot the primary firewall for the secondary to become active, and I would like to avoid this happening again.
I've sent tech dumps to our support contact, so hopefully I'll get a better answer than "I'm sure this won't happen again" this time. Running 5.0.11, by the way.
Could you please check the brdagent log through the CLI and verify whether it matches the symptoms below?
A. PAN> less dp-log brdagent.log
1. Error messages like: "need to reset ocelot link as XX error packets seen"
2. link flap messages similar to above.
3. XGE link errors.
4. Check system logs for critical logs indicating “DP packet descriptor leak detected on slot 1 dp0” or similar.
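If you've already pulled a copy of the log off the box (for example via scp export, or by pasting the output of less dp-log brdagent.log into a file), a quick sketch for scanning it for the symptoms above follows. The filename and grep patterns are assumptions based on the messages quoted in this thread, not an official check:

```shell
# Hypothetical helper: scan an exported brdagent.log for the symptoms above.
# Assumes the log has already been copied off-box to the current directory.
LOG=brdagent.log

# 1. Ocelot link resets caused by error packets
grep -i "need to reset ocelot link" "$LOG"

# 2. Link flap messages
grep -i "flap" "$LOG"

# 3. XGE link errors
grep -i "xge.*error" "$LOG"
```

Each grep prints any matching lines; no output for a pattern means that symptom wasn't seen in the exported log.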
Jumbo frames are not active on our HA links. I forgot to mention that this is a 3020 cluster.
After contacting support, their best suggestion was to upgrade to 5.0.12 or 6.0.2.
I have upgraded both to 5.0.12 tonight, but I have a feeling that the bug fixed in 6.0.2 better describes our situation, so I will upgrade the firewalls to 6.0.2 tomorrow.
Fixed in 5.0.12/6.0.1: When High Availability Active/Passive peers lost communication on HA1 and HA2 links, a race condition caused the dataplane to restart.
Fixed in 6.0.2: Fixed an issue with PA-3000 Series devices where traffic could stop passing through the firewall or the dataplane could restart due to an internal path monitoring failure.
Apparently we had encountered the same behavior on our 3020 cluster running 6.0.0. The device stopped processing any traffic and there was no evidence in the logs of what happened. Restarting the dataplane seems to be a working solution for us, but I'd like to know what the cause was.
I also checked dp-log brdagent.log as HULK mentioned above, and we had these messages:
- Flapping Ocelot link 1
- Error: poll_func(3000/osprey_oct.c:164): Need to reset ocelot link as 51 error packets seen!
Does that mean I should run the debug dataplane fpga set sw_aho yes command on our boxes, or simply go for 6.0.2?
This was the suggestion given us by support as well.
However, they also said that this change is lost each time the firewall reboots, which I would consider less handy; it's a workaround rather than a permanent fix.
What we're going to do from now on is keep a computer connected to the failing device via the console, logging continuously in case the issue occurs again.
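For the continuous console logging, one possible sketch is to wrap the serial console client in script(1) so every byte is written to a dated log file. The device path, baud rate, and the choice of cu as the client are all assumptions; substitute whatever matches your serial setup:

```shell
# Sketch only: capture a serial console session to a dated log file.
# /dev/ttyUSB0 and 9600 baud are assumptions; adjust for your adapter.
# -f flushes after each write so the log survives a crash of the host.
script -f -c "cu -l /dev/ttyUSB0 -s 9600" "pa-console-$(date +%F).log"
```

That way, if the firewall stops passing traffic again, whatever it printed to the console in the minutes before is already on disk.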
It's been under a month since it last happened (PANOS 5.0.11) and the issue still persists on PANOS 6.0.2.
If you have any further suggestions, I would love to hear from you.