08-27-2012 10:28 AM
I recently ran into an issue and wanted to see whether anyone else has had something similar happen to them.
We recently ran into an issue where our active firewall tanked and transferred responsibility to its peer. Everything was working as it should, so I contacted support to find out what the cause could have been. After looking at the tech support files, they discovered that it was a memory leak issue in the 4.1.5 release and said we should upgrade to 4.1.7 because apparently it fixes "hundreds of memory leak issues". So we upgraded and everything was working fine... for about 2 hours. I then tried to access the CLI and GUI of the active firewall but was unable to. However, the passive was working fine AND the data plane on the active was still working as well. After I did a tac-login with a challenge/response so the tech could have root access to my box, he was able to restart the authd service. It turns out there is yet another issue in 4.1.7, a race condition where lots of log queries happening at the same time cause the authd service to fail. This is where h2, or hotfix 2, comes in and fixes the issue.
Is it me, or does Palo Alto break something that was working in the previous release every time they release a new code version? I've been dealing with this exact scenario since the 4.0.x days, and frankly, it's getting annoying having to upgrade our firewalls every 6 weeks when they release new code.
09-26-2012 01:07 AM
Thinking of the following fix in 4.1.8?
43575 – After upgrading to PAN-OS 4.1.7, the management interfaces (web interface and
CLI) sometimes stopped responding due to a conflict between the authentication and
09-26-2012 03:13 AM
That's the reason I upgraded from 4.1.7 to 4.1.8...
09-27-2012 02:51 AM
I also had unresponsive management (GUI, CLI and Serial) on the active box of a 2050-cluster just a few days after upgrading to 4.1.7.
A reboot was the only solution.
The issue has not yet occurred on any of our 4000 or 5000 clusters that were upgraded to 4.1.7.
Can I conclude that the issue affects only the 2000 series?
Is it possible to restart the authd or mgmt service on the active unit via the CLI of the passive unit?
It would be nice to have a techdoc from the TAC on this issue, which seems to impact lots of people...
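In the meantime, for anyone who wants to notice a dead management plane before they need it, here is a minimal sketch of an external probe. It is not anything from Palo Alto or the TAC. It assumes the web interface listens on TCP 443 and the CLI on SSH port 22, and the function names are made up for illustration. Note that it only checks whether the ports accept a TCP connection; a hung management process might still accept connections, so treat this as a coarse signal at best.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def management_status(host, ports=(443, 22), timeout=3.0):
    """Map each assumed management port to whether it accepted a connection."""
    return {port: port_accepts_connections(host, port, timeout) for port in ports}
```

You could call `management_status("fw-active.example.com")` (hypothetical hostname) from a cron job or monitoring host and alert when any value comes back False.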
09-27-2012 03:03 AM
The loss of management CLI/GUI access is a software issue across all the hardware platforms, not just the PA-2000 series.
A possible workaround is to restart the masterd daemon by logging into the shell.
Please contact TAC by opening a ticket since it will require a challenge/response.
09-27-2012 03:18 AM
A possible workaround is to restart the masterd daemon by logging into the shell....
Sorry to say, but that's impossible since the CLI is unresponsive... doesn't matter if you telnet or connect via serial.