Upgrade to PAN-OS 8.0.11 causes device restart loop


L4 Transporter

I upgraded an HA pair of PA-5220 firewalls from PAN-OS 8.0.7 to PAN-OS 8.0.11. Once the firewalls booted, they would run for about 5 minutes, alarm (red LED on the device), and then reboot, over and over. Even with only one firewall running PAN-OS 8.0.11, it would eventually alarm and reboot. Thankfully the devices stayed up long enough to revert the software to PAN-OS 8.0.7, which came up without any issues.

 

I know it's newer software, but it's a point release and is now recommended due to Palo Alto Networks Security Advisory PAN-SA-2018-0003. I am going to open a ticket with Palo Alto tomorrow for follow-up, but I would certainly be wary of this version if you have a similar configuration.

 

- Matt

13 REPLIES

Cyber Elite

Do you know what alarm was actually being thrown by the system? I've upgraded a pair of 3220s to 8.0.11 without any issues, however they were already running 8.0.10. 

I didn't get that far: by the time I drove into the office I was well into my maintenance window and just needed to get the network back up (there should have been no outage due to HA). I've upgraded tons of HA clusters in the past and have never seen this. Even a single firewall with the HA pair turned off would continuously boot to a red alarm and then restart.

 

In the meantime I opened a ticket with Palo Alto and am already planning to migrate to 8.0.10 (which I have running on other non-HA firewalls) to see if I have similar issues. I've hit new bugs on upgrades before, but never reboot issues.

 

On a side note, it's good to hear it worked for you.

I just experienced the same kind of problem, PA-5220 and 8.0.11, with freshly upgraded member restarting. I will open a ticket and in the meantime revert to 8.0.10 (which had no problem in the last couple of weeks of uptime).

@michelealbrigo,

If you could pull the logs prior to the revert for TAC to look at, that's likely going to be the most helpful. I haven't heard any rumblings about this being a widespread issue, but I'll reach out and see if my contacts in TAC are noticing this more with 8.0.11.

I sent the tech support file from 8.0.11, taken before reverting to 8.0.10, to our partner, and they should have opened a ticket. Our log server had some unrelated problems during the upgrade and was unavailable to the firewalls, so I might try another round to check whether the unavailable log server was part of the equation (e.g. saturation of some kind of buffer on the firewall) and report my findings to support.

Also seen here. Two PA-5060s upgraded to 8.0.11 from 8.0.9.

 

The first one started running 8.0.11 at 05:13 this morning; its data plane went 'bang' at 09:10. The switchover went fine, with no users reporting issues; the second data plane went 'bang' at 11:16. It might just be related to the volume of work.

 

The specific error in the normal logs is "Dataplane down - too many dataplane processes exited."

I also have a "gdb: 2 tracked gdbs, calling early dp down fail" right before the message you posted. It might be a difference between the two platforms anyway (5000 vs 5200).

Update for the rest: I haven't tested 8.0.11 with a working log server, I've been in contact with TAC and I am providing (or at least trying to provide) them with more data.

Something very similar:

 

all_pktproc_7: got max gdb failure event, telling all group to restart 

gdb: 2 tracked gdbs, calling early dp down fail

gdb: 3 tracked gdbs, calling early dp down fail

gdb: 3 tracked gdbs, calling failure event

 

I just picked the "Dataplane down" message as a good event to take the timestamp from.
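For anyone comparing notes before contacting TAC, here is a minimal Python sketch for pulling the messages quoted above out of a log export. It only does plain substring matching on the fragments posted in this thread. The function name and the idea of feeding it a list of text lines are my own assumptions; it does not rely on any particular PAN-OS log format or file name.

```python
# Message fragments quoted in this thread. How the full log lines are
# formatted (timestamps, fields) is NOT assumed here.
SIGNATURES = (
    "Dataplane down",
    "tracked gdbs",
    "got max gdb failure event",
)


def find_dataplane_events(lines):
    """Return (line_number, text) for each line containing a known fragment.

    `lines` is any iterable of strings, e.g. an open text file
    (hypothetical helper, not part of any Palo Alto tooling).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(signature in line for signature in SIGNATURES):
            hits.append((number, line.strip()))
    return hits
```

Usage would be something like `find_dataplane_events(open("exported_log.txt"))`, where the file name is just a placeholder for whatever text export you have on hand.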

Hi Everyone,

Just wanted to make everyone aware that PAN pushed out 8.0.11-h1 to address PAN-99380. The engineers believe that the reason the dataplane stopped responding post update was due to how the firewall handled receiving fragmented packets specifically coming across tunnel interfaces.

This partly explains why some customers experienced the issue while others did not: the firewalls that I upgraded and the firewalls in my lab environment don't have tunnels at all.
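If you keep an inventory of firewall versions, a rough Python sketch like the one below could flag the boxes still on the base 8.0.11 build. It assumes version strings look like "8.0.11" or "8.0.11-h1", and it only flags exactly 8.0.11 without a hotfix, since that is the only build this thread ties to the restarts. Whether other releases are affected by PAN-99380 is not something I can confirm. The function names are made up for illustration.

```python
import re


def parse_panos_version(version):
    """Split a string such as '8.0.11-h1' into (major, minor, maint, hotfix).

    A missing '-hN' suffix is treated as hotfix 0. The string format is
    an assumption based on the versions mentioned in this thread.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-h(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, maint, hotfix = match.groups()
    return (int(major), int(minor), int(maint), int(hotfix or 0))


def needs_hotfix(version):
    """True only for base 8.0.11, the build reported here as restarting."""
    return parse_panos_version(version) == (8, 0, 11, 0)
```

This is deliberately conservative: it answers "is this box on the build people in this thread had trouble with", not "is this box vulnerable".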

Saw that release, good to have a probable cause and fix.

 

Rob

L0 Member

Hello all

 

Can someone confirm that the hotfix 8.0.11-h1 works without further problems after the upgrade?

 

Thanks

 

 

I am currently using it on an active/standby 5220 pair with no problems at all. Let's say it works just as 8.0.10 did; these deployments have so many clients that one can't really keep track of "micro-problems". In any case, no restarts. I installed it a couple of days after the -h1 release and have had it running since.

@TBitzi,

Can also confirm that my remote offices are having no issues with 8.0.11-h1. 
