VM100 keeps rebooting

DavePalo · 08-14-2014

Hi all,

We have a Palo Alto VM-100 running under ESXi 5.0 which up until this week has been rock solid.

On Monday it rebooted itself. No config changes had been made for almost a month prior to this. It also rebooted itself twice yesterday and once so far today.

The log messages are below, newest first (read from bottom to top).

Autocommit job failed

Dataplane is now up

The system is starting up.

The system is shutting down.

data_plane: restarts exhausted, rebooting system

The dataplane is restarting.

supervisor: Exited 1 times, must be manually recovered.

tasks: Exited 1 times, must be manually recovered.

all_task_2: Exited 4 times, must be manually recovered.

PAN-DB cloud list loading failed (ERROR:Couldn't resolve host name).

all_task_3: exiting because missed too many heartbeats

all_task_2: exiting because missed too many heartbeats

all_task_3: exiting because missed too many heartbeats

all_task_2: exiting because missed too many heartbeats

all_task_3: exiting because missed too many heartbeats

all_task_2: exiting because missed too many heartbeats

all_task_3: exiting because missed too many heartbeats

all_task_2: exiting because missed too many heartbeats
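
For anyone reading along: the pasted log is newest first, so it helps to flip it and count which tasks missed heartbeats before the dataplane restart. Below is a minimal Python sketch, assuming the lines are pasted exactly as above. The function names are made up for illustration and this is not a PAN-OS tool.

```python
import re
from collections import Counter

# Log lines as pasted in the post, newest first (read bottom to top).
LOG_NEWEST_FIRST = """\
Autocommit job failed
Dataplane is now up
The system is starting up.
The system is shutting down.
data_plane: restarts exhausted, rebooting system
The dataplane is restarting.
supervisor: Exited 1 times, must be manually recovered.
tasks: Exited 1 times, must be manually recovered.
all_task_2: Exited 4 times, must be manually recovered.
PAN-DB cloud list loading failed (ERROR:Couldn't resolve host name).
all_task_3: exiting because missed too many heartbeats
all_task_2: exiting because missed too many heartbeats
all_task_3: exiting because missed too many heartbeats
all_task_2: exiting because missed too many heartbeats
all_task_3: exiting because missed too many heartbeats
all_task_2: exiting because missed too many heartbeats
all_task_3: exiting because missed too many heartbeats
all_task_2: exiting because missed too many heartbeats
"""

# Matches "<task>: exiting because missed too many heartbeats".
HEARTBEAT = re.compile(r"^(\S+): exiting because missed too many heartbeats$")


def chronological(text):
    """Return the non-empty log lines, oldest first."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return lines[::-1]


def heartbeat_misses(lines):
    """Count heartbeat-miss exits per task name."""
    counts = Counter()
    for ln in lines:
        match = HEARTBEAT.match(ln)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    ordered = chronological(LOG_NEWEST_FIRST)
    for ln in ordered:
        print(ln)
    print(dict(heartbeat_misses(ordered)))
```

On the log above this gives four heartbeat-miss exits each for all_task_2 and all_task_3 before the dataplane restart and the reboot.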

I logged a call over 24 hours ago with our support company but so far nobody has been able to offer any assistance at all.

I'm hoping somebody on here has maybe seen this before? Any help would be very much appreciated!

Many thanks,

Dave

Palo Alto VM-100

Software version 5.0.11

Application version 450-2330

Antivirus version 1346-1817

URL Filtering version 2014.08.13.411

Retired Member · 08-25-2014

You could try a reset from maintenance mode if you have a config backup and don't mind losing the logs.

I saw this problem on my VM too. I fixed it by reverting back.

hshah · 08-25-2014

Hi Dyoung,

Please provide the output of "show system files". That will confirm whether the reboots generated any crash/core files. If so, it should be easier to find the root cause.

Regards,

Hardik Shah

Tician · 09-29-2014

Hi Dave,

I'm facing the same problem on a VM-100 after upgrading to 5.0.14. Did you get any response from support?

Regards,

Predrag

ssharma · 09-29-2014

Hi DYoung,

Can you confirm whether you are using an AMD processor on your VM host? If so, there is a known issue that we have identified, and the fix is scheduled for an upcoming release. If you are not using AMD and are still seeing multiple crashes, I would suggest opening a case with PA support for further analysis. Hope this helps. Thank you.

Tician · 09-30-2014

Hi ssharma,

No, my VM is running on Intel server infrastructure, so I don't think that is the issue. But I have found what triggers this crash and the behavior leading up to it. After one of my GP clients successfully connects to the gateway, his GP client and computer (and only his, I don't know how) trigger "exiting because missed too many heartbeats", then "Exited 4 times, must be manually recovered", then "The dataplane is restarting", and finally "data_plane: restarts exhausted, rebooting system".

We tested this to be sure, and only this client causes it. It was first seen on version 5.0.14, but the same thing happens on 6.0.5. I have opened a support case with our local certified distributor and supplier.

It always happens after he connects to the GP gateway... very, very strange.

Regards,

Predrag

ssharma · 09-30-2014

Hi Predrag,

That was a nice observation, but it is still weird that one particular user would cause it to crash. Since you have already opened a case, the engineer should be able to find the root cause and a possible solution or workaround. Please update this thread once you have an answer from the engineer so that other users can see it. Thank you.

DavePalo · 02-06-2015

Hi all,

Thank you for your replies. I'm sorry I haven't responded sooner - I've not logged in for a while.

Our VM-100 rebooting stopped after we upgraded to 5.0.14. Back in August this was escalated all the way up to Palo Alto development and still no reason for these reboots could be found.

The case was closed because the problem seemed to have gone away, not because the root cause was discovered.

Unfortunately on Tuesday this week our VM-100 started rebooting again. I have upgraded to 5.0.15 in the faint hope that this will help and contacted our support partner, but not heard anything back yet.

Did any of you discover the cause of this issue please? We are not using GP so I don't think it can be related to that.

Also, we are running on an Intel Xeon E5-2650v2, and "show system files" shows no files, just an empty crashinfo directory.

Many thanks,

Dave

Tician · 02-06-2015

Hi dyoung,

Yes, the problem has been identified and isolated. My support case took too long, but in the end support said they had reproduced the crash in a testbed environment while tracing another case.

Here is their answer:

A time-of-check-to-time-of-use race condition causes a buffer overflow that trashes a mutex. The mutex will not get unlocked causing a crash.

A fix was coded to fix this race condition & buffer overflow.

Our QA team is currently testing this code and once approved this code will be introduced in the new PanOS versions. I will update you as soon as I get more confirmation on when we can expect this fix to be released.

The fix is scheduled for release with PanOS version 6.1.3 which should be released somewhere near mid-March.

Backport to 6.0.x software is still pending.
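
For readers unfamiliar with the terms in support's answer, here is a toy Python sketch of a time-of-check-to-time-of-use race that overflows a buffer and clobbers an adjacent lock byte. Everything in it (the memory layout, the function names, the scripted interleaving) is an assumption for illustration. It is not PAN-OS code and says nothing about where the real bug lives.

```python
BUF_SIZE = 8
LOCK_OFFSET = BUF_SIZE  # hypothetical layout: lock byte sits right after the buffer
UNLOCKED = 0


def make_memory():
    """16 bytes of pretend memory: 8-byte buffer, 1 lock byte, padding."""
    return bytearray(16)


def unchecked_copy(memory, payload):
    """C-style copy that performs no bounds check of its own."""
    for i, byte in enumerate(payload):
        memory[i] = byte


def racy_store(memory, request, between=None):
    """Check the size, then re-read the shared payload at time of use."""
    if len(request["payload"]) > BUF_SIZE:      # time of check
        return False
    if between is not None:
        between(request)                        # another thread runs here
    unchecked_copy(memory, request["payload"])  # time of use: re-reads shared data
    return True


def safe_store(memory, request, between=None):
    """Snapshot the payload once, then check and use that same snapshot."""
    payload = bytes(request["payload"])
    if len(payload) > BUF_SIZE:
        return False
    if between is not None:
        between(request)                        # late changes no longer matter
    unchecked_copy(memory, payload)
    return True


def grow(request):
    """Stand-in for a concurrent writer that enlarges the payload."""
    request["payload"] = b"A" * (BUF_SIZE + 1)


if __name__ == "__main__":
    mem = make_memory()
    racy_store(mem, {"payload": b"ok"}, between=grow)
    print("lock byte after racy store:", mem[LOCK_OFFSET])

    mem = make_memory()
    safe_store(mem, {"payload": b"ok"}, between=grow)
    print("lock byte after safe store:", mem[LOCK_OFFSET])
```

In the racy version the ninth byte of the enlarged payload lands on the lock byte, so the lock no longer holds a valid state. The safe version checks and copies the same snapshot, so the lock byte stays untouched. The interleaving is scripted through the `between` callback rather than real threads so the outcome is repeatable.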

Regards,

Predrag

DavePalo · 02-09-2015

Many thanks Predrag,

It is good to hear that support have now reproduced this!

Did they give any indication of what might be triggering the issue please?

Chasing up my support people now as still waiting to hear back from them...

Regards,

Dave

Tician · 02-09-2015

Hi Dave,

No, they didn't. From the case history and progress pane I can see that the issue was registered under bug numbers 69130 and 61575. I guess these bugs will be mentioned in the notes for the next PAN-OS releases.

Regards,

Predrag

DavePalo · 02-09-2015

Hi Predrag,

Thanks for this. Hopefully I can use it to steer our support people in the right direction to try and get a bit more info.

Best regards,

Dave
