HA2 is going down every 5-6 days; my PAN-OS version is 10.2.9-h11

Aftab_786 · ‎04-16-2025

I am facing an issue with my Palo Alto firewall HA pair: it keeps flapping roughly every 5-6 days. The secondary device suddenly becomes active, and after some time things usually return to normal. It does not always recover on its own, though; sometimes we have to reboot the primary firewall. When I checked the system logs, these were the matching entries:

and ( description contains 'all_pktproc_28: Exited 4 times, must be manually recovered.' )

and ( description contains 'tasks: Exited 1 times, must be manually recovered.' )

and ( description contains 'HA Group 10: Dataplane is down: brdagent exiting' )

and ( description contains 'HA Group 10: Moved from state Active to state Non-Functional' )

and ( description contains 'Chassis Master Alarm: HA-event ' )

and ( description contains 'HA2 link down' )

and ( description contains 'All HA2 links down' )

and ( description contains 'all: Exited 1 times, must be manually recovered.' )

and ( description contains 'data_plane_0: Exited 1 times, must be manually recovered.' )

and ( description contains 'all: Exited 1 times, must be manually recovered.' )

and ( description contains 'HA Group 10: Peer HA2 keep-alive down' )

and ( description contains 'HA Group 10: All HA2 keep-alives are down' )

and ( description contains 'Internal packet path monitoring failure, restarting dataplane' )

and ( description contains 'dp0-path_monitor: Exited 1 times, must be manually recovered.' )

and ( description contains 'internal_monitor: Exited 1 times, must be manually recovered.' )

and ( description contains 'HA Group 10: Local HA2 keep-alive down' )

and ( description contains 'Chassis Master Alarm: Cleared' )

Can someone suggest what the possible cause could be and how to fix this?

kiwi · ‎04-17-2025

Hi @Aftab_786 ,

Sounds like you might be running into a memory leak issue.

Check process utilisation at the time of the dataplane restarts.

My guess is that some process builds up utilization over the 5-6 days until no more resources are available, at which point the dataplane (DP) restarts and the problem is temporarily "fixed".

I recommend generating a tech support file (TSF) from the device and submitting it to TAC for analysis to confirm this.
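To make the "building up over 5-6 days" idea concrete, here is a minimal sketch of how one might spot the growing process from periodic memory readings. It assumes the readings were noted down by hand or by script (for example from `show system resources` output) as (hours since last restart, MB) pairs per process. The process names and numbers below are made up for illustration and are not taken from this firewall.

```python
from statistics import mean

def growth_rate(samples):
    """Least-squares slope (MB per hour) through (hours, mb) samples."""
    xs = [t for t, _ in samples]
    ys = [m for _, m in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # fewer than two distinct timestamps: no trend to fit
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom

def rank_by_growth(usage):
    """Return (process, MB/hour) pairs, fastest-growing first."""
    rates = {name: growth_rate(s) for name, s in usage.items()}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical readings: process name -> [(hours since restart, resident MB)]
usage = {
    "proc_a": [(0, 100.0), (24, 148.0), (48, 196.0), (72, 244.0)],
    "proc_b": [(0, 300.0), (24, 302.0), (48, 299.0), (72, 301.0)],
}

ranked = rank_by_growth(usage)
for name, rate in ranked:
    print(f"{name}: {rate:+.2f} MB/hour")
```

A process whose slope stays clearly positive across several days is the likely candidate; one that hovers around zero is probably just noisy.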

Kind regards,

-Kim.

LIVEcommunity team member, CISSP
Cheers,
Kiwi
Please help out other users and “Accept as Solution” if a post helps solve your problem !

Read more about how and why to accept solutions.

Aftab_786 · ‎04-17-2025

Thanks a lot for the response, sir. Unfortunately, our support contract has lapsed and the company is not inclined to renew it. Any other suggestions for overcoming this issue without going through support would be very helpful.

kiwi · ‎04-18-2025

Hi @Aftab_786 ,

If this is a memory leak issue, which I suspect it is, then I'm afraid only support will be able to provide a fix for it as code will probably have to be adjusted.

As a workaround, you could try to identify the process that is affected by the memory leak and restart the process preemptively during a maintenance window. That way you can do controlled restarts of the process and you are not taken by surprise when the DP restarts as a result of the memory leak.
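As a rough aid for planning those controlled restarts, the sketch below estimates how many hours remain before a leaking process would reach a chosen memory ceiling, so the maintenance window can be scheduled ahead of it. The ceiling, the safety margin, and the sample figures are all assumptions for illustration; the real limit depends on the platform and is not known from this thread.

```python
def hours_until_limit(current_mb, rate_mb_per_hour, limit_mb):
    """Hours until current_mb reaches limit_mb at a constant growth rate.

    Returns None when the process is not growing, since no deadline applies.
    """
    if rate_mb_per_hour <= 0:
        return None
    remaining = max(limit_mb - current_mb, 0.0)
    return remaining / rate_mb_per_hour

def restart_deadline_hours(current_mb, rate_mb_per_hour, limit_mb, safety_margin=0.8):
    """Same estimate, but against a fraction of the limit to leave headroom."""
    return hours_until_limit(current_mb, rate_mb_per_hour, limit_mb * safety_margin)

# Hypothetical figures: 244 MB now, growing 2 MB/hour, assumed 500 MB ceiling.
deadline = restart_deadline_hours(current_mb=244.0, rate_mb_per_hour=2.0, limit_mb=500.0)
print(f"Schedule the restart within about {deadline:.0f} hours")
```

The estimate assumes linear growth, so it is worth re-running with fresh readings every day or two rather than trusting a single projection.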

Kind regards,

-Kim.

