I am getting the messages below at no specific interval (roughly every 30 minutes).
This has been happening continuously for three days.
What action needs to be taken?
HA Group 1: Local HA2 keep-alive up
HA Group 1: All HA2 keep-alives are down
HA Group 1: Local HA2 keep-alive down
How do I begin troubleshooting?
Will this cause any issues?
I can see the dataplane CPU going very high right after the keep-alive comes up.
What exactly is HA2 doing during this time? I can see that the HA2 link is up.
The deployment is active/active.
Both PAs were showing the same error before the secondary PA was rebooted.
After the reboot, the secondary shows the logs below, but the primary still shows the same messages as above.
I'm seeing the same thing:
1) Observed a data plane under severe load message in the system logs.
2) Within a second, got an “All HA2 keep alives are down” message.
3) HA keep-alives were down for exactly 4 seconds.
Can anyone suggest troubleshooting methods and how I can find the root cause (RCA)?
Could you help us here?
Thanks & Regards,
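The sequence described above (load message, keep-alive down, keep-alive up a few seconds later) can be checked against an exported system log with a small script. This is only a rough sketch: the line format `YYYY-MM-DD HH:MM:SS <message>`, the sample timestamps, the exact wording of the load message, and the 5-second correlation window are all assumptions made for illustration, so the parsing would need adapting to a real export.

```python
from datetime import datetime, timedelta

# Made-up sample in an assumed format: "YYYY-MM-DD HH:MM:SS <message>".
# The timestamps are invented; only the HA2 message texts come from the thread.
SAMPLE_LOG = """\
2019-01-01 10:00:00 data plane under severe load
2019-01-01 10:00:01 HA Group 1: All HA2 keep-alives are down
2019-01-01 10:00:01 HA Group 1: Local HA2 keep-alive down
2019-01-01 10:00:05 HA Group 1: Local HA2 keep-alive up
"""


def parse(log_text):
    """Turn each non-empty line into a (datetime, message) tuple."""
    events = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, message = line[:19], line[20:].strip()
        events.append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message))
    return events


def ha2_outages(events, load_window=timedelta(seconds=5)):
    """Pair each local keep-alive 'down' with the next 'up'.

    Each outage records its duration in seconds and whether a
    'severe load' message appeared within load_window before it.
    """
    outages = []
    down_at = None
    last_load = None
    load_before = False
    for when, message in events:
        text = message.lower()
        if "severe load" in text:
            last_load = when
        elif "local ha2 keep-alive down" in text and down_at is None:
            down_at = when
            load_before = last_load is not None and when - last_load <= load_window
        elif "local ha2 keep-alive up" in text and down_at is not None:
            outages.append({
                "down_at": down_at,
                "seconds": (when - down_at).total_seconds(),
                "load_before": load_before,
            })
            down_at = None
    return outages


for outage in ha2_outages(parse(SAMPLE_LOG)):
    print(outage)
```

If most outages show `load_before` as true, that would point toward dataplane load as the trigger; if not, the HA2 link itself or the peer is the more likely place to look.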
What ports are you utilizing for HA2?
I was wondering the same. What is your hardware type, @Sethupathi @simsim? If you're not using the dedicated HA2 port and the box is under heavy utilization, HA2 traffic and its functionality will be handled by the DP. With the DP under load, it's possible HA functionality could be affected.
We have dedicated HA ports, and they are directly connected.
We also noticed that the management plane's root partition is almost full:
Filesystem Size Used Avail Use% Mounted on
/dev/md3 3.8G 3.3G 295M 92% /
/dev/md5 7.6G 4.1G 3.1G 58% /opt/pancfg
/dev/md6 3.8G 2.0G 1.7G 55% /opt/panrepo
tmpfs 2.0G 116M 1.9G 6% /dev/shm
cgroup_root 2.0G 0 2.0G 0% /cgroup
/dev/md8 198G 44G 144G 24% /opt/panlogs
tmpfs 12M 0 12M 0% /opt/pancfg/mgmt/lcaas/ssl/private
Note: I have a PA-5050.
Could anyone help us here?
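For anyone who wants to keep an eye on the partitions in the output above, here is a minimal sketch that flags any filesystem at or above a chosen usage percentage. The 90% threshold is an arbitrary assumption for illustration, not a documented PAN-OS limit, and the function name is made up.

```python
# The disk usage output posted above, pasted in as plain text.
DF_OUTPUT = """\
Filesystem Size Used Avail Use% Mounted on
/dev/md3 3.8G 3.3G 295M 92% /
/dev/md5 7.6G 4.1G 3.1G 58% /opt/pancfg
/dev/md6 3.8G 2.0G 1.7G 55% /opt/panrepo
tmpfs 2.0G 116M 1.9G 6% /dev/shm
cgroup_root 2.0G 0 2.0G 0% /cgroup
/dev/md8 198G 44G 144G 24% /opt/panlogs
tmpfs 12M 0 12M 0% /opt/pancfg/mgmt/lcaas/ssl/private
"""


def partitions_over(df_text, threshold_pct=90):
    """Return (mount point, use %) for rows at or above threshold_pct."""
    flagged = []
    for line in df_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed rows
        use_pct = int(fields[4].rstrip("%"))
        if use_pct >= threshold_pct:
            flagged.append((fields[5], use_pct))
    return flagged


print(partitions_over(DF_OUTPUT))
```

With the numbers posted here, only the root partition crosses the assumed threshold.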
I wouldn't be worried about the /dev/md3 partition being at 92%; that wouldn't cause your HA issues at all. If you are using the dedicated HA1 and HA2 ports and have not configured another interface as a high-availability interface, high dataplane utilization shouldn't cause the HA interfaces to drop out. I would open a ticket with TAC so they can look at the associated logs and see exactly what was going on at the time the interfaces dropped.
Did you ever find a solution to the issue you mentioned? I'm currently running into a very similar issue with a 5050 active-passive cluster running on PAN-OS 8.1.10. We've been working with TAC for the past few days and haven't had any resolution. Any info would be much appreciated.
Since it only happened once, we kept closely monitoring the cluster in case it recurred, but it didn't. No support ticket was opened.
I will keep you posted if it happens again. We are still running PAN-OS 8.1.9.
Please let me know the outcome of the support ticket.