08-09-2024 05:43 AM
Hello,
I have had a case open with TAC about this issue for 4 months now, so I figured I would try my luck here.
My issue is an HA split-brain problem between the Active and Passive Panorama appliances, which sit in 2 different physical locations.
HA1 breaks briefly, for about 1 second. The health check fails because ICMP packets are not sent from the primary device, and they are not sent because buffer memory for ICMP packets is unavailable.
Failover does NOT happen because the buffer unavailability is very brief, but it is still enough to generate Critical alerts and disrupt our monitoring systems (there is no production impact).
less mp-log ha_agent.log
2024-06-03 08:19:41.162 +0200 Error: ha_ping_peer_miss(src/ha_ping.c:764): Missed 1 ping timeouts out of 3 (ha1)
2024-06-03 08:19:43.163 +0200 Error: ha_ping_send(src/ha_ping.c:604): Unable to send icmp packet:(errno: 105) No buffer space available
2024-06-03 08:19:43.163 +0200 Error: ha_ping_peer_miss(src/ha_ping.c:764): Missed 2 ping timeouts out of 3 (ha1)
2024-06-03 08:19:43.599 +0200 Received HA1 MAC address: 00:50:56:a4:22:b9
2024-06-03 08:19:43.630 +0200 Received HA1 MAC address: 00:50:56:a4:22:b9
2024-06-03 08:19:44.251 +0200 debug: ha_peer_recv_error(src/ha_peer.c:5781): Group 0 (HA1-MAIN): Receiving error message
Msg Hdr
-------
version : 1
groupID : 0
type : Error (5)
token : 0x702a
flags : 0x1 (req:)
length : 59
Error Msg
---------
flags : 0x2 (close:)
err code : Heartbeat ping failure (16)
num tlvs : 1
Printing out 1 tlvs
TLV[1]: type 5 (ERR_STRING); len 23; value:
48656172 74626561 74207069 6e672066 61696c75 726500
2024-06-03 08:19:44.251 +0200 Warning: ha_event_log(src/ha_event.c:59): HA1 connection down
2024-06-03 08:19:44.253 +0200 Error: ha_peer_primary_link_switchover(src/ha_peer.c:2531): Group 0: Unable to find a primary interface to switch
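For anyone triaging a similar log, here is a minimal Python sketch that pulls the relevant events out of the excerpt above and decodes the ERR_STRING TLV hex dump. The line format is assumed purely from the lines pasted here, and the function names (`decode_tlv_hex`, `summarize`) and event labels are my own, not anything provided by PAN-OS.

```python
import re

# Log line shape assumed from the excerpt above only:
# "<date> <time> <tz> <Level>: <func>(<file>:<line>): <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (?P<tz>[+-]\d{4}) "
    r"(?P<level>\w+): (?P<func>\w+)\((?P<src>[^)]+)\): (?P<msg>.*)$"
)


def decode_tlv_hex(hex_dump: str) -> str:
    """Decode a space-separated hex dump into ASCII, dropping trailing NUL bytes."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    return raw.rstrip(b"\x00").decode("ascii")


def summarize(lines):
    """Return (timestamp, label) pairs for the HA1 events of interest.

    The labels are hypothetical names chosen for this sketch.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. "Received HA1 MAC address" lines carry no level/function
        msg = match.group("msg")
        if "errno: 105" in msg:
            label = "send_no_buffer"
        elif msg.startswith("Missed"):
            label = "ping_miss"
        elif "HA1 connection down" in msg:
            label = "ha1_down"
        else:
            continue
        events.append((match.group("ts"), label))
    return events


if __name__ == "__main__":
    sample = [
        "2024-06-03 08:19:41.162 +0200 Error: ha_ping_peer_miss(src/ha_ping.c:764): Missed 1 ping timeouts out of 3 (ha1)",
        "2024-06-03 08:19:43.163 +0200 Error: ha_ping_send(src/ha_ping.c:604): Unable to send icmp packet:(errno: 105) No buffer space available",
        "2024-06-03 08:19:44.251 +0200 Warning: ha_event_log(src/ha_event.c:59): HA1 connection down",
    ]
    for ts, label in summarize(sample):
        print(ts, label)
    print(decode_tlv_hex("48656172 74626561 74207069 6e672066 61696c75 726500"))
```

Run against an exported copy of ha_agent.log, this makes it easier to see whether every "HA1 connection down" event is preceded by an errno 105 send failure, which is the pattern in the excerpt above.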
Palo Alto has documentation on split-brain, but all of it covers physical NGFW appliances, not Panorama.
Does anybody have any suggestions on how to fix this issue? Thanks for any help!