Palo Alto 7000 heartbeat backup icmp fail


L4 Transporter

Hello to All,

 

 

From time to time, ICMP fails on the management connection between two 7000-series firewalls running PAN-OS 8.1.x. The issue causes a failover, even though the 7000 firewalls have dedicated interfaces for HA and the management interface should be used only for Heartbeat Backup, as described in https://docs.paloaltonetworks.com/pan-os/8-1/pan-os-admin/high-availability/ha-concepts/ha-links-and... . Shouldn't HA1 and the management connection both have to fail at the same time to trigger a failover? We opened a Palo Alto case, but after one week they still have not answered this question. They only told us to change the cable on the management interfaces; the customer did that and the issue is still there.

 

ha_agent log:

 

xxx ha 2/17/2021 11:46 critical HA Group 1: HA heartbeat backup connection down xxx
xxx ha 2/17/2021 7:43 critical HA Group 1: HA heartbeat backup connection down xxx
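
For anyone who wants to line these events up against other logs, here is a minimal Python sketch that pulls the timestamps of "heartbeat backup connection down" entries out of lines shaped like the two above. The column layout (type, date, time, severity, message) and the month/day/year date format are assumptions based only on these two redacted lines, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: "<redacted> ha <M/D/YYYY> <H:MM> <severity> <message>"
LINE_RE = re.compile(
    r"\bha\s+(\d{1,2}/\d{1,2}/\d{4})\s+(\d{1,2}:\d{2})\s+(\w+)\s+(.*)"
)


def parse_ha_line(line):
    """Return (timestamp, severity, message), or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    date_part, time_part, severity, message = match.groups()
    timestamp = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M")
    return timestamp, severity, message


def heartbeat_backup_drops(lines):
    """Collect timestamps of entries reporting the heartbeat backup going down."""
    drops = []
    for line in lines:
        parsed = parse_ha_line(line)
        if parsed and "heartbeat backup connection down" in parsed[2]:
            drops.append(parsed[0])
    return drops


if __name__ == "__main__":
    sample = [
        "xxx ha 2/17/2021 11:46 critical HA Group 1: HA heartbeat backup connection down xxx",
        "xxx ha 2/17/2021 7:43 critical HA Group 1: HA heartbeat backup connection down xxx",
    ]
    for ts in heartbeat_backup_drops(sample):
        print(ts.isoformat())
```

A real log export may use a different column order or date format, so the regular expression would need adjusting to match it.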

1 ACCEPTED SOLUTION

Cyber Elite

Howdy there.

 

So let's talk about HA1 vs Mgmt, and what they are used for.

 

HA1 supports 3 things: link failure notification, HEARTBEATs (ICMP), and HELLOs (status checks).

When you enable Mgmt as the Heartbeat Backup, both the HA1 AND the Mgmt IPs are pinged to confirm connectivity.

Is it possible for the ping between HA1 and HA1 to be good while the ping between Mgmt and Mgmt is not? YES.

Would the firewall fail over if you unplugged the mgmt interfaces on both FWs? NO!! Why? It is merely a heartbeat backup, and the primary role of determining failure belongs to the HA1 communication.

So if only HA1 to HA1 failed, would the FW fail over? NO! Because the pings between the mgmt IPs would still be up. The reason for configuring the Mgmt IP as Heartbeat Backup is to prevent split brain.
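
To make the logic above concrete, here is a minimal Python sketch of the behaviour as described in this reply. It is only a model of the explanation, not PAN-OS code, and the names `HeartbeatPaths` and `peer_declared_down` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatPaths:
    """State of the two heartbeat paths between the HA peers (hypothetical model)."""
    ha1_up: bool          # pings over the dedicated HA1 link succeed
    mgmt_backup_up: bool  # pings over the mgmt heartbeat backup succeed


def peer_declared_down(paths: HeartbeatPaths) -> bool:
    # Assumption taken from the reply above: the peer is only treated as
    # dead when every heartbeat path has been lost at the same time.
    return not paths.ha1_up and not paths.mgmt_backup_up


if __name__ == "__main__":
    cases = {
        "both paths up": HeartbeatPaths(ha1_up=True, mgmt_backup_up=True),
        "mgmt backup lost only": HeartbeatPaths(ha1_up=True, mgmt_backup_up=False),
        "HA1 lost only": HeartbeatPaths(ha1_up=False, mgmt_backup_up=True),
        "both paths lost": HeartbeatPaths(ha1_up=False, mgmt_backup_up=False),
    }
    for label, paths in cases.items():
        print(f"{label}: peer declared down = {peer_declared_down(paths)}")
```

Under this model, losing only the mgmt pings should log the "heartbeat backup connection down" event without a failover, so a failover seen at the same moment points to something else going wrong on the box.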

 

In your environment, maybe it is better to use HA1-A and HA1-B (for backup) and not use the mgmt IP for backup.

 

Let us know if there are more questions we can assist with.

 

Thanks

 


3 REPLIES

Thanks for confirming this. I suspected this was the case and tried to explain it to the TAC engineer and asked him to confirm, but after 1 week we still have no reply to this question, even though the case priority is HIGH (not happy with the TAC support). About the issue: we are partners and have access to Auto Assistant, which detected an internal packet failure on one of the firewalls, which suggests a hardware issue. I also found bug PAN-114648, which is resolved for 3200 devices, but I don't know if it affects 7000 devices: https://docs.paloaltonetworks.com/pan-os/8-1/pan-os-release-notes/pan-os-8-1-addressed-issues/pan-os.... I also asked the TAC for their opinion on the internal packet failure on one of the firewalls and on this bug, but I am still waiting. Still, thanks for confirming what I suspected and for replying much faster than the TAC.

I meant internal path failure. Sorry.
