Failover Behaviors

L1 Bithead


Hi All,

 

Setup: Active-Passive

Path Monitoring: enabled, but not configured (nothing under the Path Group)

Version: 7.1.14

 

 

Would an active firewall change its state to non-functional if both its HA2 and HA2-Backup links go down?

 

Related Logs:

2019/12/04 09:41:04 critical ha ha2-lin 0 All HA2 links down
2019/12/04 09:41:04 high ha session 0 HA Group 1: Ignoring session synchronization due to HA2-unavailable
2019/12/04 09:41:04 high ha ha2-lin 0 HA2-Backup link down
2019/12/04 09:41:04 critical general general 0 Chassis Master Alarm: HA-event
2019/12/04 09:41:04 critical ha ha2-lin 0 HA2 link down
2019/12/04 09:41:04 critical ha state-c 0 HA Group 1: Moved from state Active to state Non-Functional
2019/12/04 09:41:04 critical ha datapla 0 HA Group 1: Dataplane is down: path monitor failure
2019/12/04 09:41:04 high general general 0 9: path_monitor HB failures seen, triggering HA DP down
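For anyone reading logs like these, here is a minimal Python sketch that splits each system log line into fields and pulls out the state change and the dataplane-down reason. The column names (severity, type, subtype) are my assumption about the layout based only on the lines above, and the helper names are made up for illustration. It is not an official parser.

```python
import re

# The eight system log lines quoted above, unchanged.
LOG = """\
2019/12/04 09:41:04 critical ha ha2-lin 0 All HA2 links down
2019/12/04 09:41:04 high ha session 0 HA Group 1: Ignoring session synchronization due to HA2-unavailable
2019/12/04 09:41:04 high ha ha2-lin 0 HA2-Backup link down
2019/12/04 09:41:04 critical general general 0 Chassis Master Alarm: HA-event
2019/12/04 09:41:04 critical ha ha2-lin 0 HA2 link down
2019/12/04 09:41:04 critical ha state-c 0 HA Group 1: Moved from state Active to state Non-Functional
2019/12/04 09:41:04 critical ha datapla 0 HA Group 1: Dataplane is down: path monitor failure
2019/12/04 09:41:04 high general general 0 9: path_monitor HB failures seen, triggering HA DP down
"""

# Assumed layout: timestamp, severity, type, (truncated) subtype, a number, message.
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\S+) (\S+) (\d+) (.*)$"
)
STATE_RE = re.compile(r"Moved from state (\S+) to state (\S+)")
REASON_RE = re.compile(r"Dataplane is down: (.+)")


def parse(text):
    """Return one dict per log line that matches the assumed layout."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts, severity, typ, subtype, _num, msg = m.groups()
            events.append({"ts": ts, "severity": severity, "type": typ,
                           "subtype": subtype, "msg": msg})
    return events


def state_changes(events):
    """List (old_state, new_state) pairs found in the messages."""
    return [m.groups() for e in events if (m := STATE_RE.search(e["msg"]))]


def dataplane_reasons(events):
    """List the reasons given in 'Dataplane is down' messages."""
    return [m.group(1) for e in events if (m := REASON_RE.search(e["msg"]))]


events = parse(LOG)
print(state_changes(events))
print(dataplane_reasons(events))
```

Run against the lines above, the reason logged next to the state change is a path monitor failure, so it may be worth looking at that message alongside the HA2 link events.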

 

Also, is there an HA failover table I could use to look up Palo Alto's behavior when, say, HA1 fails or HA2 fails?

 

Thanks,

John

 

L7 Applicator

Re: Failover Behaviors

Did the secondary device go to non-functional?

 

The primary should not go into a faulty state if the HA2 links go down. The secondary, however, has just lost its ability to take over seamlessly if the primary were to go down, since it no longer receives session state information.

 

If both HA1 links go down, the primary peer remains active because it assumes the secondary peer went down, and the secondary peer takes the active role because it thinks the primary went down. Now both are active and no one is happy.
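The HA1 case described above can be sketched as a toy model in Python. This only encodes the rule as stated in this reply (each peer assumes the other is down when HA1 is lost); the function names are hypothetical, and this is not how PAN-OS implements its HA state machine.

```python
def react_to_ha1_loss(role: str) -> str:
    """Toy rule: with no HA1 contact, a peer assumes the other side is down.

    The active peer stays active; the passive peer promotes itself.
    """
    if role not in ("active", "passive"):
        raise ValueError(f"unexpected role: {role}")
    return "active"


def is_split_brain(states) -> bool:
    """True when more than one peer considers itself active."""
    return sum(1 for s in states if s == "active") > 1


# Both peers lose HA1 at the same time.
after = [react_to_ha1_loss(role) for role in ("active", "passive")]
print(after, is_split_brain(after))
```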

L4 Transporter

Re: Failover Behaviors

Very good and useful info.

MP
L1 Bithead

Re: Failover Behaviors

Hi,

 

The active firewall went into non-functional state, so the passive firewall took over as active.


xxxx@xxxxxx-fw(passive)> show high-availability state

Group 1:
Mode: Active-Passive
Local Information:
Version: 1
Mode: Active-Passive
State: passive (last 17 hours)
Last non-functional state reason: Dataplane down: path monitor failure.

 

Some related logs from ha_agent.log:

2019-12-04 09:41:04.464 +0000 debug: ha_slot_sysd_dp_down_notify_cb(src/ha_slot.c:641): Got initial dataplane down (slot 1; reason path monitor failure)
2019-12-04 09:41:04.464 +0000 The dataplane is going down
2019-12-04 09:41:04.464 +0000 Warning: ha_event_log(src/ha_event.c:47): HA Group 1: Dataplane is down: path monitor failure
2019-12-04 09:41:04.464 +0000 Going to non-functional for reason Dataplane down: path monitor failure
2019-12-04 09:41:04.464 +0000 debug: ha_state_transition(src/ha_state.c:1329): Group 1: transition to state Non-Functional
2019-12-04 09:41:04.464 +0000 debug: ha_state_start_monitor_holdup(src/ha_state.c:2518): Skipping monitor holdup for group 1
2019-12-04 09:41:04.464 +0000 debug: ha_state_monitor_holdup_callback(src/ha_state.c:2611): Going to Non-Functional state state
2019-12-04 09:41:04.464 +0000 debug: ha_state_move(src/ha_state.c:1423): Group 1: moving from state Active to Non-Functional
2019-12-04 09:41:04.464 +0000 Warning: ha_event_log(src/ha_event.c:47): HA Group 1: Moved from state Active to state Non-Functional
2019-12-04 09:41:04.464 +0000 debug: ha_sysd_dev_state_update(src/ha_sysd.c:1434): Set dev state to Non-Functional
2019-12-04 09:41:04.464 +0000 debug: ha_sysd_dev_alarm_update(src/ha_sysd.c:1400): Set dev alarm to on
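If it helps when sharing ha_agent.log excerpts, here is a small Python sketch that separates the timestamp, the optional level and source location (function, file, line), and the message. The pattern is inferred only from the lines above, so treat the layout as an assumption; `parse_entry` is a made-up helper name.

```python
import re
from datetime import datetime

# Three of the ha_agent.log lines quoted above, unchanged.
SAMPLE = """\
2019-12-04 09:41:04.464 +0000 debug: ha_slot_sysd_dp_down_notify_cb(src/ha_slot.c:641): Got initial dataplane down (slot 1; reason path monitor failure)
2019-12-04 09:41:04.464 +0000 The dataplane is going down
2019-12-04 09:41:04.464 +0000 debug: ha_state_move(src/ha_state.c:1423): Group 1: moving from state Active to Non-Functional
"""

# Assumed layout: timestamp, then an optional "level: func(file:line): " prefix.
ENTRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ [+-]\d{4}) "
    r"(?:(?P<level>\w+): (?P<func>\w+)\((?P<file>[^:()]+):(?P<line>\d+)\): )?"
    r"(?P<msg>.*)$"
)


def parse_entry(line):
    """Return a dict for one log line, or None if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    # Convert the timestamp text into a timezone-aware datetime.
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S.%f %z")
    return entry


entries = [e for e in map(parse_entry, SAMPLE.splitlines()) if e]
for e in entries:
    print(e["ts"].isoformat(), e["func"], e["msg"])
```

Lines without a source location (like the second one) come back with `func`, `file`, and `line` set to `None`.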

 

 

L4 Transporter

Re: Failover Behaviors

 

I also ran a test.

 

Setup: Active-Passive PA pair.

 

Only HA1 is connected; no HA1 backup is connected.

 

Heartbeat Backup is checked on both firewalls.

 

I disconnected HA1, and the Dashboard shows both HA1 and heartbeat as down.

Both PAs became active.

 

Even though Heartbeat Backup is checked and the management interface is up on both PAs, why does heartbeat backup show as down on both firewalls?

Is this expected behavior?

MP
L7 Applicator

Re: Failover Behaviors

@Jonathan_Panes,

Losing HA2 and having a device go into a non-functional state is certainly not expected behavior. There are, however, multiple HA fixes in later 7.1 maintenance releases, so you could be running into a bug. While I generally don't like recommending an upgrade unless I can point to a specific issue ID, you are running an older maintenance release with open security advisories, so I'm going to use those instead and recommend upgrading to 7.1.25, which will hopefully fix the issue you ran into here as well as patch some security issues.

 

PAN-SA-2019-0013

PAN-SA-2019-0019

PAN-SA-2019-0021

PAN-SA-2019-0022

https://securityadvisories.paloaltonetworks.com/

L7 Applicator

Re: Failover Behaviors

@MP18,

How do you have your MGMT traffic routed? It's possible that, in the split-brain scenario created when HA1 is removed, the two devices can't send heartbeat traffic to each other because of routing issues that arise when both firewalls are active. We would need to look at your actual network design to be certain, but that would be my first guess.

L4 Transporter

Re: Failover Behaviors

Hi BPry,

 

We are running 8.1.9 on this PA 3020.

These are our lab firewalls, and no traffic passes through the data plane.

 

For management plane routing, both firewalls are in the same subnet.

All service routing goes through the management plane only.

 

Regards

Mike

 

 

 

MP
L7 Applicator

Re: Failover Behaviors


@MP18 wrote:

 

Disconnected the HA1 and  Dashboard shows both HA1 and heartbeat are down.

Both PA became active.

 


that's not how it's supposed to work

L4 Transporter

Re: Failover Behaviors

I am running 8.1.9 on PA 3020.

Am I hitting a bug?

MP