HA Failover Hold Timers?

OMatlock · ‎05-05-2017

Hi folks,

I will be configuring my first Active/Passive HA pair next weekend on two PA-3020 devices.

I am trying to understand the difference between Monitor Hold Time for HA1 link and Monitor Fail Hold Down Time for the Active/Passive settings.

Could anyone equate these settings to what would happen in a typical Active/Passive failover?

Example: the Active firewall fails. Monitor Hold Time on the Passive waits 3 seconds and then declares a failure, and then Monitor Fail Hold Down Time on the Passive waits one minute before it executes the failover and becomes Active. Does that sound right?

Definitions:

Monitor Hold Time (ms)—Enter the length of time (milliseconds) that the firewall will wait before declaring a peer failure due to a control link failure (1000-60000 ms, default 3000 ms). This option monitors the physical link status of the HA1 port(s).

Monitor Fail Hold Down Time (min) —This value between 1-60 minutes determines the interval in which a firewall will be in a non-functional state before becoming passive. This timer is used when there are missed heartbeats or hello messages due to a link or path monitoring failure.
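For anyone who finds it easier to see the two settings side by side, here is a minimal Python sketch that only encodes the ranges quoted in the definitions above. The class and field names are made up for illustration and are not PAN-OS identifiers; the 1-minute default for the hold-down timer is taken from this thread, not verified against a device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaTimers:
    """Hypothetical container for the two timers discussed in this thread."""

    # HA1 control link: 1000-60000 ms, default 3000 ms (from the definition above)
    monitor_hold_time_ms: int = 3000
    # Active/Passive setting: 1-60 minutes (1 minute assumed as the default)
    monitor_fail_hold_down_min: int = 1

    def __post_init__(self):
        # Reject values outside the ranges quoted in the definitions.
        if not 1000 <= self.monitor_hold_time_ms <= 60000:
            raise ValueError("Monitor Hold Time must be 1000-60000 ms")
        if not 1 <= self.monitor_fail_hold_down_min <= 60:
            raise ValueError("Monitor Fail Hold Down Time must be 1-60 min")


defaults = HaTimers()
print(defaults.monitor_hold_time_ms, defaults.monitor_fail_hold_down_min)
```

Note the different units: one timer is in milliseconds and applies to the HA1 link, the other is in minutes and applies to link/path monitoring.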

TranceforLife · ‎05-08-2017

This is a very nice document:

https://live.paloaltonetworks.com/twzvq79624/attachments/twzvq79624/documentation_tkb/543/2/HA_Failo...

acc6d0b3610eec313831f7900fdbd235 · ‎05-05-2017

Hi @OMatlock

Control Link (HA1) Monitor Hold Time

To monitor the health of the Primary HA1 interface, an additional “Monitor Hold Time” timer is used to detect a failed Primary HA1 condition. If three heartbeats or hello messages are missed between the HA devices, the HA1 Monitor Hold Time determines how long the HA device waits before declaring the Primary HA1 connection failed. The default is 3000 ms.
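As a rough back-of-the-envelope model of that description (three missed hellos, then the hold time), the arithmetic could be sketched like this in Python. This is only one reading of the paragraph above, not the actual PAN-OS state machine; the function name is hypothetical and the 1000 ms hello interval is an illustrative value, not a platform default.

```python
def ha1_failure_declared_after_ms(hello_interval_ms: int,
                                  monitor_hold_time_ms: int = 3000,
                                  missed_hellos: int = 3) -> int:
    """Approximate time from the last good hello until HA1 is declared failed.

    Assumes the hold timer starts only after `missed_hellos` consecutive
    hellos have been missed, as the description above suggests.
    """
    return missed_hellos * hello_interval_ms + monitor_hold_time_ms


# Illustrative hello interval of 1000 ms (an assumption for this example)
print(ha1_failure_declared_after_ms(1000))                             # 6000
print(ha1_failure_declared_after_ms(1000, monitor_hold_time_ms=1500))  # 4500
```

Under these assumptions, halving the hold time from 3000 ms to 1500 ms only shortens the total by 1.5 seconds, because the missed hellos still have to elapse first.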

Once a failed Primary HA1 condition has occurred, the units log the appropriate information in the system logs and fail over to the Backup HA1 or Management interface, depending on how the HA1 backup is configured.

Recommendation: If you have a Backup HA1 interface configured, lowering this value allows a faster failover to the backup HA1 link. Leaving the value at the default of 3000 ms is recommended for most HA implementations. The range for the HA1 Monitor Hold Time is 1000 to 60000 ms. Personally, I cut this time in half and use 1500 ms instead.

Passive Link State Auto Configuration (A/P)

An important fact to consider when designing an Active/Passive HA architecture is that the traffic forwarding links on the passive device default to a “Shutdown” state. In the shutdown state, upstream and downstream devices connected to the passive device will not see a valid path until the passive firewall becomes active.

The Passive Link State Auto Configuration feature lets you bring up the passive device’s traffic forwarding links ahead of time. It puts the interfaces on the firewall in a “link up” state but blocks inbound and outbound traffic on them until the passive unit becomes active. This eliminates the port learning and negotiation phases right after a failover to the passive device and can reduce failover times by approximately one to two seconds.

The Passive Link State Auto Configuration setting is enabled under Device > High Availability > Election Settings. The Passive Link State defaults to “Shutdown” and should be set to “Auto” to facilitate faster failover times and to force the link status of the neighboring devices to be in the “link up” state. When the Passive Link State is set to “Auto”, the HA device in the “passive” state will not forward traffic or respond to ARP requests. I like this option because it lets us avoid the gratuitous ARP delay with upstream and downstream devices.

I hope it helps.

Willian

OMatlock · ‎05-08-2017

Thanks folks!

My updated definitions below. Will confirm in my first training class next week.

HA1 Monitor Hold Time (ms) - Amount of time the firewall will wait before declaring an HA1 primary link failure and letting the HA1 backup link take over HA1 duty.

Active/Passive Monitor Fail Hold Down Time (min) - Amount of time an Active firewall will remain active after a path/link failure is detected. With the default of 1 minute, the Active firewall remains active for 1 minute after a path/link failure is detected; then a failover is executed and the Passive firewall becomes active.

mohammedsalhis · ‎03-15-2023

Just want to point out something important regarding Monitor Fail Hold Down Time. The following two links provide detailed explanations.

The one-minute "monitor hold timer" just after failover is a pre-set timer to prevent unnecessary failover flaps. After a failover, the process will not allow another failover if it detects the traffic link down within the one-minute timer limit. A link down after the timer expires will subsequently cause a failover.

HA Failover Hold Timers - Knowledge Base - Palo Alto Networks

You can find an example of unnecessary failover flaps below.

https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000boRqCAI
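To make the flap-prevention behaviour quoted above concrete, here is a small Python sketch of the decision as I read it: a link-down event inside the hold-down window after a failover is ignored, and one after the window triggers a failover. The function and parameter names are hypothetical, and treating the exact expiry instant as "expired" is my assumption, since the text does not say.

```python
from datetime import datetime, timedelta


def link_down_triggers_failover(last_failover: datetime,
                                link_down_at: datetime,
                                hold_down_min: int = 1) -> bool:
    """Return True if a link-down event should cause another failover.

    Events inside the hold-down window that starts at the previous failover
    are suppressed to avoid flapping (boundary handling is an assumption).
    """
    window = timedelta(minutes=hold_down_min)
    return link_down_at - last_failover >= window


failover_time = datetime(2023, 3, 15, 10, 0, 0)
# 30 seconds after the failover: still inside the one-minute window
print(link_down_triggers_failover(failover_time, failover_time + timedelta(seconds=30)))
# 90 seconds after the failover: the timer has expired
print(link_down_triggers_failover(failover_time, failover_time + timedelta(seconds=90)))
```

The first call prints False and the second prints True, which matches the "no second failover within one minute" behaviour described in the KB text.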

Thanks
