What causes an HA Primary to go into Suspended State?

GNS_Support · ‎02-15-2011

I've set up an HA pair with a primary priority of 10 and a secondary priority of 20 (both with pre-empt enabled).

The primary keeps going into Suspended state. What would cause this?

bpappas · ‎02-15-2011

Without looking at the logs I would just be guessing. Do you see any message in the system logs that would indicate an obvious problem on the primary?

In any event you should probably open a case with support to investigate this issue.

-Benjamin

GNS_Support · ‎02-16-2011

Here's the log entries.. not much to go on!

02/14 12:47:33	ha	critical		state-change	HA Group 1: moved from state Non-Functional to state Suspended
02/14 12:47:33	ha	critical		preempt-loop	HA Group 1: going to suspended state due to detection of a preemption loop after 3 loops

James · ‎02-16-2011

Hi - This link will help describe the scenario:

https://live.paloaltonetworks.com/docs/DOC-1142

Are you seeing this?

Thanks

James

GNS_Support · ‎02-16-2011

Hi James

I did find that link, but what does "non-functional" mean? An interface going down, or the whole device going down?

I've also got link monitoring set on the primary, but not the secondary. Would that be causing issues? I thought the primary config would sync to the secondary, but it doesn't.

James · ‎02-16-2011

Hi,

This is a good doc:

https://live.paloaltonetworks.com/docs/DOC-1656

Useful excerpts are:

Non-functional: Error state due to data plane crash or monitor failure

Non-functional loop
A non-functional loop is when both devices in an HA pair have link or path monitoring failures that are not detectable while in non-functional state. This happens when the link state on the passive device is set to shutdown in layer 3 mode. The link state on the passive device is always shutdown in vwire and layer 2 deployments. If a device in an HA cluster starts in active state and detects a link or path down, it changes state to non-functional. The peer device at this time will go active. The non-functional device will remain in this state for the monitor-fail-holddown time and then change state to passive. The active device, upon seeing the peer device as passive, will change to non-functional because of the link failure. At this point, if monitoring fails again, the device gets into a loop, repeating the active -> non-functional -> passive -> active transitions. These state transitions are referred to as flaps. The device will remain in the suspended state even if the link or path connectivity is restored. The default number of flaps is 3. A value of “0” means infinite flaps. The maximum number of flaps defined will have to happen within 15 minutes, after which the device enters suspended state. Once the device enters the suspended state, it requires user intervention to transition to functional state. This is accomplished by using the operational command “request high-availability state functional”.
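To make the flap counting in that excerpt concrete, here is a rough toy model in Python. It is only a sketch of the behaviour as the excerpt describes it, not PAN-OS code: the class name `PreemptionLoopGuard`, its methods, and the simplified state names are all made up for illustration, and the real firewall logic may well differ in detail.

```python
from collections import deque

# 15-minute window and default of 3 flaps are taken from the excerpt above.
FLAP_WINDOW_SECONDS = 15 * 60
DEFAULT_MAX_FLAPS = 3


class PreemptionLoopGuard:
    """Hypothetical toy model of the flap counter (not actual PAN-OS logic)."""

    def __init__(self, max_flaps=DEFAULT_MAX_FLAPS, window=FLAP_WINDOW_SECONDS):
        self.max_flaps = max_flaps      # 0 means "infinite flaps", never suspend
        self.window = window            # seconds in which flaps are counted
        self.flap_times = deque()       # timestamps of recent flaps
        self.state = "functional"       # simplified stand-in for active/passive

    def record_flap(self, now):
        """Register one monitor-failure flap at time `now` (in seconds)."""
        if self.state == "suspended":
            # Suspended is sticky: restored links or further flaps change nothing.
            return self.state
        self.flap_times.append(now)
        # Forget flaps that fall outside the counting window.
        while self.flap_times and now - self.flap_times[0] > self.window:
            self.flap_times.popleft()
        if self.max_flaps != 0 and len(self.flap_times) >= self.max_flaps:
            self.state = "suspended"
        else:
            self.state = "non-functional"
        return self.state

    def request_functional(self):
        """Stand-in for the manual 'request high-availability state functional'."""
        self.flap_times.clear()
        self.state = "functional"
        return self.state


if __name__ == "__main__":
    guard = PreemptionLoopGuard()
    for t in (0, 60, 120):              # three flaps within two minutes
        print(t, guard.record_flap(t))
    print(guard.request_functional())
```

With the defaults, three flaps inside the window leave the model in "suspended" until `request_functional()` is called, which lines up with the "after 3 loops" message in the log posted earlier in the thread.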

Not all parameters are synchronised in HA - HA settings themselves are not synchronised, since some items need to be different on each device.

Thanks

James

randomcamden · ‎02-17-2011

Thanks James.. That doc is spot-on.
