08-25-2014 10:49 PM
Hi,
I have a pair of PAN-VM firewalls in active/passive mode, with link group monitoring configured on four member ports. When I disconnect one of the ports in vSphere, failover happens quickly and the node is marked "non-functional (Link down)". When I reconnect the port, however, the status does not change and failback does not happen unless I remove the HA link group from the passive node. Any idea what may be wrong?
I am using version 6.0.4
Thanks,
Saeed
08-27-2014 01:21 AM
Hello Saeed,
From what I can see, the active device (A, IP 10.101.200.70) is configured with priority 50 and the passive device (B, IP 10.101.200.71) with priority 100. Preemption is enabled on both firewalls.
Hence, once firewall B has become active and the monitored link group (SP-IF-MON) comes back up on firewall A, firewall A should automatically become active again without any manual intervention.
A--IP-10.101.200.70
B--IP-10.101.200.71
Logs from firewall A:
2014-08-26 16:50:12 2014-08-26 16:50:12.118 +1000 debug: ha_sysd_linkmon_link_change(src/ha_sysd.c:3916): Link 1/3 up
2014-08-26 16:50:12 2014-08-26 16:50:12.118 +1000 Group 1: Link 'ethernet1/3' in link group 'SP-IF-MON' state is going from down to up
2014-08-26 16:50:12 2014-08-26 16:50:12.118 +1000 debug: ha_sysd_linkmon_link_change(src/ha_sysd.c:3916): Link 1/2 up
2014-08-26 16:50:12 2014-08-26 16:50:12.118 +1000 Group 1: Link 'ethernet1/2' in link group 'SP-IF-MON' state is going from down to up
2014-08-26 16:50:12 2014-08-26 16:50:12.119 +1000 debug: ha_sysd_linkmon_link_change(src/ha_sysd.c:3916): Link 1/4 up
2014-08-26 16:50:12 2014-08-26 16:50:12.119 +1000 Group 1: Link 'ethernet1/4' in link group 'SP-IF-MON' state is going from down to up >>>>>>>>>>>>>>> link group came UP on firewall A
As expected, firewall A became active:
2014-08-26 16:51:21 2014-08-26 16:51:21.411 +1000 debug: ha_state_transition(src/ha_state.c:1301): Group 1: transition to state Active
2014-08-26 16:51:21 2014-08-26 16:51:21.411 +1000 debug: ha_state_move(src/ha_state.c:1386): Group 1: moving from state Active to Active >>>>>>>>>>>>>>>> Going to Active state
However, at the same time we observed that the HA1 link went down, and the monitored interface then went down again:
2014-08-26 16:51:21 2014-08-26 16:51:21.411 +1000 Group 1 (HA1-MAIN): Starting hello with timeout: 8s/0ns
2014-08-26 16:51:21 2014-08-26 16:51:21.411 +1000 debug: ha_peer_start_hello(src/ha_peer.c:1064): Group 1 (HA1-BKUP): can't start hello, no connection
2014-08-26 16:51:21 2014-08-26 16:51:21.411 +1000 debug: ha_peer_start_hello(src/ha_peer.c:1064): Group 1 (HA1-MGMT): can't start hello, no connection
2014-08-26 16:53:07 2014-08-26 16:53:07.730 +1000 debug: ha_sysd_linkmon_link_change(src/ha_sysd.c:3916): Link 1/4 down
2014-08-26 16:53:07 2014-08-26 16:53:07.730 +1000 Group 1: Link 'ethernet1/4' in link group 'SP-IF-MON' state is going from up to down
2014-08-26 16:53:07 2014-08-26 16:53:07.730 +1000 Warning: ha_event_log(src/ha_event.c:47): HA Group 1: Link group 'SP-IF-MON' link 'ethernet1/4' is down
2014-08-26 16:53:07 2014-08-26 16:53:07.731 +1000 Warning: ha_event_log(src/ha_event.c:47): HA Group 1: Link group 'SP-IF-MON' failure; one or more links are down >>>>>>>>>>> Link DOWN
2014-08-26 16:53:07 2014-08-26 16:53:07.731 +1000 debug: ha_state_transition(src/ha_state.c:1301): Group 1: transition to state Non-Functional >>>>>>>>>>>>> The firewall went into non-functional state.
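For anyone retracing this timeline, a small script can pull the link-state changes out of the HA log. This is only a minimal sketch, assuming the log lines look exactly like the excerpts above; the regex and the function name `link_transitions` are my own and are not part of any PAN-OS tooling.

```python
import re

# Matches HA log lines such as:
#   ... 16:53:07.730 +1000 Group 1: Link 'ethernet1/4' in link group
#   'SP-IF-MON' state is going from up to down
# The pattern is an assumption based on the excerpts in this thread.
LINK_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ "
    r"Group \d+: Link '(?P<link>[^']+)' in link group '(?P<group>[^']+)' "
    r"state is going from (?P<old>\w+) to (?P<new>\w+)"
)


def link_transitions(lines):
    """Return (timestamp, group, link, old_state, new_state) tuples."""
    found = []
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            found.append(
                (match["ts"], match["group"], match["link"],
                 match["old"], match["new"])
            )
    return found


# Two lines copied from the log excerpt above.
sample = [
    "2014-08-26 16:50:12 2014-08-26 16:50:12.119 +1000 Group 1: Link 'ethernet1/4' "
    "in link group 'SP-IF-MON' state is going from down to up",
    "2014-08-26 16:53:07 2014-08-26 16:53:07.730 +1000 Group 1: Link 'ethernet1/4' "
    "in link group 'SP-IF-MON' state is going from up to down",
]

for row in link_transitions(sample):
    print(row)
```

Run against the full log, this makes it easy to see that ethernet1/4 came up at 16:50:12 and dropped again at 16:53:07.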
Suggestion: In the current HA configuration, the failure condition is set to "any". Could you please change it to "all" and repeat the same test?
Failure condition: any >>>>>>>>>>
Group SP-IF-MON:
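To make the difference between the two settings concrete, here is a minimal sketch of how I understand the failure condition to be evaluated. It is an illustration only, not PAN-OS code; the function name `group_failed` and the dictionary layout are hypothetical.

```python
def group_failed(link_states, condition):
    """Decide whether a monitored link group counts as failed.

    link_states maps an interface name to True (up) or False (down).
    condition is "any" (one down link fails the group) or
    "all" (every link must be down before the group fails).
    """
    down = [name for name, is_up in link_states.items() if not is_up]
    if condition == "any":
        return len(down) > 0
    if condition == "all":
        return len(link_states) > 0 and len(down) == len(link_states)
    raise ValueError(f"unknown failure condition: {condition!r}")


# State of the links named in the log at 16:53:07: only ethernet1/4 is down.
states = {"ethernet1/2": True, "ethernet1/3": True, "ethernet1/4": False}

print(group_failed(states, "any"))  # True: a single down link fails the group
print(group_failed(states, "all"))  # False: two links are still up
```

With "any", the single flap on ethernet1/4 is enough to push the firewall into the non-functional state; with "all", the same event would not.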
Hope this helps.
Thanks
08-27-2014 04:04 PM
Hi Hulk,
Thanks for taking the time to analyse the logs. Once the active firewall goes into the non-functional state, it does not negotiate HA with its peer to become active again, even after the network has recovered from the failure, unless I disable the SP-IF-MON monitor. My requirement is that even a single link failure should trigger failover to the peer. This is to address an incident I had recently, so setting the condition to "all" has no value in my scenario.
Cheers,
Saeed
08-27-2014 04:07 PM
Hi,
I am using the default of 1 min. In one of the tests I waited 30 minutes and there was no change. The only cure is to disable the monitor.
Regards,
Saeed