HA failover when failing a little more?


L1 Bithead

Hello,

 

Sorry if I missed something obvious, but I need your help: I have no lab environment where I could answer my question by simply testing.

 

I have two PA-200 with HA Lite.

 

Both have an outside interface connected to a switch:

Firewall F1 with switch S1, Firewall F2 with switch S2.

S1 and S2 have an interconnection.

 

Now there are two routers (with internet uplinks) connected to the switches, on each side:

Router R1 to S1, Router R2 to S2.

 

R1
|
S1-F1
|
S2-F2
|
R2

 

The default internet uplink is via R2.

There's a PBF configured that in case R2 is not reachable, the default route goes via R1.

There's link monitoring on the FW-interfaces connected to the switches.

There's path monitoring to both of the Routers.
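
To reason about which router each firewall can still see after a failure, the topology above can be modeled as a small graph. This is only a sketch: the node names come from the description, while the `LINKS` set and the `reachable` helper are illustrative assumptions and not anything PAN-OS exposes.

```python
from collections import deque

# Links as described above: each firewall and router hangs off its own
# switch, and the two switches are interconnected.
LINKS = {("R1", "S1"), ("S1", "F1"), ("S1", "S2"), ("S2", "F2"), ("S2", "R2")}


def reachable(src, dst, links, down_nodes=frozenset()):
    """Breadth-first search: can src reach dst over the surviving links?"""
    if src in down_nodes or dst in down_nodes:
        return False
    neighbours = {}
    for a, b in links:
        if a in down_nodes or b in down_nodes:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Scenario: the S1-S2 interconnection goes down.
split = LINKS - {("S1", "S2")}
for fw in ("F1", "F2"):
    print(fw, {r: reachable(fw, r, split) for r in ("R1", "R2")})
```

With the interconnection removed, each firewall reaches only the router on its own switch, which is why a monitor that watches both routers would report a lost path on whichever firewall runs it.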

 

Assume F2 is the active partner and the interconnection between S1 and S2 goes down:

 

Active and passive firewall would not see any change in link monitoring.

The passive partner would report that the path to R2 has failed and the device would change to "failed" state (?)

 

The active firewall would see that path to R1 went down and also change to "failed" state (?)

 

But since F1, which now sees F2 as "failed", is itself in the "failed" state, I don't think a failover would occur...

Would it?

 

Now R2 also goes down...

IMHO, F2 is failing a little more, because it has lost the path to R2 _and_ R1.

 

If a failover happened now, the PBF tracker would report that the backup route is available, and traffic could be forwarded via R1.

 

How are failover decisions made? Is there documentation?

 

Would my HA pair fail over to F1?

 

 

Thank you for your help!


Cyber Elite

Hi!

 

First off, in an active-passive configuration the passive member does not perform path monitoring, so if a path disappears for the passive device because the interconnect fails, the passive member takes no action. A failover is initiated only when path monitoring fails on the primary (active) member; the secondary member then becomes active, at which point its own path monitoring is triggered.

 

I would recommend configuring path monitoring only for the router on each firewall's own side, so F1 monitors R1 and F2 monitors R2.
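
The effect of this recommendation can be checked with a small table. This is a sketch under assumptions: the `REACHABLE` table is worked out by hand from the topology in the original post, and all names (`MONITOR_BOTH`, `MONITOR_LOCAL`, `tripping_failures`) are hypothetical. A monitor is treated as failed as soon as one monitored router is lost.

```python
# Routers each firewall can still reach under each single failure,
# worked out by hand from the topology in the original post.
REACHABLE = {
    "interconnect down": {"F1": {"R1"}, "F2": {"R2"}},
    "R1 down": {"F1": {"R2"}, "F2": {"R2"}},
    "R2 down": {"F1": {"R1"}, "F2": {"R1"}},
}

# Original setup: every firewall monitors both routers.
MONITOR_BOTH = {"F1": {"R1", "R2"}, "F2": {"R1", "R2"}}
# Recommended setup: every firewall monitors only the router on its own side.
MONITOR_LOCAL = {"F1": {"R1"}, "F2": {"R2"}}


def tripping_failures(monitors, firewall):
    """List the failures in which `firewall` loses at least one monitored router."""
    return [
        failure
        for failure, reach in REACHABLE.items()
        if monitors[firewall] - reach[firewall]
    ]


print("both: ", tripping_failures(MONITOR_BOTH, "F2"))
print("local:", tripping_failures(MONITOR_LOCAL, "F2"))
```

In this model, monitoring both routers trips on every one of the three failures, while monitoring only the local router trips for F2 only when R2 itself is down.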

 

Your primary method of redundancy, however, should be the PBF monitor. It simply switches routing if the primary path goes down, so a router going down does not cause a complete HA failover: the interconnect allows the active member to divert traffic to the other router.
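
As a rough illustration of that idea, the next-hop choice can be thought of as "use the preferred router while its monitor is up, otherwise the fallback". This is a conceptual sketch only, not how PAN-OS implements PBF; `select_next_hop` and the `is_up` callback are hypothetical names.

```python
def select_next_hop(preferred, fallback, is_up):
    """Pick the preferred next hop while it is up, else the fallback, else None."""
    if is_up(preferred):
        return preferred
    if is_up(fallback):
        return fallback
    return None


# Example: R2 is the default uplink and R1 the backup; only R1 is up.
routers_up = {"R1"}
print(select_next_hop("R2", "R1", lambda router: router in routers_up))
```

The point is that the active firewall stays active the whole time; only the forwarding decision changes.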

Tom Piens
PANgurus - Strata specialist; config reviews, policy optimization

Hi reaper,

 

thank you for your reply.

 

The information that the passive member does not do path monitoring was very valuable to me.

 

I totally agree with your recommendation.

In my real scenario I have an additional connection between F1 and R2, so I'm thinking about giving F1 the higher priority so that it will always be my active firewall.

In fact, I'm thinking about not doing path monitoring on the routers at all, because I can't see any value in it any more.

 

 

To answer my own question from the beginning:

F2 was the active partner. When the interconnection between S1 and S2 goes down, F2 sees the path to R1 as down and considers itself "failed"?

 

Now F1, which wasn't doing path monitoring, becomes active (it is healthy, has all its links, sees that F2 reports itself as "failed", and isn't doing path monitoring yet)?

 

Now it does tracking to R1 and R2... But as the interconnection between S1 and S2 is down, it cannot reach R2.

So after some time it will mark itself as "failed"?

 

F2 is still in "failed"... so there's no failover back to F2, right?

 

Now the PBF tracker will find that R1 is available and F1 will forward traffic through it?

 

Is the answer:

"Yes, it will actually fail over, but already when the interconnection between the switches goes down"?

 

 

Thank you for helping me to understand.

Hi

 

In the scenario you proposed, the chain of events should look like this:

 

 

---------

 

If the primary firewall has path monitoring to both routers and either path goes down (R2 fails, R1 fails, or the link between S1 and S2 fails), it will go into a failed state and the secondary member will become active. (You can also configure the monitors as an AND operation by setting the failure condition to 'all'.)

At that moment the primary is in a passive state and no longer monitoring.

Once monitoring on the now-active secondary kicks in, it will notice a failure and, after any configured hold timers expire, it will fail and the primary will take over.

The scenario repeats: hold timers run out, the monitor fails, the device goes inactive, and the secondary once again takes over.

This continues until the 'flap' counter is exceeded, at which point the last member to exceed the count is put into a permanent non-functional state that can only be recovered by manually activating the device.

The remaining member then continues operating to the best of its capabilities despite the failed monitor.

 

----------
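
The chain of events above can be sketched as a toy simulation. It is a simplification under stated assumptions: hold timers are not modeled, `max_flaps` and `simulate_flapping` are hypothetical names, and `monitor_fails` simply states whether a member's path monitor would fail while that member is active. It follows the sequence described in this post and is not taken from PAN-OS internals.

```python
def simulate_flapping(monitor_fails, max_flaps=3, first_active="F2"):
    """Replay the failover ping-pong and return (events, final active member)."""
    peer = {"F1": "F2", "F2": "F1"}
    flaps = {"F1": 0, "F2": 0}
    active = first_active
    suspended = None  # member parked after exceeding the flap count
    events = []
    while True:
        if not monitor_fails[active]:
            events.append(f"{active} is active, monitor healthy")
            break
        if suspended is not None:
            # The peer is out of the game, so this member keeps running.
            events.append(f"{active} stays active despite failed monitor")
            break
        flaps[active] += 1
        if flaps[active] > max_flaps:
            suspended = active
            events.append(f"{active} exceeded flap count, needs manual recovery")
        else:
            events.append(f"{active} monitor failed, leaving active state")
        active = peer[active]
        events.append(f"{active} takes over")
    return events, active


# Interconnect down, both members monitor both routers: each one's monitor
# fails as soon as it becomes active.
events, final = simulate_flapping({"F1": True, "F2": True})
for line in events:
    print(line)
print("final active member:", final)
```

If the failure condition were 'all', neither monitor would fail in the interconnect scenario, because each member loses only one of its two monitored paths. In this sketch that corresponds to `simulate_flapping({"F1": False, "F2": False})`, which ends immediately without any failover.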

 

 

Tom Piens
PANgurus - Strata specialist; config reviews, policy optimization

 Hi reaper,

 

now I think I got it. Thanks for your valuable hints!

 

This brings me even more to the conclusion that path monitoring is something I don't want in this scenario.
