Query on Path monitoring

Will Path monitoring kick in if Enable HA is not selected?

One of the KBs mentions a path monitoring failure that causes a loop condition.

 

[Attached screenshots: HA.png, Link and Path Monitoring.png]


Accepted Solutions

Hi @FarzanaMustafa,

 

I believe you are confusing HA path monitoring with static route path monitoring.

https://docs.paloaltonetworks.com/pan-os/8-1/pan-os-admin/networking/static-routes/static-route-remo...

 

The purpose of HA path monitoring is to trigger a failover to the other member of the cluster when the active FW detects a problem on the monitored path. As you can imagine, if you don't have HA enabled there is no failover, so there would be little point in monitoring the path at all.

 

Static route path monitoring can be configured for each static route. Its purpose is to deactivate the static route when there is a problem with that path. This has nothing to do with HA: if the path is down, the FW simply deactivates that static route so that traffic can take the next best match in the routing table.
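To make the static route case concrete, here is a minimal Python sketch of the idea, not PAN-OS code. The `StaticRoute` class, the `monitor_up` flag and the `best_route` helper are hypothetical names invented for illustration, and the addresses are documentation-range examples. It only shows that a route whose path monitor has failed is skipped, so the lookup falls through to the next best match.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class StaticRoute:
    destination: str         # prefix, e.g. "0.0.0.0/0"
    next_hop: str
    metric: int
    monitor_up: bool = True  # hypothetical flag: result of this route's path monitor


def best_route(routes, dst_ip):
    """Pick the best active route: longest prefix first, then lowest metric."""
    addr = ip_address(dst_ip)
    # Routes whose path monitor is down are treated as removed from the table.
    candidates = [
        r for r in routes
        if r.monitor_up and addr in ip_network(r.destination)
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ip_network(r.destination).prefixlen, r.metric),
    )


routes = [
    StaticRoute("0.0.0.0/0", "203.0.113.1", metric=10),   # primary default route
    StaticRoute("0.0.0.0/0", "198.51.100.1", metric=20),  # backup default route
]

print(best_route(routes, "8.8.8.8").next_hop)  # primary while its monitor is up
routes[0].monitor_up = False                   # path monitor on the primary route fails
print(best_route(routes, "8.8.8.8").next_hop)  # lookup now falls back to the backup
```

Nothing here involves a second firewall: the decision is purely a routing-table one, which is why this feature works with or without HA.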

 

P.S. Now that I re-read your question, I think you may be referring to the failover loop. When you configure HA and enable path monitoring, the active FW pings the selected address in order to detect issues on the path. If the ping fails, the FW assumes there is a problem with the path and fails over to the secondary member. As you may know, the passive FW in a PAN cluster keeps its routing engine "disabled". This means the passive FW cannot send or receive any traffic over its dataplane interfaces, so while it is in the passive state it cannot send pings to test the path. Only when the secondary member becomes active does it start sending pings to test the path.

 

And here comes your failover loop: if the problem is not in the FW's own connection but somewhere further down the path, neither member of the cluster will be able to ping the configured IP. Unfortunately, each member discovers this only once it becomes active. So you will have:

1. Path monitor on the primary goes down.

2. Primary fails over to the secondary.

3. Secondary starts sending pings for path monitoring.

4. Path monitor on the secondary goes down (since the problem is at the next hop).

5. Secondary fails back to the primary.

6. Primary starts sending pings for path monitoring.

7. There are still issues with the next hop, so the path monitor on the primary goes down.

8. Primary fails over to the secondary.

 

This can keep going on and on. That is why the PAN FW has failover loop prevention, which is basically a counter that tracks how many failovers have occurred within a given period of time. When the count reaches the configured limit, one of the members moves to the suspended state, so the currently active member remains active even if the path monitor is still down.
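The loop and the counter that stops it can be sketched as a small Python simulation. This is only a toy model of the behaviour described above, not how PAN-OS implements it: the function name, the `max_flaps` parameter and the choice to suspend the member that hits the limit are assumptions made for illustration, and the time window of the real counter is left out.

```python
def simulate_failover_loop(path_reachable, max_flaps, max_checks=20):
    """Simulate an active/passive pair whose HA path monitor targets one IP.

    path_reachable: whether the monitored IP answers pings (the same for both
                    members, i.e. the fault is upstream of the firewalls).
    max_flaps:      hypothetical flap limit; the member whose failover reaches
                    it is suspended and the peer stays active.
    Returns (event log, final state per member).
    """
    state = {"primary": "active", "secondary": "passive"}
    flaps = 0
    events = []

    for _ in range(max_checks):
        active = next(m for m, s in state.items() if s == "active")
        peer = "secondary" if active == "primary" else "primary"

        # Only the active member can send path-monitor pings.
        if path_reachable:
            events.append(f"{active}: path monitor up, stays active")
            break

        events.append(f"{active}: path monitor down")
        if state[peer] == "suspended":
            # No eligible peer to fail over to, so the active member stays.
            events.append(f"{active}: peer suspended, stays active")
            break

        # Fail over: the peer becomes active and starts its own pings.
        flaps += 1
        state[peer] = "active"
        if flaps >= max_flaps:
            state[active] = "suspended"
            events.append(f"{active}: flap limit reached, suspended")
        else:
            state[active] = "passive"
            events.append(f"{active} -> {peer}: failover #{flaps}")

    return events, state


events, final = simulate_failover_loop(path_reachable=False, max_flaps=3)
for line in events:
    print(line)
print(final)
```

With an unreachable target, the two members trade the active role until the limit is hit, after which one is suspended and the other stays active despite the failed monitor. With a reachable target, the primary simply stays active.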

 

And that is how HA path monitoring can cause a "loop condition", aka a failover loop. However, you still need to have HA enabled for a failover loop to occur.


All Replies

@FarzanaMustafa,

@AlexanderAstardzhiev did a great job describing how this actually works, and how PAN prevents failover loops from happening with path monitoring and HA suspension. I just wanted to add that you actually can enable path monitoring or link monitoring on a device that does not have HA enabled. In that situation, a link-monitoring or path-monitoring failure is recorded in the system logs, which you could use to notify yourself of the issue. Since you don't have HA, nothing else happens at that point.
