Static routing and VPN tunnels failover/monitoring configuration with Dual ISP implementation


L1 Bithead

After upgrading a PA-220 from 9.1.18 to 10.2.x, the previously "healthy" Tunnel and Path monitors for our VPN tunnels started flapping up and down, and the tunnels were constantly re-keying on the remote end.

 

We managed to solve the re-keying issue (only IPSec was a problem, not IKE) and, as requested by PA TAC, removed one of the two monitoring mechanisms, tunnel monitoring. Only Path monitoring is left, with each of the primary-ISP tunnels being actively monitored.

Each remote end has two endpoints for redundancy, so we have six tunnels in total: two for each geo-location, for each ISP.

 

My question is whether all six tunnels should have Path monitoring enabled, or only one of the two for each location (three in total), to keep the static routes under control.

WAN2CLDW1 -> path monitor enabled

WAN2CLDW2 -> path monitor enabled?

WAN2CLDE1 -> path monitor enabled

WAN2CLDE2 -> path monitor enabled?

WAN2CLDC1 -> path monitor enabled

WAN2CLDC2 -> path monitor enabled?

 

I appreciate the constructive input. 

 

 

Cyber Elite

Hello,

I don't think you need path monitoring on the secondary VPN. If it's down and the primary is up, there is nothing to fail over to. However, I would add a higher metric cost to the secondary VPN path so it is less preferred by the firewall. This way you get failover if the primary fails, and fail back once it's back online.

 

https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000PLL8CAO

 

Hope that makes sense.
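
A minimal sketch in Python of the failover and fail-back behaviour described above. This is only a model of the idea, not PAN-OS code: the `MonitoredRoute` class, the `fail_threshold` value, and the route names are assumptions made for illustration, and real path monitoring has more settings than this.

```python
from dataclasses import dataclass


@dataclass
class MonitoredRoute:
    name: str
    metric: int
    monitored: bool = False
    fail_threshold: int = 5  # hypothetical: missed probes before the route is withdrawn
    missed: int = 0

    def record_probe(self, reply_received: bool) -> None:
        """Count consecutive missed probes; any reply resets the counter."""
        if self.monitored:
            self.missed = 0 if reply_received else self.missed + 1

    @property
    def installed(self) -> bool:
        """An unmonitored route always stays in the table."""
        return not self.monitored or self.missed < self.fail_threshold


def preferred(routes):
    """Lowest-metric route among those still installed."""
    installed = [r for r in routes if r.installed]
    return min(installed, key=lambda r: r.metric, default=None)


primary = MonitoredRoute("primary-vpn", metric=10, monitored=True)
secondary = MonitoredRoute("secondary-vpn", metric=100)  # higher metric, no monitor
routes = [primary, secondary]

history = [preferred(routes).name]
for _ in range(5):              # five missed probes in a row: primary is withdrawn
    primary.record_probe(False)
history.append(preferred(routes).name)
primary.record_probe(True)      # probe answered again: traffic fails back
history.append(preferred(routes).name)
print(history)
```

The secondary route never needs its own monitor in this model: it is only selected when the primary has already been withdrawn, so there is nothing further to fail over to.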

OtakarKlier, thank you for your comment. Yes, different metrics are used for all routes.

 

WAN2CLD1 metric 10 - Primary ISP

WAN2CLD2 metric 15

------------------------------

OPT2CLD1 metric 20  - Secondary ISP

OPT2CLD2 metric 25 

 

My question is, do I still Path Monitor each route in the tunnel set or only one of them?

 

WAN2CLD1 metric 10 w/Path Monitor to the NET1

WAN2CLD2 metric 15 with or without Path Monitor to the NET1?

Cyber Elite

Hello,

For the metrics, increase your numbers for the secondary VPN.

  • Default is 10, so that's your primary; make the secondary something like 100 or 200. All the metric tells the device is that the route is less preferred. It doesn't slow the traffic down.

Don't path monitor your second path. It doesn't need it, since there is nothing to fail over to if it goes down.

 

Also, to clarify: you have 3 different endpoints, each with two VPNs over two different ISPs?

 

Regards,

Only Primary ISP connections are Path Monitored.

 

Increasing the metric is a possible option. In my case, would the routes to NET1 look like this?

 

WAN2CLD1 metric 10 - Primary ISP

WAN2CLD2 metric 60

------------------------------

OPT2CLD1 metric 20  - Secondary ISP

OPT2CLD2 metric 70

About the endpoints: each of the three endpoints has two redundant VPN tunnels in the cloud. With two ISPs at the remote (PA) end, that makes six in total, three pairs for each ISP.

Cyber Elite

Hello,

Perhaps I am having a hard time visualizing this. Is it something like:

OtakarKlier_0-1716392312271.png

 

more like that

 

AlexanderUsach_1-1716398498111.png

 

Cyber Elite

Ah gotcha, so I think I would do something like the following:

Primary WAN - First VPN Tunnel Metric 10, Second VPN tunnel Metric 100

Secondary OPT - First VPN tunnel Metric 200, Secondary VPN tunnel Metric 300

 

Also make sure you have BFD (Bidirectional Forwarding Detection) disabled. On the NET1 router, make sure your static routes point back down the correct tunnels, so that you don't get asymmetric routing. I think your metrics are too close together.
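
To illustrate the point about routes pointing back down the correct tunnels, here is a rough Python sketch that compares the metric ordering on both ends. The tunnel labels (`WAN-1`, `OPT-1`, and so on) and the NET1-side metrics are hypothetical; only the 10/100/200/300 values come from the suggestion above.

```python
def ranking(routes):
    """Tunnels ordered from most to least preferred (lowest metric first)."""
    return sorted(routes, key=routes.get)


def is_symmetric(local_routes, remote_routes):
    """Both ends prefer the tunnels in the same order, so replies follow requests."""
    return ranking(local_routes) == ranking(remote_routes)


# Metrics suggested above for the PA side; labels are made up for this sketch.
pa_side = {"WAN-1": 10, "WAN-2": 100, "OPT-1": 200, "OPT-2": 300}

# Hypothetical NET1-side tables: one mirrors the PA side, one does not.
net1_mirrored = {"WAN-1": 10, "WAN-2": 100, "OPT-1": 200, "OPT-2": 300}
net1_mismatched = {"WAN-1": 100, "WAN-2": 10, "OPT-1": 200, "OPT-2": 300}

print(is_symmetric(pa_side, net1_mirrored))    # same preference order on both ends
print(is_symmetric(pa_side, net1_mismatched))  # NET1 would reply down WAN-2
```

The check only compares preference order, which is enough to catch the case where one end sends on the first tunnel and the other end answers on the second.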

 

Also just a question, why have two tunnels via the same ISP? Perhaps I am missing more info here.

 

Regards,

However, the two tunnels on the Primary ISP interface, with metrics 10 and 70, show very different encap/decap counts. We expect all encap and decap to happen on tunnel.1, as it is the "A S" route in the FIB. We are having issues with end-to-end ICMP probing in this region.

....

10.80.0.0/12 0.0.0.0 10 A S tunnel.1 - PRI ISP W-endpoint IP1
10.80.0.0/12 0.0.0.0 70 S tunnel.2 - PRI ISP W-endpoint IP2
10.80.0.0/12 0.0.0.0 20 S tunnel.7 - SECOND ISP W-endpoint IP11
10.80.0.0/12 0.0.0.0 80 S tunnel.8 - SECOND ISP W-endpoint IP 12

 

show vpn flow tunnel-id 1

tunnel WAN2E1
id: 1
type: IPSec
gateway id: 1
local ip: ccccccc
peer ip: yyyyyyy
inner interface: tunnel.1
outer interface: ethernet1/1
state: active
session: 46407
tunnel mtu: 1427
soft lifetime: 3569
hard lifetime: 3600
lifetime remain: 2385 sec
lifesize remain: N/A
latest rekey: 1215 seconds ago
monitor: off
monitor packets seen: 0
monitor packets reply:0
en/decap context: 615
local spi: F3FF42A4
remote spi: C8C60A2C
key type: auto key
protocol: ESP
auth algorithm: SHA1
enc algorithm: AES128
anti replay check: yes
anti replay window: 1024
copy tos: no
enable gre encap: no
initiator: no
authentication errors: 0
decryption errors: 0
inner packet warnings: 0
replay packets: 0
packets received
when lifetime expired:0
when lifesize expired:0
sending sequence: 9286
receive sequence: 0
encap packets: 1319141
decap packets: 0
encap bytes: 1380852888
decap bytes: 0
key acquire requests: 1
owner state: 0
owner cpuid: s1dp0
ownership: 1


show vpn flow tunnel-id 2

tunnel WAN2E2
id: 2
type: IPSec
gateway id: 2
local ip: ccccccc
peer ip: yyyyyyy
inner interface: tunnel.2
outer interface: ethernet1/1
state: active
session: 45593
tunnel mtu: 1427
soft lifetime: 3575
hard lifetime: 3600
lifetime remain: 1080 sec
lifesize remain: N/A
latest rekey: 2520 seconds ago
monitor: off
monitor packets seen: 0
monitor packets reply:0
en/decap context: 332
local spi: 9FFCBA57
remote spi: C50623FD
key type: auto key
protocol: ESP
auth algorithm: SHA1
enc algorithm: AES128
anti replay check: yes
anti replay window: 1024
copy tos: no
enable gre encap: no
initiator: no
authentication errors: 0
decryption errors: 0
inner packet warnings: 0
replay packets: 0
packets received
when lifetime expired:0
when lifesize expired:0
sending sequence: 839
receive sequence: 2676
encap packets: 58074
decap packets: 273829
encap bytes: 6968880
decap bytes: 32166648
key acquire requests: 1
owner state: 0
owner cpuid: s1dp0
ownership: 1

We have no control over the cloud provider's end. They always create two tunnels for each VPN connection to allow redundancy and the flexibility to reset the tunnels at will (move to a different host, etc.) without disrupting the VPN connection.
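
For anyone comparing counters across tunnels, a small Python sketch that parses the `key: value` lines of the `show vpn flow tunnel-id` output quoted above and flags a tunnel carrying traffic in one direction only. The helper names are made up for this example, and the sample text is trimmed to the fields it needs.

```python
def parse_vpn_flow(text):
    """Collect 'key: value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def one_way(fields):
    """True when packets are counted in only one direction."""
    encap = int(fields["encap packets"])
    decap = int(fields["decap packets"])
    return (encap == 0) != (decap == 0)


# Counters copied from the two outputs above.
tunnel1 = parse_vpn_flow("""tunnel WAN2E1
inner interface: tunnel.1
encap packets: 1319141
decap packets: 0
""")
tunnel2 = parse_vpn_flow("""tunnel WAN2E2
inner interface: tunnel.2
encap packets: 58074
decap packets: 273829
""")

for t in (tunnel1, tunnel2):
    print(t["inner interface"], "one-way" if one_way(t) else "two-way")
```

With these numbers tunnel.1 only encapsulates while tunnel.2 decapsulates far more than it sends, which would be consistent with the remote end returning traffic over the second tunnel, although the counters alone do not prove that.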

Cyber Elite

I think your metric values are too close together; the value can be anywhere from 1 to 65,535. For something like this I typically use OSPF with metrics in the thousands, i.e. the default metric for my preferred route and metric 10000 for the secondary route. This guarantees the proper path is taken. For your situation, perhaps something like:

 

10.80.0.0/12 0.0.0.0 10 A S tunnel.1 - PRI ISP W-endpoint IP1                     This will be used 1st
10.80.0.0/12 0.0.0.0 5000 S tunnel.2 - PRI ISP W-endpoint IP2                     This will be used 3rd
10.80.0.0/12 0.0.0.0 1000 S tunnel.7 - SECOND ISP W-endpoint IP11          This will be used 2nd
10.80.0.0/12 0.0.0.0 10000 S tunnel.8 - SECOND ISP W-endpoint IP 12       This will be used 4th

 

This way traffic from Primary WAN firewall to NET1 will go via tunnel.1

 

On the NET1 firewall, do the same thing:

0.0.0.0/0 10 A S tunnel.1 - PRI ISP Pri-WAN                     This will be used 1st
0.0.0.0/0 5000 S tunnel.2 - PRI ISP Pri-WAN2                   This will be used 3rd
0.0.0.0/0 1000 S tunnel.7 - SECOND ISP OPT                  This will be used 2nd
0.0.0.0/0 10000 S tunnel.8 - SECOND ISP Opt2                This will be used 4th
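
The intended order of use can be double-checked with a few lines of Python. This is only a sketch of lowest-metric-wins selection using the metrics proposed above; the `failover_order` helper and the `down` argument are invented for the example.

```python
# (tunnel, metric) pairs as proposed for 10.80.0.0/12.
routes = [
    ("tunnel.1", 10),
    ("tunnel.2", 5000),
    ("tunnel.7", 1000),
    ("tunnel.8", 10000),
]


def failover_order(routes, down=()):
    """Tunnels in the order they would be used, skipping any marked as down."""
    ordered = sorted(routes, key=lambda route: route[1])
    return [name for name, _metric in ordered if name not in down]


print(failover_order(routes))
print(failover_order(routes, down={"tunnel.1"}))
```

With tunnel.1 marked down, the next candidate is tunnel.7 on the second ISP rather than tunnel.2 on the same ISP, which is what the 1000 versus 5000 metrics are meant to achieve.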

We do not use PBF or advanced routing engines... so I can't disable bi-directional forwarding? TY!

Thanks a lot, OtakarKlier, for constructive feedback. 

Cyber Elite

Hello,

I hope this works for you, as this still might cause asymmetric routing, e.g. the cloud provider sending traffic down the incorrect tunnel. When I do this, I use Policy Based Forwarding; however, since you have two sites, I'm not sure how your WAN and OPT devices share routing info, and this can also cause issues.

 

Please let us know if this worked or if you have additional questions.

 

Cheers!
