VPN & SSL VPN questions - A/A cluster

L0 Member


I'm starting to set up an Active/Active cluster, and I'm looking at using ARP load sharing, as that seems to be the fault-tolerant/load-balancing option.

So here's the question: will SSL VPN (web interface) and site-to-site VPNs (with Cisco and SonicWall devices) work with ARP load sharing, or should I switch to floating IP? Also, we are going to be using these firewalls to replace our proxy server (with content filtering) and to handle AD authentication, as well as the full gamut of App-ID, User-ID, etc.

Are there any pluses or minuses to each?

And/or should ARP load sharing be used only internally, with floating IP used externally toward an ISP? Just trying to figure out the best practice here before going too far.

Message was edited by: Kyle Weir

Not applicable

Re: VPN & SSL VPN questions - A/A cluster

Arp load sharing only provides load sharing within the connected layer 2 domain.  Think carefully about what benefits you expect to achieve in this scenario – benefits of each HA interface type.  If you are uncertain – choose floating.
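To illustrate the layer 2 point, here is a minimal Python sketch. It assumes the cluster picks which firewall answers an ARP request from the parity of the requester's IP address (similar in spirit to an "IP modulo" selection). The function names and addresses are made up for illustration and this is not the firewall's actual implementation.

```python
from collections import Counter
from ipaddress import ip_address


def arp_responder(requester_ip: str, device_count: int = 2) -> int:
    """Pick the device ID that answers ARP for this requester.

    Assumption: selection is the requester's address modulo the number
    of cluster members (parity, for a two-node cluster).
    """
    return int(ip_address(requester_ip)) % device_count


def share(requesters):
    """Count how many ARP requesters land on each device."""
    return Counter(arp_responder(ip) for ip in requesters)


# Many hosts ARP for the gateway on an internal LAN: load is split.
lan_hosts = [f"10.0.0.{n}" for n in range(10, 20)]

# On an ISP handoff, only the upstream router ARPs: one device gets it all.
isp_router = ["203.0.113.1"]

print("LAN side:", dict(share(lan_hosts)))
print("ISP side:", dict(share(isp_router)))
```

Under that assumption, the split only helps where many different hosts ARP for the shared address. A single upstream router always resolves to the same firewall, which is one reason to lean toward floating IP on the ISP side.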

Are you also performing dynamic routing?  If so, is route table sync disabled?  Again, please think carefully about this scenario as it relates to the tunnels and HA connectivity.

Since you are running active–active, I assume that you have specific failure scenarios in mind.  Be sure that the VPN service configuration will meet these same requirements.

Since you are only starting to set up the cluster, understand that active-active can be much more complex than active-passive.  Advice: choose the simplest cluster method that will achieve your requirements.

L7 Applicator

Re: VPN & SSL VPN questions - A/A cluster

As dill mentions, be sure your use case is one for an Active/Active cluster.  These add significant complexity to the deployment.  They are NOT about adding capacity.  A cluster is there primarily so that if one firewall fails, traffic will still fully flow.  This means that even in an active/active deployment EACH firewall should be able to carry the full load of traffic.

The primary use case for Active/Active is when the network design requires permitting asymmetrical traffic flow.

The secondary use case is where dynamic routing protocol peers must be maintained through the inactive firewall for the network design failover.

If you don't need these conditions, then deploy in active/passive mode.

And when you open tickets, even if you meet these approved use cases, you will spend the first part of every conversation up the support chain justifying that the Active/Active deployment is necessary.  So be sure that it is.

With SSL and site-to-site VPN, you do need to use floating IP for failover and operation to work.  You cannot use ARP load sharing and maintain a tunnel endpoint.
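The advice so far can be written down as a small decision helper. This is only a sketch of the reasoning in this reply, in Python. The `Requirements` fields and the `recommend` function are hypothetical names, not anything from PAN-OS.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    asymmetric_flows: bool       # design must permit asymmetric traffic flow
    routing_peers_on_both: bool  # routing peers must be maintained through the second firewall
    terminates_vpn: bool         # SSL VPN or site-to-site tunnel endpoints on the cluster


def recommend(req: Requirements) -> dict:
    """Return a suggested HA mode and virtual address type."""
    # Neither Active/Active use case applies: keep it simple.
    if not (req.asymmetric_flows or req.routing_peers_on_both):
        return {"mode": "active/passive", "virtual_address": None}
    # Tunnel endpoints need floating IP; ARP load sharing cannot hold one.
    if req.terminates_vpn:
        return {"mode": "active/active", "virtual_address": "floating-ip"}
    # No tunnels: either type is possible, floating IP if uncertain.
    return {"mode": "active/active", "virtual_address": "floating-ip or arp-load-sharing"}


print(recommend(Requirements(False, False, True)))
print(recommend(Requirements(True, False, True)))
```

For the setup in the original question (VPNs on the cluster, no stated need for asymmetric flows), the helper lands on active/passive.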

Have a look at the active/active tech note for the details; it is old but still the most complete document on the topic.  Note how NAT, session handling and session ownership are configured in the Active/Active cluster.

Configuring Active/Active HA PAN-OS 4.0

Steve Puluka BSEET - IP Architect - DQE Communications (Metro Ethernet/ISP)
ACE PanOS 6; ACE PanOS 7; ASE 3.0; PSE 7.0 Foundations & Associate in Platform; Cyber Security; Data Center
L2 Linker

Re: VPN & SSL VPN questions - A/A cluster

I'm also gonna jump in on this.  Been Active/Active for the past 3-4 weeks, and even leaving site-to-site VPNs alone I'm still fighting strange issues with tunnels not re-keying properly and dropping for the duration of the key lifetime.  To the best of my knowledge, to make tunnels and portals work across both devices you will need to use Floating IPs.  This will enable the two devices to sync these tunnels and portals, although that still hasn't worked for me with all my devices yet.  Another thing to consider is that you have to position the PAs where they can answer a layer 2 ARP request, as this is how Floating IPs have to work, and it completely floored me.  If you are using routing protocols on your PAs to bring traffic in from your ISPs, Floating IPs will not work, as traffic destined to a peer device will not be forwarded over HA3.  See my post below for more detail.

Active/Active Floating IP/Traffic Forwarding Problem

In addition to the document from Steven, here is another document that helped me.

High Availability Synchronization

This is about the extent of the documentation provided.  There are several articles and discussions out there, but beware that Steven is absolutely correct.  You will be fighting this battle largely on your own, and documentation is scarce.  It took talking to three techs to determine my Floating IP problem, only to find out it "just wasn't designed to work that way".  Here is my latest post.  It seems pretty straightforward, but PAN won't really discuss it further until I change my HA config to always use Primary as session owner and session setup, so I've turned to the community for help troubleshooting.

Active/Active & IPSec Trouble

I will add that as far as the normal transit traffic goes, it does work and so far works great.  I'm using OSPF internally and BGP externally and HA3 takes care of any asymmetric packet flows as promised by PAN.

Good luck, and if you can avoid Active/Active... avoid it.

Not applicable

Re: VPN & SSL VPN questions - A/A cluster

What version of code are you having VPN re-key issues with? I have had a bug open for 6 months and they put a fix in 6.0.1 for re-key issues on active/active.

L2 Linker

Re: VPN & SSL VPN questions - A/A cluster

Yup, same problem.  We are running 5.0.11.

• 60201—In a HA Active/Active setup, an IPSec key renegotiation timing issue caused the new IPSec session to be set to DISCARD until the next key renegotiation. This caused traffic loss until the next tunnel key renegotiation.


I'm waiting for the 5.0.12 release though.

L0 Member

Re: VPN & SSL VPN questions - A/A cluster

Just as an FYI, after everyone's comments we've decided to go with an Active/Passive setup. It's easier, and we discovered that we'd need to redo some of the routes on our switch cores anyway if we lost a datacenter (which really needs to be fixed). In any case, the need for Active/Active was more of a pie-in-the-sky "hey, can we do that? It'd be cool." Then came the reality of a lot more work: yes, we could do it, but there isn't enough of a justification at the moment. I appreciate all of your insight and help.
