RADIUS Server failover not working via Authentication Profile

L2 Linker

I have two servers listed in my RADIUS Server Profile.

If I shut down the RADIUS service on the server that is first in the list, I do not see my firewall attempt authentication to the second server. Authentication fails.

If I completely shut down the first server in the list, I still do not see any attempts to authenticate to the second server. Authentication fails.

I have swapped IPs and have authenticated to the second server, so confirmed routing/password/port to second server is correct.

Checking the logs, I never see authentication going to the second RADIUS server, so having two servers in the RADIUS Server Profile seems to have absolutely no effect.

Has anyone tested this before and gotten it working?

Note: I do have a support ticket open, but it is going about as well as searching through the knowledge base articles on RADIUS (which I already did myself).


Accepted Solutions
L1 Bithead

Managed to solve the issue with PAN TAC support. These were the changes we made (note that we use 2FA, which is why the timeouts are longer than usual):

Changed the following settings in the GUI to allow the necessary time to connect:
FW > Network > GlobalProtect > Agent > <agent_config_name> > App
- Portal connection timeout = 90
- TCP connection timeout = 90
- TCP receive timeout = 90

Changed the GP timeout from the CLI:
# set deviceconfig setting global-protect timeout 50

 

Reduced the RADIUS timeout slightly in the GUI:
FW > Device > Server Profile > RADIUS > Timeout = 25, Retries = 1
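
As a rough sanity check, the three layers of timeouts in this solution can be compared in a few lines of Python. This is only a sketch: the server count of 2 is an assumption (this poster mentions two RADIUS servers elsewhere in the thread), and the rule that each outer timeout should be at least as long as the one inside it is an interpretation of the fix, not something TAC is quoted as saying. The variable and function names are made up for illustration.

```python
# Values reported in the accepted solution. The server count and the
# "outer timeout must cover inner timeout" rule are assumptions.
radius_timeout_s = 25
radius_retries = 1
radius_servers = 2  # assumed: two servers in the RADIUS server profile
gp_timeout_s = 50   # set deviceconfig setting global-protect timeout 50
app_timeouts_s = {
    "Portal connection timeout": 90,
    "TCP connection timeout": 90,
    "TCP receive timeout": 90,
}

# Total time the profile allows: timeout x retries x servers.
radius_total_s = radius_timeout_s * radius_retries * radius_servers


def check_layers():
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if gp_timeout_s < radius_total_s:
        problems.append(
            f"GP timeout {gp_timeout_s}s is shorter than the RADIUS "
            f"profile total of {radius_total_s}s"
        )
    for name, value in app_timeouts_s.items():
        if value < gp_timeout_s:
            problems.append(
                f"{name} ({value}s) is shorter than the GP timeout "
                f"({gp_timeout_s}s)"
            )
    return problems


if __name__ == "__main__":
    print("RADIUS profile total:", radius_total_s, "seconds")
    print("Problems found:", check_layers())
```

Under those assumptions the RADIUS profile total comes to exactly the 50 seconds set on the CLI, and the 90-second app timeouts sit comfortably above it.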



All Replies
L7 Applicator

Swapping the IPs will only test that the password/shared secret is correct for the first entry.

Just as a test, I would remove the server 1 entry and test server 2 on its own.

Of course you may be correct that it never tries server 2, but it may be worth a test.

L2 Linker

I have done the test as you suggested. Both RADIUS servers work as expected on their own, so it doesn't seem like an issue on my side.

 

Not sure what Palo Alto is using to determine if a RADIUS server is not working.

L1 Bithead

Just wondering if you managed to find a solution to your problem? I'm having the same issue...

L2 Linker

No, I opened a ticket but the "solutions" provided didn't fit in with my use case.

 

This is the answer provided to me:

1. There is a global timeout for the GlobalProtect process, which is 25 seconds by default. It must be the same as or greater than the total time that any server profile allows for connection attempts. The total time in a server profile is the timeout value multiplied by the number of retries and the number of servers. For example, if a RADIUS server profile specifies a 3-second timeout, 3 retries, and 2 servers, the total time that the profile allows for connection attempts is 18 seconds (3 x 3 x 2).

 

This is just not acceptable when using two-factor authentication.

You can adjust it via the CLI with "set deviceconfig setting global-protect timeout <range is 3-150 sec>"; it is not in any GUI.
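
The arithmetic in the support answer is easy to check with a short Python sketch. It only encodes the formula as quoted above (timeout x retries x servers) and the 3-150 second range reported for the CLI command; the function names are hypothetical and nothing here talks to a firewall.

```python
# Hypothetical helper names. The formula and the 3-150 s range come from
# the support answer quoted in this thread, not from official documentation.
GP_TIMEOUT_MIN = 3
GP_TIMEOUT_MAX = 150


def profile_total_time(timeout_s, retries, servers):
    """Total time a server profile allows for connection attempts."""
    return timeout_s * retries * servers


def gp_timeout_is_sufficient(gp_timeout_s, timeout_s, retries, servers):
    """True if the global-protect timeout covers the whole server profile."""
    if not GP_TIMEOUT_MIN <= gp_timeout_s <= GP_TIMEOUT_MAX:
        raise ValueError("global-protect timeout must be 3-150 seconds")
    return gp_timeout_s >= profile_total_time(timeout_s, retries, servers)


if __name__ == "__main__":
    # Example from the support answer: 3 s timeout, 3 retries, 2 servers.
    print(profile_total_time(3, 3, 2))            # 18
    print(gp_timeout_is_sufficient(25, 3, 3, 2))  # True
```

With longer per-server timeouts, as is typical for two-factor authentication, the product quickly exceeds the 25-second default, which is why the global timeout has to be raised.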

 

I also have no idea how this may work with multiple GP gateways... as the whole retry for that is insane.

 

One good suggestion I was given is to set the authentication protocol in your RADIUS server profile to either CHAP or PAP rather than Auto. This cuts down the overall time it takes to retry on a failed server, because with Auto the firewall queries each host for both CHAP and PAP on every retry.

 

L1 Bithead

Thank you for the detailed response!!! I'm in the same boat as you and need a longer timeout to allow for two factor authentication.

 

I figured there must be a CLI command to increase the GlobalProtect timeout to accommodate the timeout values in the RADIUS server configuration. I will give the "set deviceconfig setting global-protect timeout <range is 3-150 sec>" command a shot and see if that works for me.

 

I opened up a support case, but I suspect I'll get the same answer as you. Thanks again.

L1 Bithead

Well, I made some progress using the "set deviceconfig setting global-protect timeout" command. My 2 RADIUS servers have a 35-second timeout and 1 retry, so I set the value to 70 seconds. When I log in using the HTTPS GP portal web page, it works perfectly. But when I try to log in using the GP software client, I get a 2FA notification as if it's going to work, but then the client shows disconnected.

 

Did you have a similar problem?

L5 Sessionator


@ebonjour wrote:

1. There is a global timeout for the GlobalProtect process, which is 25 seconds by default. It must be the same as or greater than the total time that any server profile allows for connection attempts. The total time in a server profile is the timeout value multiplied by the number of retries and the number of servers. For example, if a RADIUS server profile specifies a 3-second timeout, 3 retries, and 2 servers, the total time that the profile allows for connection attempts is 18 seconds (3 x 3 x 2).

This is just not acceptable when using two-factor authentication.

You can adjust it via the CLI with "set deviceconfig setting global-protect timeout <range is 3-150 sec>"; it is not in any GUI.

I also have no idea how this may work with multiple GP gateways... as the whole retry for that is insane.


This is useful info, thanks for sharing! On one occasion I was wondering why the 2nd server in an auth profile was never queried when the 1st wasn't working. It was LDAP auth in my case, but I'd say the logic is the same.
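
One way to picture why the second server is never queried is a small timeline model in Python. This is a sketch under stated assumptions, not a description of how PAN-OS is actually implemented: it assumes the firewall tries servers strictly in list order, spends timeout x retries on a dead server before moving on, and abandons the whole attempt once the global GlobalProtect timeout expires. The function name is hypothetical, and the example values (35-second timeout, 1 retry, 2 servers) are the ones another poster reports in this thread.

```python
def servers_reached(timeout_s, retries, servers, gp_timeout_s):
    """Return the 1-based positions of servers that get at least one attempt.

    Simplified model (assumption): servers are tried in order, every server
    before the last one reached is dead and uses its full timeout x retries,
    and nothing new is started after the global timeout has expired.
    """
    reached = []
    elapsed = 0
    for position in range(1, servers + 1):
        if elapsed >= gp_timeout_s:
            break  # global timeout expired before this server was tried
        reached.append(position)
        elapsed += timeout_s * retries  # time burned waiting on a dead server
    return reached


if __name__ == "__main__":
    # 25 s global timeout (the default reported in this thread).
    print(servers_reached(35, 1, 2, gp_timeout_s=25))  # [1]
    # Global timeout raised to 70 s.
    print(servers_reached(35, 1, 2, gp_timeout_s=70))  # [1, 2]
```

In this model the first server alone uses up more than the 25-second global timeout, so the second server never sees a request, which matches the behaviour described at the start of the thread.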

 

 

L2 Linker

I never even tried it!

It just seems such a poor way to handle timeouts that I decided it wasn't worth my time... and potentially the time of someone else who would have to fix what I broke. I just hope Palo Alto goes back and tries to address this issue at some point.

