RADIUS Server failover not working via Authentication Profile


L2 Linker

I have two servers listed in my RADIUS Server Profile.

If I shut down the RADIUS service on the server that is first in the list, I do not see my firewall attempt authentication to the second server. Authentication fails.

If I completely shut down the first server in the list, I do not see any attempts to authenticate to the second server. Authentication fails.

I have swapped the IPs and authenticated to the second server, so I have confirmed that the routing, password, and port for the second server are correct.

Checking the logs, I never see authentication going to the second RADIUS server, so having two servers in the RADIUS Server Profile seems to have absolutely no effect.

 

Has anyone tested this before and gotten it working?

 

Note: I do have a support ticket open, and I have already searched through the knowledge base articles on RADIUS myself.

1 accepted solution

Accepted Solutions

Managed to solve the issue with PAN TAC support. These were the changes we made (note that we use 2FA, which is why the timeouts are longer than usual):

 

- We changed the following configuration in the GUI to allow the necessary time to connect:
From FW > Network > GlobalProtect > Agent > <agent_config_name> > App
- Portal connection timeout = 90
- TCP connection timeout = 90
- TCP receive timeout = 90

Changed the GP timeout from the CLI:
# set deviceconfig setting global-protect timeout 50

 

Reduced the RADIUS timeout slightly in the GUI:
FW > Device > Server Profile > RADIUS > Timeout = 25, Retries = 1


14 REPLIES

L7 Applicator

Swapping the IPs will only test that the password/shared secret is correct for the first entry.

 

Just as a test, I would remove the server 1 entry and test server 2 on its own.

 

Of course, you may be correct that it never tries server 2, but it may be worth a test.

I have done the test as you suggested, both RADIUS servers work as expected so it doesn't seem like an issue on my side.

 

Not sure what Palo Alto is using to determine if a RADIUS server is not working.

Just wondering if you managed to find a solution to your problem? I'm having the same issue...

No, I opened a ticket but the "solutions" provided didn't fit in with my use case.

 

This is the answer provided to me

 

There is a global timeout for the GlobalProtect process, which is 25 seconds by default. It must be the same as or greater than the total time that any server profile allows for connection attempts. The total time in a server profile is the timeout value multiplied by the number of retries and the number of servers. For example, if a RADIUS server profile specifies a 3-second timeout, 3 retries, and 2 servers, the total time that the profile allows for connection attempts is 18 seconds (3 x 3 x 2).

 

This is just not acceptable when using two factor authentication.

You can adjust it via the CLI with "set deviceconfig setting global-protect timeout <range is 3-150 sec>"; it is not exposed anywhere in the GUI.
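The arithmetic in the support answer can be sketched in a few lines of Python. This is only an illustration of the rule as quoted in this thread (timeout x retries x servers must fit inside the GlobalProtect timeout, which accepts 3 to 150 seconds). The function names are made up for this example and are not part of any PAN-OS tool or API.

```python
# Sketch of the timeout rule quoted above. All names here are hypothetical.
GP_TIMEOUT_MIN = 3    # lower bound of the CLI range quoted in the thread
GP_TIMEOUT_MAX = 150  # upper bound of the CLI range quoted in the thread


def profile_total_seconds(timeout_s: int, retries: int, servers: int) -> int:
    """Worst-case time a server profile allows for connection attempts."""
    return timeout_s * retries * servers


def gp_timeout_is_sufficient(gp_timeout_s: int, timeout_s: int,
                             retries: int, servers: int) -> bool:
    """True if the GlobalProtect timeout covers the whole server profile."""
    if not GP_TIMEOUT_MIN <= gp_timeout_s <= GP_TIMEOUT_MAX:
        raise ValueError("GlobalProtect timeout must be between 3 and 150 seconds")
    return gp_timeout_s >= profile_total_seconds(timeout_s, retries, servers)


if __name__ == "__main__":
    # Example from the support answer: 3 s timeout, 3 retries, 2 servers.
    print(profile_total_seconds(3, 3, 2))
    # A 25 s GlobalProtect timeout covers that profile.
    print(gp_timeout_is_sufficient(25, 3, 3, 2))
    # It does not cover two servers with a 35 s timeout and 1 retry each.
    print(gp_timeout_is_sufficient(25, 35, 1, 2))
```

With 2FA-sized RADIUS timeouts, the profile total quickly exceeds 25 seconds, which is why the GlobalProtect timeout has to be raised as well.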

 

I also have no idea how this may work with multiple GP gateways... as the whole retry for that is insane.

 

One good suggestion I received is to set the authentication protocol in your RADIUS server profile to either CHAP or PAP rather than AUTO. This cuts down the overall time it takes to move past a failed server, because with AUTO the firewall queries each host for both CHAP and PAP on every retry.

 

Thank you for the detailed response!!! I'm in the same boat as you and need a longer timeout to allow for two factor authentication.

 

I figured there must be a CLI command to increase the GlobalProtect timeout to accommodate the timeout values in the RADIUS server configuration. I will give the "set deviceconfig setting global-protect timeout <range is 3-150 sec>" command a shot and see if that works for me.

 

I opened up a support case, but I suspect I'll get the same answer as you. Thanks again.

Well, I made some progress using the "set deviceconfig setting global-protect timeout" command. My 2 RADIUS servers each have a 35-second timeout and 1 retry, so I set the value to 70 seconds. When I log in using the HTTPS GP web portal, it works perfectly. But when I try to log in using the GP software client, I get a 2FA notification as if it's going to work, but then the client shows disconnected.

 

Did you have a similar problem?



This is useful info, thanks for sharing! I was wondering on one occasion why the 2nd server in the auth profile was never queried when the 1st wasn't working. It was LDAP auth in my case, but I'd say the logic is the same.

 

 

I never even tried it!

Seems just such a poor way to handle timeouts that I just decided it wasn't worth my time... and potentially the time of someone else that had to fix what I broke.  I just hope Palo Alto goes back and tries to address this issue at some point.

Managed to solve the issue with PAN TAC support. These were the changes we made (note that we use 2FA, which is why the timeouts are longer than usual):

 

- We changed the following configuration in the GUI to allow the necessary time to connect:
From FW > Network > GlobalProtect > Agent > <agent_config_name> > App
- Portal connection timeout = 90
- TCP connection timeout = 90
- TCP receive timeout = 90

Changed the GP timeout from the CLI:
# set deviceconfig setting global-protect timeout 50

 

Reduced the RADIUS timeout slightly in the GUI:
FW > Device > Server Profile > RADIUS > Timeout = 25, Retries = 1

So after 25 seconds does the second RADIUS server get queried?

Yes, it does. The only negative byproduct of this solution is that if both RADIUS servers are available and the user doesn't enter their 2FA within the first 25 seconds, they will receive another 2FA when the 2nd RADIUS server is queried.

 

But otherwise, if RADIUS server 1 is down, after 25 seconds RADIUS server 2 is queried and the user is able to log in.
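The behaviour described here can be modelled with a small Python sketch. It is a simplified model of what this thread reports, not PAN-OS's actual implementation. It assumes that "retries" counts attempts per server (as the support formula implies), that servers are tried strictly in order, and that a reachable server answers immediately (so the 2FA delay is ignored). The function name is hypothetical.

```python
from typing import List, Optional, Tuple


def failover_timeline(timeout_s: int, retries: int, servers_up: List[bool],
                      gp_timeout_s: int) -> Tuple[List[Tuple[int, int]], Optional[int]]:
    """Model which RADIUS servers get queried before GlobalProtect gives up.

    Returns (attempts, answering_server), where attempts is a list of
    (start_second, server_index) pairs and answering_server is the index of
    the server that answered, or None if the GlobalProtect timeout ran out.
    """
    attempts: List[Tuple[int, int]] = []
    clock = 0
    for index, is_up in enumerate(servers_up):
        for _ in range(retries):
            if clock >= gp_timeout_s:
                # GlobalProtect has already given up; no further queries.
                return attempts, None
            attempts.append((clock, index))
            if is_up:
                return attempts, index
            # Server is down: wait out the full RADIUS timeout.
            clock += timeout_s
    return attempts, None


if __name__ == "__main__":
    # Settings from the accepted solution: 25 s timeout, 1 retry, GP timeout 50.
    print(failover_timeline(25, 1, [False, True], 50))
    # Same profile with the 25 s GlobalProtect timeout quoted as the default.
    print(failover_timeline(25, 1, [False, True], 25))
```

Under these assumptions the second server is queried at the 25-second mark when the GlobalProtect timeout is 50, and is never queried when the GlobalProtect timeout is only 25, which matches the symptom in the original question.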

Thanks for sharing the solution. I was testing RADIUS failover and it worked after changing the timeout and retries.

L2 Linker

I got it to work by adjusting ONLY the TCP receive timeout in GlobalProtect and the CLI-based setting:

# set deviceconfig setting global-protect timeout xx

I have the RADIUS server timeout set to 45 seconds, 1 retry. The global-protect and TCP receive timeout is set to 90 seconds. This is working for a primary and secondary RADIUS server. If I had 3 servers, I would probably need to increase the timeouts further to be 3 x the RADIUS timeout.
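As a rough check of how far this scales, the sketch below combines the 45-second, 1-retry profile from this post with the 3-150 second range quoted earlier in the thread for the global-protect timeout. The helper names are invented for illustration, and the 150-second ceiling is taken from the thread rather than verified independently.

```python
GP_TIMEOUT_MAX = 150  # upper bound quoted earlier in the thread


def required_gp_timeout(radius_timeout_s: int, retries: int, servers: int) -> int:
    """Smallest GlobalProtect timeout that covers every server in the profile."""
    return radius_timeout_s * retries * servers


def max_servers_within_limit(radius_timeout_s: int, retries: int,
                             limit_s: int = GP_TIMEOUT_MAX) -> int:
    """How many servers fit before the required timeout exceeds the limit."""
    return limit_s // (radius_timeout_s * retries)


if __name__ == "__main__":
    for servers in (2, 3, 4):
        needed = required_gp_timeout(45, 1, servers)
        verdict = "ok" if needed <= GP_TIMEOUT_MAX else "exceeds the 150 s range"
        print(f"{servers} servers -> {needed} s ({verdict})")
    print(max_servers_within_limit(45, 1))
```

So with a 45-second RADIUS timeout, a third server would still fit (135 seconds), but a fourth would need either a shorter RADIUS timeout or fewer retries.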

 

This is also working for GUI sessions. I haven't been able to get PuTTY sessions to work, and I haven't found a way to adjust the PuTTY login timer.

 

L0 Member

I've tested in the manner you suggested and both servers are working just as expected, so the servers themselves don't appear to be causing the issue. I'm not sure what method Palo Alto is using to check whether a RADIUS server isn't working.
