User-ID issues with multiple domain controllers


L2 Linker

Hi,

I have a few questions about how User-ID works that I have been unable to answer.

We are currently rolling out a lot of virtual systems to our customers in an MSSP environment and, as you can imagine, coming across some strange server setups. This has resulted in some strange User-ID behaviour.

I am trying to work out how User-ID behaves in a setup with two 2003 domain controllers and a single 2008 domain controller. I have installed the agent on all three servers for redundancy, and every agent points at all the DCs. I have set up the User-ID agents on the PA device and they show as connected. The agents all have the same settings configured.

This is where I run into trouble, and I am curious to know how the process works.

From what I understand, the User-ID agent is meant to sit on any server, query all the domain controllers and read their security event logs to generate the IP address -> username mapping.

Is there any reason why an agent sitting on a 2008 domain controller would have trouble reading logs on a 2003 domain controller, and vice versa? From my understanding, the agents are not aware of each other, only of the domain controllers that have been configured in their settings.

Why do agents sitting on different server versions return different mappings? The agents on the two 2003 servers show the same mappings, but the agent on the 2008 server shows a different set. This is very confusing.

How does the PA work out which agent to take the mapping from? Does it just look at the first agent in its list, or does it combine the mappings from all the agents for that virtual system and use that? If two or more agents return conflicting results, how does it determine which one to use?

If anyone could shed some light on this it would be greatly appreciated.

Thanks

1 REPLY

L6 Presenter

The regular method of using dedicated userid/pan-agent servers will most likely not scale well for enterprise use. You might consider installing the userid/pan-agent service on each DC and restricting it to follow only the security log on localhost.

That is because the userid/IP mapping is built by tailing the security log (events 672, 673 and 674 on Win2003 DCs, and events 4768, 4769 and 4770 on Win2008 DCs). When a user logs in to the domain, only the DC that authenticated the user logs the event (the security log is not replicated within the domain for some odd reason), so an agent has to follow every DC to see every login.
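To make the log-tailing idea concrete, here is a minimal sketch in Python. It is not how the real agent is implemented: the event dictionaries, the `build_mapping` helper and the sample addresses are made up for illustration, and only the event IDs come from the paragraph above.

```python
# Kerberos event IDs named in the post, keyed by DC version.
KERBEROS_EVENT_IDS = {
    "2003": {672, 673, 674},
    "2008": {4768, 4769, 4770},
}


def build_mapping(events, dc_version):
    """Return {ip: user} from event dicts; the newest matching event wins."""
    wanted = KERBEROS_EVENT_IDS[dc_version]
    mapping = {}
    for event in sorted(events, key=lambda e: e["time"]):
        if event["event_id"] in wanted:
            mapping[event["ip"]] = event["user"]
    return mapping


# Hypothetical log excerpts: each DC only records the users it authenticated.
dc1_events = [
    {"time": 1, "event_id": 672, "ip": "10.0.0.5", "user": "corp\\alice"},
    {"time": 2, "event_id": 538, "ip": "10.0.0.9", "user": "corp\\eve"},  # not a listed ID, skipped
]
dc3_events = [
    {"time": 3, "event_id": 4768, "ip": "10.0.0.6", "user": "corp\\bob"},
]

seen_by_dc1 = build_mapping(dc1_events, "2003")
seen_by_dc3 = build_mapping(dc3_events, "2008")

# Only by following both logs does the full picture appear.
combined = {**seen_by_dc1, **seen_by_dc3}
print(combined)
```

An agent that can read only one of the two logs would report only that DC's share of the users, which is one way different agents can end up with different mapping tables.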

With a network of just 2 DCs this is not a problem (even with 10,000 concurrent users). But if you have, let's say, 50 DCs (with 10,000 concurrent users) and 2 PA devices, the math will be something like:

- Dedicated userid/pan-agent servers (tailing the security log of each DC over the network):

50 * 5 Mbit/s * 2 + 2 * 0-100 kbit/s ~= 500 Mbit/s network load (5 Mbit/s is the estimated network load to follow a single DC in an environment with approx. 10,000 concurrent users).

vs.

- Userid/pan-agent installed on each DC (and configured to read only the local security log):

50 * 0-100 kbit/s * 2 ~= 0-10 Mbit/s network load (each PA device, 2 of them in this example, connects directly to each DC, which runs the userid/pan-agent service).
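The same arithmetic can be checked with a few lines of Python. The figures are the estimates from the post, using the upper bound of 100 kbit/s for the agent-to-firewall traffic. The variable names are mine, and reading the factor of 2 in the first formula as "two dedicated agents" is an assumption.

```python
DCS = 50
PA_DEVICES = 2
DEDICATED_AGENTS = 2          # assumption: one dedicated agent per PA device
LOG_TAIL_KBIT = 5000          # ~5 Mbit/s to follow one DC's security log
AGENT_TO_PA_KBIT_MAX = 100    # upper bound of the 0-100 kbit/s estimate

# Dedicated agents: every agent tails every DC over the network,
# plus a small agent-to-firewall stream per agent.
dedicated_kbit = DCS * LOG_TAIL_KBIT * DEDICATED_AGENTS + DEDICATED_AGENTS * AGENT_TO_PA_KBIT_MAX

# Agent on each DC: logs are read locally, so only the
# agent-to-firewall streams cross the network.
local_kbit = DCS * AGENT_TO_PA_KBIT_MAX * PA_DEVICES

print(f"dedicated agents: {dedicated_kbit / 1000:.1f} Mbit/s")
print(f"agent on each DC: {local_kbit / 1000:.1f} Mbit/s")
```

With these inputs the dedicated-agent design comes out at roughly 50 times the network load of the local-agent design.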

That said, there are a number of quirks that one must be able to handle. One is how large a TTL/cache timeout to use. Too small a value raises the bandwidth needed for the userid traffic (along with CPU usage on the management plane), while too large a value means the wrong user might be logged. For example, user1 logs in and then unplugs their laptop, and user2 (who is already logged in) plugs their laptop into the network and gets the same IP as user1. With a TTL of 1 hour, the log in the PA might display the incorrect user for up to 1 hour.

So the fix here is to choose the smallest TTL/cache timeout that your bandwidth and management-plane CPU can tolerate.
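The trade-off can be illustrated with a toy cache. This is only a sketch of the idea, not the firewall's actual data structure; the `MappingCache` class and the timings are invented for the example.

```python
class MappingCache:
    """Toy IP -> user cache in which entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # ip -> (user, time the mapping was learned)

    def learn(self, ip, user, now):
        self._entries[ip] = (user, now)

    def lookup(self, ip, now):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        user, learned = entry
        if now - learned >= self.ttl:
            del self._entries[ip]  # stale: forget it rather than guess
            return None
        return user


# user1 logs in at t=0 and later unplugs; user2 takes over the same IP
# without generating a new login event. Traffic is logged at t=1200.
long_ttl = MappingCache(ttl_seconds=3600)
short_ttl = MappingCache(ttl_seconds=300)
for cache in (long_ttl, short_ttl):
    cache.learn("10.0.0.5", "user1", now=0)

print(long_ttl.lookup("10.0.0.5", now=1200))   # still attributed to user1
print(short_ttl.lookup("10.0.0.5", now=1200))  # mapping expired, user unknown
```

The short TTL does not identify user2 any better; it only stops the stale attribution sooner, at the cost of more frequent refreshes.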

Another way to track users better is to enable WMI lookups along with inspection of other server logs (for example, the logs of an MS Exchange server). This gives you more sources to keep your user-IP mapping up to date.

Another quirk, which comes from how AD works, is that there doesn't seem to be any reliable logout event, basically because users can hibernate, unplug their box and so on. A workaround is to use the XML API of the PA, perhaps with your own syslog tail script that inspects ARP logs from switches and similar sources in order to invalidate specific IP addresses.

The above is for tracking users who are logged in to an AD domain.

Another tracking method is the ts-agent. However, the ts-agent is meant to be used on Citrix servers and works by informing the PA box which user is currently logged in and which srcport range is associated with that user. On a Citrix server each client gets a dedicated srcport range, which gives the ts-agent a better "hit rate" than the AD-tracking pan-agent, but the two are meant for different purposes.

I think GlobalProtect (without the VPN part) can also be used to inform your PA devices who is logged in at a specific srcip.
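As a rough sketch of the XML API workaround, the snippet below builds a logout message for a list of stale mappings. The `uid-message` layout follows the User-ID XML API format as I remember it, so check it against the API documentation for your PAN-OS version. The helper name and the sample user and IP are made up, and actually sending the message to the firewall (with an API key) is left out.

```python
import xml.etree.ElementTree as ET


def build_logout_message(entries):
    """Build a User-ID 'logout' message for (user, ip) pairs.

    Assumed format; verify against your PAN-OS version's API docs.
    """
    root = ET.Element("uid-message")
    ET.SubElement(root, "version").text = "1.0"
    ET.SubElement(root, "type").text = "update"
    payload = ET.SubElement(root, "payload")
    logout = ET.SubElement(payload, "logout")
    for user, ip in entries:
        ET.SubElement(logout, "entry", {"name": user, "ip": ip})
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: an IP that a syslog/ARP watcher decided has gone away.
stale = [("corp\\user1", "10.0.0.5")]
message = build_logout_message(stale)
print(message)
```

A syslog tail script would call something like this whenever a switch reports that a port went down or an ARP entry changed, and then post the result to the firewall.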
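The port-range idea can be sketched as a simple lookup. The table layout, the port ranges and the `user_for` function are all invented for illustration; the real ts-agent chooses and reports its own ranges.

```python
# Hypothetical table: terminal server IP -> list of (low port, high port, user).
PORT_RANGES = {
    "10.0.1.10": [
        (20000, 20999, "corp\\alice"),
        (21000, 21999, "corp\\bob"),
    ],
}


def user_for(src_ip, src_port):
    """Return the user owning src_port on src_ip, or None if unknown."""
    for low, high, user in PORT_RANGES.get(src_ip, []):
        if low <= src_port <= high:
            return user
    return None


print(user_for("10.0.1.10", 20500))
print(user_for("10.0.1.10", 21500))
print(user_for("10.0.1.10", 30000))
```

Because the lookup uses the source port as well as the IP, several users sharing one server address can still be told apart, which a plain IP -> user mapping cannot do.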
