This is not the first time I have faced this kind of issue:
Context: Palo Alto firewall with (multiple) User-ID agents in a single (or multiple) Microsoft domain and User-ID based security policies.
At a glance the User-ID feature seems to be working well, but sometimes it seems to "lose focus" on several source IP addresses (users).
For example, at instant t, IP x.x.x.x is identified with user A. Suddenly, at instant t + delta t (random), IP x.x.x.x is no longer identified (no source user). Then, after another delta t (random), IP x.x.x.x is identified again with the same user A (or sometimes with another user).
You can see an occurrence in the extracted logs below.
We can see that IP 10.35.111.103 is identified with source user domain\johnd, then suddenly the IP has no user for around 30 minutes (and the appropriate rule is obviously no longer matched). After that, the IP is associated with the correct user again.
I have seen this behaviour many times, with end-customer impact ranging from "no matter, that seems to be working fine" to "it doesn't work! fix it!", obviously depending on how the security policies are configured.
In this particular example, we are running PAN-OS 7.1.7 on a PA-5050 cluster, with UserID agents release 18.104.22.168
Have you ever faced such an issue?
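To measure how often and for how long a mapping drops, something like the following could be run over an export of the traffic logs. This is only a rough sketch: the record layout (timestamp, source IP, source user) and the sample rows are made up for illustration and are not the logs from this case; a real export would need its own parsing.

```python
from datetime import datetime

# Hypothetical, simplified records: (timestamp, source IP, source user or None).
# The rows are invented sample data that mimic the pattern described above.
SAMPLE_LOGS = [
    ("2017-03-01 09:00", "10.35.111.103", "domain\\johnd"),
    ("2017-03-01 09:10", "10.35.111.103", None),
    ("2017-03-01 09:25", "10.35.111.103", None),
    ("2017-03-01 09:40", "10.35.111.103", "domain\\johnd"),
]


def find_mapping_gaps(records):
    """Return (ip, gap_start, gap_end, user_before, user_after) for every
    period where an IP that previously had a user appears without one."""
    last_user = {}  # ip -> last user seen for that IP
    gap_start = {}  # ip -> time of the first record without a user
    gaps = []
    # Sort on the timestamp text only; this works for YYYY-MM-DD HH:MM strings.
    for ts_text, ip, user in sorted(records, key=lambda r: r[0]):
        ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M")
        if user is None:
            # Only count it as a gap if the IP was identified before.
            if ip in last_user and ip not in gap_start:
                gap_start[ip] = ts
        else:
            if ip in gap_start:
                gaps.append((ip, gap_start.pop(ip), ts, last_user[ip], user))
            last_user[ip] = user
    return gaps


for ip, start, end, before, after in find_mapping_gaps(SAMPLE_LOGS):
    minutes = int((end - start).total_seconds() // 60)
    print(f"{ip}: no user for {minutes} min (before: {before}, after: {after})")
```

Comparing the gap lengths with the configured mapping timeout might help confirm or rule out a timeout as the cause.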
How does the end user recover from it? I suspect a timeout on the user mapping; if so, the mapping would come back as soon as the end user re-authenticates.
You could try increasing the timeout value to keep the user mappings alive longer and/or enabling server session read. You could also try enabling WMI probes, but be careful with these: a failed probe removes the user mapping, so you need to make sure the client computers are set up to respond to them.
The best User-ID method is to have GlobalProtect set up for internal host detection and internal gateways.
Hope this helps.
Many thanks for your help and advice.
We are not using WMI probing because, as you pointed out, it requires configuration on the endpoints. With more than 5000 hosts on a "not that powerful" network, we also do not want to risk any impact on network performance.
Nor do we want to deploy GlobalProtect, because we do not want to add more components on the endpoint side (we already have another VPN solution with its own components).
I will try increasing the auth cache timeout and hope that fixes it, keeping in mind the drawback that a logged-out user will still be identified by User-ID if the endpoint keeps generating network traffic.
FWIW, in my experience, WMI probing has more issues than just making sure the endpoint can reply. I've actually had multiple PA TAC engineers and SEs tell me not to use it at all. What I noticed early on was that the logs were filled with reports of the WMI queue being full, so probes were no longer being queued, which effectively makes it futile.
I would have to agree on not using WMI if you can help it. It's very chatty and all data is sent in clear text, so anyone can potentially sniff it. I do agree that increasing the timeout is probably your best bet, as I have seen similar behaviour in the past and increasing the timeout helped. In one case I had to disable the timeout altogether because of dropped usernames. It makes things a bit difficult when writing rules around users, AD groups, etc.
Many thanks for your follow-up and advice.
I just submitted a change request to increase the cache timeout from the default 45 min to 120 min.
I will see in a few days whether this has fixed my issue.
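As a rough illustration of why a longer timeout can hide these gaps, here is a small sketch. It assumes a deliberately simplified model in which a mapping stays valid for a fixed number of minutes after the last event that refreshed it; the function name and the event times are hypothetical, and the real agent may refresh mappings from sources that are not modelled here.

```python
def unmapped_intervals(refresh_minutes, timeout, horizon):
    """Return the (start, end) minute ranges within [0, horizon) during which
    an IP has no user, assuming each refresh event keeps the mapping valid
    for `timeout` minutes (a simplified model, not the agent's actual logic)."""
    gaps = []
    covered_until = 0  # the mapping is unknown before the first event
    for event in sorted(refresh_minutes):
        if event >= horizon:
            break
        if event > covered_until:
            # The previous mapping expired before this event arrived.
            gaps.append((covered_until, event))
        covered_until = max(covered_until, event + timeout)
    if covered_until < horizon:
        gaps.append((covered_until, horizon))
    return gaps


# Hypothetical case: a logon event at minute 0, the next one only at minute 75.
events = [0, 75]
print(unmapped_intervals(events, timeout=45, horizon=120))   # 45 min timeout
print(unmapped_intervals(events, timeout=120, horizon=120))  # 120 min timeout
```

In this model the 45-minute timeout leaves the IP unmapped from minute 45 to minute 75, while the 120-minute timeout leaves no gap. The same model also shows the drawback mentioned earlier: with the longer timeout, the mapping stays in place for up to 120 minutes after the user's last event, even if the user has logged out.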