I'm trying to gather some hints/tips to bring some stability to our User Identification infrastructure. We are currently running 2 x User-ID Agents (4.1.6-5) on two virtual devices. After we upgraded to this version 2 months ago the setup looked stable, but for the last 2 weeks we have had to restart the agents daily because they both stop working.
Each agent connects to the same 14 firewalls (all 5060s), some in Active/Standby mode. Agent 1 is the primary and Agent 2 is set up as a secondary agent on all firewalls. Each agent talks to around 30-40 DCs, with around 15,000 users.
Any thoughts on what could be improved in this setup?
Though I tend to avoid upgrading without knowing the root cause, your environment is big enough that I would recommend it here.
From the release notes of the 4.1.7-2 User-ID Agent (see the first and third bugs in particular):
Addressed Issues 4.1.7
The following issues have been addressed in this release:
• 46405 – The User-ID agent on a Windows 2008 server was intermittently failing to respond when the directory contained 50,000+ users, causing valid user to IP mapping information to be deleted on the firewall. This occurred when the session limit of the firewall was being reached. Issue was due to a buffer problem that occurred when trying to write the user to IP mapping to the firewall.
• 45899 – User-ID mapping information was being dropped for Windows clients who stayed logged in for an extended period of time. This occurred intermittently when WMI probing was used.
• 43865 – User-ID agent was causing high CPU and memory utilization on the Active Directory server and eventually stopped updating user to IP mapping of clients. Issue was due to a memory leak with the User-ID agent that occurred when a large number of WMI queries were performed on 7000 plus users.
You could definitely be hitting one or more of these, so it might be worth upgrading at least one of the pair of virtual devices to see if there is any improvement.
You may want to check to see what the memory looks like on these virtual systems too, just to see if it's likely you're running into these.
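If it helps, here is a rough sketch of how you could judge whether the agent's memory is creeping up over time rather than just spiking. It assumes you have already collected a few (hours since start, memory in MB) readings for the agent process by whatever means you prefer (Task Manager, perfmon, and so on). The function names and the 5 MB/hour threshold are my own illustrative choices, not anything from Palo Alto.

```python
from statistics import mean


def growth_per_hour(samples):
    """Least-squares slope of memory use, in MB per hour.

    samples: list of (hours_since_start, memory_mb) pairs taken at
    two or more distinct times.
    """
    times = [t for t, _ in samples]
    mems = [m for _, m in samples]
    t_bar, m_bar = mean(times), mean(mems)
    denom = sum((t - t_bar) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples taken at two or more distinct times")
    return sum((t - t_bar) * (m - m_bar) for t, m in samples) / denom


def looks_like_leak(samples, threshold_mb_per_hour=5.0):
    # The threshold is an arbitrary illustrative value; tune it to your servers.
    return growth_per_hour(samples) > threshold_mb_per_hour


# Hypothetical readings: steady growth versus a flat, noisy baseline.
growing = [(0, 100), (1, 110), (2, 120), (3, 130)]
flat = [(0, 100), (1, 101), (2, 100), (3, 101)]
print(looks_like_leak(growing), looks_like_leak(flat))
```

A steady positive slope across a day of readings would be consistent with the memory leak described in 43865, though it would not prove that is the bug you are hitting.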
Hope this helps.
I was reading the notes for 4.1.7 and was planning on doing that in a week or so, after my company-wide change freeze is over. Are there any other suggestions you can put forward?
I had the same instability issues with 4.1.6 and 4.1.7. The only solution was to install 5.0.3-4 (recommended by Palo Alto support), and now it works flawlessly. They warned me that this is not an officially supported configuration, as all my firewalls are running 4.1.x.
Interesting that version 5.x agents work against 4.1 devices. I have never had too much luck with backward/forward compatibility between different parts of the PA platform (mainly in the threat management space). I will update one agent and see how that goes.
OK, I created 2 new servers running the latest agents and manually set them up for equal load. The servers running the older version 4 User-ID agent are still having issues, while the version 5 servers are stable.
I am upgrading the original servers to User-ID agent 5.x tonight. Thanks for the info Alex, it worked well.