PA2020 High CPU utilization "useridd" 100% management plane



Dear all,

My PA2020 has 2 User-ID agents identifying my AD users, but the management plane has been running at 100% CPU all day long.

Any suggestions?

Please find the show resources output below.

PA2020 running OS 5.0.2

top - 18:26:05 up 6 days,  1:33,  1 user, load average: 10.26, 11.02, 12.17  <<<<<<<<<<<<<<<< !!!!!

Tasks: 100 total,   2 running,  98 sleeping,   0 stopped,   0 zombie

Cpu(s): 51.9%us, 46.0%sy,  2.1%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:    995872k total,   901792k used,    94080k free,     5996k buffers

Swap:  2212876k total,   647316k used,  1565560k free,   179620k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

2373 root      20   0  209m  72m  63m S  140  7.5  10861:51 useridd<<<<<<<<<<<<<<<<< 140% CPU !!!!

21021 nobody    20   0  429m  51m 4808 S   37  5.3 329:34.26 appweb3

2042 root      30  10  4468  964  792 R    4  0.1   0:00.12 top

2371 root      20   0  651m 210m 4076 S    4 21.6 118:50.34 mgmtsrvr

1720 admin     20   0  4532 1164  912 R    1  0.1   0:02.64 top

2405 root      20   0  355m  89m 2192 S    1  9.2  48:59.31 logrcvr

2142 root      15  -5 39636 2920 1240 S    1  0.3 106:28.41 sysd

2151 root      30  10 40568 3644 1692 S    0  0.4  21:50.38 python

2408 root      20   0  247m 2480 1628 S    0  0.2   5:39.85 varrcvr

2415 root      20   0  141m 2640 1760 S    0  0.3   1:17.82 routed

    1 root      20   0  1836  560  536 S    0  0.1   0:02.30 init
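For anyone who wants to keep an eye on this over time, here is a minimal sketch that picks the busy processes out of saved `top` text like the output above. It assumes the standard 12-column process row layout shown in this post; the function name `busy_processes` and the 30% threshold are made up for illustration and are not part of PAN-OS.

```python
import re

# One process row of `top`: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND.
# The command name stops at the first character that is not a letter, digit,
# underscore, dot or hyphen, so trailing annotations such as "<<<<" are ignored.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+[A-Z]\s+"
    r"(?P<cpu>\d+(?:\.\d+)?)\s+(?P<mem>\d+(?:\.\d+)?)\s+\S+\s+"
    r"(?P<command>[A-Za-z0-9_.-]+)"
)


def busy_processes(top_text, cpu_threshold=30.0):
    """Return (command, pid, cpu) tuples at or above the threshold, busiest first."""
    hits = []
    for line in top_text.splitlines():
        match = ROW.match(line)
        if match and float(match.group("cpu")) >= cpu_threshold:
            hits.append(
                (match.group("command"), int(match.group("pid")), float(match.group("cpu")))
            )
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Three rows copied from the output in this post.
SAMPLE = """\
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2373 root      20   0  209m  72m  63m S  140  7.5  10861:51 useridd<<<< 140% CPU !!!!
21021 nobody    20   0  429m  51m 4808 S   37  5.3 329:34.26 appweb3
 2142 root      15  -5 39636 2920 1240 S    1  0.3 106:28.41 sysd
"""

if __name__ == "__main__":
    for command, pid, cpu in busy_processes(SAMPLE):
        print(f"{command} (pid {pid}) is using {cpu:.0f}% CPU")
```

Note that `top` reports %CPU per core, so a value above 100 (such as 140 for useridd) means the process is keeping more than one core busy.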

55 REPLIES

I can get the management plane to calm down a bit by restarting useridd with the following command:

debug software restart user-id

Unsure how long it takes for useridd to get angry again after that.

SimasK, if I could moderate your post and mark it "+5 Insightful", I would :)

Well, you just answered my question - rolling back to 4.1.10 as I type.

> PA: How does this stuff get past the QA process?

More to the point - if it's a known issue which is being reported by lots of people, why do you have to log a fault to get access to the hotfix? Why doesn't PAN just release the hotfix for general distribution with a release note which specifies that it's only to fix the issue listed? This jumping through hoops to get fixes for known, impact-inducing bugs is extremely annoying.

And when I *did* log a case, the first thing I got back from the support partner was "We've escalated it to PAN for release of the hotfix, but why don't you update to 5.0.1 instead?"

And rolling back *again* after installing the "hotfix" 4.1.11-h1 because it bloody breaks the HA sync between my peers.

This is beyond a joke, Palo Alto. Does *nobody* QA these things in all possible environments before release?


Here's something of interest...

When using the User-ID agent on our DCs, management CPU was at 80-100% all the time.

I've now switched to agentless User-ID and bumped my cache setting to 90 minutes (our users don't move around that much).

Management CPU is now between 20-40% and the dataplane is at 15-25%.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

31971 root      20   0 12636 2536 2080 S    4  0.3   0:00.13 wmic

2361 root      20   0  346m 100m 2064 S    4 10.4  21:57.54 logrcvr

7033 root      20   0  238m  89m  63m S    1  9.2  47:50.18 useridd

31954 sfsadmin  20   0  4532 1176  912 R    1  0.1   0:00.14 top

31970 root      20   0  3832 1180 1040 S    1  0.1   0:00.02 sh

2108 root      15  -5 54416 4620 1080 S    0  0.5  23:00.03 sysd

2117 root      30  10 39884 3680 1720 S    0  0.4   3:02.84 python

    1 root      20   0  1836  560  536 S    0  0.1   0:02.25 init

Is this on version 4 or version 5?

I'm holding off on upgrading to V5 from past experience with Palo Alto's .0 release history! I won't upgrade to it until I hear from the learned denizens of these august forums that it appears stable.

OS version 5.0.2 on a PA-2050 in Active/Passive HA mode.

The issue seems to be isolated to the 5.0.2 code. People above have gone to 5.0.1, but there are other issues with that build.

No, it's definitely present in 4.1.11 (and the supposed "hotfix" 4.1.11-h1 breaks other things), and as far as I'm aware you can't do agentless User-ID on v4.
