01-10-2023 03:17 PM - edited 01-10-2023 03:18 PM
This is a recurring issue; a reboot helps for the time being.
When attempting to update to the latest antivirus version, we see that the commit fails.
System resources look normal.
Looking at /var/log/messages in the tech support file, we see that during various attempts the mgmtsrvr, devsrvr, and logrcvr processes were killed due to out-of-memory conditions, and every out-of-memory condition is followed by a stack of call traces (crash stack).
Attaching the output of messages:
messages | 2023-01-09 09:42:03 | Jan 9 09:42:03 3000 klogd: Out of memory: Kill process 2714 (devsrvr) score 120 or sacrifice child
messages | 2023-01-09 09:42:03 | Jan 9 09:42:03 3000 klogd: Out of memory: Kill process 16347 (mgmtsrvr) score 75 or sacrifice child
messages | 2023-01-09 09:44:47 | Jan 9 09:44:47 3000 klogd: Out of memory: Kill process 32707 (logrcvr) score 103 or sacrifice child
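For anyone who wants to see which processes the OOM killer hits most often across a larger messages file, here is a minimal Python sketch. It assumes the log lines follow exactly the format shown above; the function names and the SAMPLE list are only illustrative and are not part of any PAN-OS tooling.

```python
import re
from collections import Counter

# Matches the kernel OOM-killer line format shown in the excerpt above.
OOM_RE = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\) score (?P<score>\d+)"
)


def parse_oom_kills(lines):
    """Return a list of (pid, process_name, score) for every OOM-kill line."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match["pid"]), match["name"], int(match["score"])))
    return kills


def count_by_process(kills):
    """Count how many times each process name was killed."""
    return Counter(name for _, name, _ in kills)


# The three lines quoted in this post, used as sample input.
SAMPLE = [
    "Jan 9 09:42:03 3000 klogd: Out of memory: Kill process 2714 (devsrvr) score 120 or sacrifice child",
    "Jan 9 09:42:03 3000 klogd: Out of memory: Kill process 16347 (mgmtsrvr) score 75 or sacrifice child",
    "Jan 9 09:44:47 3000 klogd: Out of memory: Kill process 32707 (logrcvr) score 103 or sacrifice child",
]

for pid, name, score in parse_oom_kills(SAMPLE):
    print(f"{name} (pid {pid}) killed with score {score}")
```

To use it on a real file, pass the open file object instead of SAMPLE, since the function only needs an iterable of lines.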
The auto assistant tool points to a memory leak issue.
We restarted all three processes and tried to install the AV update again, but it failed.
We monitored the resource utilization again and saw that the httpd process was consuming around 130% of CPU, fluctuating up and down on a regular basis. After we restarted the web-server process, httpd consumption went down, we were able to commit the changes, and the AV install was successful.
We did the same on the secondary device, and httpd CPU consumption went down there as well.
The customer is running 9.1.15 (Preferred Release) on a PA-3250.
(For the PA-3000 Series, PAN-OS 9.1.x is the latest version, so in our scenario we are on the latest stable build, 9.1.15, for this device.)
We found the following memory leak issues that are already resolved, but our customer is already on 9.1.15 (Preferred Release):
PAN-175211: Fixed a memory leak issue in the mgmtsrvr process (mgmtsrvr process memory leak). Fixed in 9.0.16, 9.1.13, 10.0.9, 10.1.4.
PAN-93839: Linux kernels on PAN-OS 8.x/9.x have a memory leak that is being fixed in mainstream Linux. Fixed in 8.0.10 and 8.1.1.
PAN-143485: Fixed a memory leak issue related to a process (devsrvr) (device server memory leak). Fixed in 9.0.13, 9.1.8, 10.0.0.
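As a rough sanity check of the list above against the running release, here is a small Python sketch. It assumes that fixes are cumulative within a release train (same major.minor), which is my assumption and not something confirmed in this thread; when the running train is not listed for a bug, the check returns None because the list alone cannot answer it. The function names are illustrative only.

```python
def parse_version(text):
    """Turn '9.1.15' into (9, 1, 15); a hotfix suffix such as '-h1' is ignored."""
    base = text.strip().split("-")[0]
    return tuple(int(part) for part in base.split("."))


def includes_fix(running, fixed_in):
    """Return True/False if the running release train is listed, else None.

    Assumption: within one train (major.minor), a later maintenance
    release contains the fixes of the earlier ones.
    """
    run = parse_version(running)
    for candidate in fixed_in:
        fix = parse_version(candidate)
        if fix[:2] == run[:2]:
            return run >= fix
    return None


# Fix versions as quoted in this post.
KNOWN_FIXES = {
    "PAN-175211": ["9.0.16", "9.1.13", "10.0.9", "10.1.4"],
    "PAN-93839": ["8.0.10", "8.1.1"],
    "PAN-143485": ["9.0.13", "9.1.8", "10.0.0"],
}

for bug_id, versions in KNOWN_FIXES.items():
    print(bug_id, includes_fix("9.1.15", versions))
```

Under that assumption, a True result only means the listed fix should already be present in the running release, so the leak seen here would have to be something else; TAC would still need to confirm.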
Is this a bug, or is there anything else that can be done? Please advise.
Is it recommended to downgrade to a lower version? If yes, which version?
01-11-2023 01:40 AM
Hi @Param_Upadhyay ,
Memory leak sounds like a definite bug.
For a memory leak issue, I'd recommend grabbing the TSF and submitting it to support for analysis. TAC can confirm whether you're hitting a known bug and guide you to the version with the fix (if available).
I can't confirm if you're hitting any of the bugs listed.
Kind regards,
-Kiwi.
01-11-2023 01:49 AM
It would be better to upgrade to the latest 10.2.x version in case this bug is already fixed there. Otherwise, you can try to find which process causes the issue and, if it is not critical, simply restart it, and perhaps add automation with XSOAR or Ansible to trigger the restart each night until TAC finds the root cause.
Memory:
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClUb
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA14u000000oNDmCAM
Restart process:
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000PLUeCAO
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClaGCAS
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000POIHCA4
Ansible or XSOAR to periodically restart the process or the management plane:
https://paloaltonetworks.github.io/pan-os-ansible/modules/panos_op_module.html
https://xsoar.pan.dev/docs/reference/integrations/panorama
01-11-2023 06:43 AM
Thanks, but we are on a PA-3250, and for the PA-3000 Series PAN-OS 9.1.x is the latest version, so in our scenario we are on the latest stable build, 9.1.15, for this device. As advised, we will try to get TAC involved.
01-12-2023 07:38 AM
Still, if support takes too long, as a workaround you can install the free version of Ansible on Linux and trigger this task each night with a cron job, or just test the free version of Cortex XSOAR, as automation is the way to go nowadays.
https://start.paloaltonetworks.com/sign-up-for-community-edition.html
There is also free training:
https://www.youtube.com/watch?v=BhpkZA9t1HA&list=PLD6FJ8WNiIqUVEA2e5LZhmqNnwFcFhDTZ
01-24-2023 02:23 AM
Hi @Param_Upadhyay ,
Just to clarify: the PA-3250 belongs to the PA-3200 series, which is the next generation after the PA-3000.
You are probably confused by the "Hardware End-of-Life Dates - Palo Alto Networks" page, which lists only PA-3000 and not PA-3200. That is because end-of-life/end-of-sale has not yet been announced for PA-3200.
To summarize, your PA-3250 can support 10.1+ and you shouldn't have any issues upgrading to 10.1 or higher.
01-24-2023 05:58 AM
Thank you. That really helps.
01-31-2023 08:08 AM
It turns out that "debug..." commands can't be accessed through the API 🙂. So the workaround is the good old expect and ssh:
#!/usr/bin/expect -f
# File: expect_palo_alto
# Log in to the firewall over SSH and restart the web-server process.
set timeout 60
# Replace the xxxx placeholders with the real user, password, and firewall IP.
set USER xxxx
set PASS xxxx
set IP_AD xxxx
spawn ssh $USER@$IP_AD
match_max 100000
expect "*?assword:*"
send -- "$PASS\r"
sleep 2
# Wait for the operational-mode prompt (">") before sending each command.
expect "*>*"
send -- "set cli terminal width 200\r"
sleep 2
expect "*>*"
send -- "set cli scripting-mode on\r"
sleep 2
expect "*>*"
send -- "set cli terminal type xterm\r"
sleep 2
expect "*>*"
send -- "debug software restart process web-server\r"
# Wait for output starting with "Process" as confirmation of the restart.
expect "Process*"
sleep 1
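To run the script above every night from cron and still notice when it fails, a small wrapper can help. This is only a sketch: the paths are hypothetical, and treating the word "Process" in the output as success is my assumption, taken from the script's own `expect "Process*"` line.

```python
import subprocess
import sys


def run_restart(command, timeout_s=180):
    """Run the restart command and return (ok, combined_output)."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    except OSError as exc:
        # For example, the script path does not exist or is not executable.
        return False, str(exc)
    output = result.stdout + result.stderr
    # Assumption: the CLI confirmation contains the word "Process".
    ok = result.returncode == 0 and "Process" in output
    return ok, output


def main(argv):
    """argv[0] is the path to the expect script (hypothetical location)."""
    ok, output = run_restart([argv[0]])
    print(output)
    return 0 if ok else 1


# Only runs when a script path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

A crontab entry such as `0 3 * * * /usr/bin/python3 /opt/scripts/restart_web_server.py /opt/scripts/expect_palo_alto` (both paths are placeholders) would then trigger it at 03:00 each night, and a non-zero exit code shows that the restart was not confirmed.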
02-21-2023 10:53 PM
I am also facing the same problem.
01-09-2024 09:42 AM
We have a PA-3420 cluster running software 11.0.3. We seem to have this happening at least once a week or so. The memory leak causes the Active node to reboot in most cases.
We have been told that a fix is coming soon; we are just waiting for the new software release, which should have been out yesterday.
06-07-2024 06:52 AM
Hi Hussain,
Hope you are well. We are seeing the same behavior with a PA-3410 firewall. Which version of PAN-OS was recommended? Has it already been released? Did it solve the problem?
I hope you can help me with these questions, thank you very much.