03-20-2023 12:12 AM
Hi,
I know this is about old software on old hardware, but both are still supported by Palo Alto. Over the last few months we have been getting a large number of OOM messages, stack traces, and the like.
At the moment we aren't able to push new config changes from Panorama to either of the two devices. Every time, the commit is aborted by OOM activity, and usually the mgmtsrvr or devsrvr process is killed.
CPU: 0 PID: 4868 Comm: appweb3 Not tainted 3.10.88-8.1.22.1.58 #1
Hardware name: Intel Tionesta/Tionesta, BIOS 080014 09/21/2011
0000000000000000 ffff88017bda7c90 ffffffff8149786b ffff88017bda7cf0
ffffffff814957d5 0000000000000214 00000000000000a8 0000000000000000
0000000000000202 ffff88017bda7cf0 ffff8801fb386660 00000000000000fb
Call Trace:
[<ffffffff8149786b>] dump_stack+0x19/0x1b
[<ffffffff814957d5>] dump_header.isra.11+0x68/0x199
[<ffffffff810b2c3f>] oom_kill_process+0x62/0x345
[<ffffffff810ec3d6>] ? mem_cgroup_iter+0x1d3/0x1e5
[<ffffffff810eee26>] mem_cgroup_oom_synchronize+0x45b/0x472
[<ffffffff810ee5ed>] ? mem_cgroup_charge_common+0x83/0x83
[<ffffffff810b32e9>] pagefault_out_of_memory+0xe/0x4d
[<ffffffff81494957>] mm_fault_error+0x55/0xfc
[<ffffffff814a5871>] __do_page_fault+0x39d/0x490
[<ffffffff814a12cd>] ? __schedule+0x517/0x75e
[<ffffffff814a596d>] do_page_fault+0x9/0xb
[<ffffffff814a29a8>] page_fault+0x28/0x30
Memory cgroup out of memory: Kill process 9639 (devsrvr) score 251 or sacrifice child
Killed process 8737 (devsrvr) total-vm:1800448kB, anon-rss:1118260kB, file-rss:1 10044kB
snmpd invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=58
CPU: 2 PID: 4728 Comm: snmpd Not tainted 3.10.88-8.1.22.1.58 #1
Hardware name: Intel Tionesta/Tionesta, BIOS 080014 09/21/2011
0000000000000000 ffff88014cb2fc90 ffffffff8149786b ffff88014cb2fcf0
ffffffff814957d5 ffff880207f27400 ffff880207f27400 ffffffffffffff10
0000000000000202 ffff88014cb2fcf0 ffff8801818a3320 00000000000000fb
Memory cgroup out of memory: Kill process 8008 (mgmtsrvr) score 295 or sacrifice child
Killed process 4321 (mgmtsrvr) total-vm:2326896kB, anon-rss:762820kB, file-rss:828kB
sysdagent invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
CPU: 1 PID: 2816 Comm: sysdagent Not tainted 3.10.88-8.1.22.1.58 #1
Hardware name: Intel Tionesta/Tionesta, BIOS 080014 09/21/2011
0000000000000000 ffff88010920bc90 ffffffff8149786b ffff88010920bcf0
ffffffff814957d5 0000000000000010 0000000000000212 ffff88010920bcc0
0000000000000202 ffff88010920bcf0 ffff88009299acc0 0000000000000128
Call Trace:
[<ffffffff8149786b>] dump_stack+0x19/0x1b
[<ffffffff814957d5>] dump_header.isra.11+0x68/0x199
[<ffffffff810b2c3f>] oom_kill_process+0x62/0x345
[<ffffffff810ec3d6>] ? mem_cgroup_iter+0x1d3/0x1e5
[<ffffffff810eee26>] mem_cgroup_oom_synchronize+0x45b/0x472
[<ffffffff810ee5ed>] ? mem_cgroup_charge_common+0x83/0x83
[<ffffffff810b32e9>] pagefault_out_of_memory+0xe/0x4d
[<ffffffff81494957>] mm_fault_error+0x55/0xfc
[<ffffffff814a5871>] __do_page_fault+0x39d/0x490
[<ffffffff81124a05>] ? inotify_read+0x243/0x26a
[<ffffffff814a596d>] do_page_fault+0x9/0xb
[<ffffffff814a29a8>] page_fault+0x28/0x30
Memory cgroup out of memory: Kill process 4351 (mgmtsrvr) score 296 or sacrifice child
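For anyone triaging similar logs, here is a minimal Python sketch that counts how often each process is killed and records its largest `anon-rss`. It only assumes the `Killed process ...` line format seen in the excerpts above; the function name `summarize_oom_kills` is made up for illustration and is not part of PAN-OS.

```python
import re
from collections import Counter

# Matches kernel lines such as:
# "Killed process 4321 (mgmtsrvr) total-vm:2326896kB, anon-rss:762820kB, file-rss:828kB"
# Only the fields up to anon-rss are parsed; anything after that is ignored.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def summarize_oom_kills(log_text):
    """Return (kill count per process name, largest anon-rss in kB per name)."""
    counts = Counter()
    peak_rss = {}
    for line in log_text.splitlines():
        match = KILLED_RE.search(line)
        if not match:
            continue  # not an OOM kill line
        name = match.group("name")
        counts[name] += 1
        peak_rss[name] = max(peak_rss.get(name, 0), int(match.group("anon_rss")))
    return counts, peak_rss


if __name__ == "__main__":
    sample = (
        "Killed process 4321 (mgmtsrvr) total-vm:2326896kB, "
        "anon-rss:762820kB, file-rss:828kB\n"
    )
    counts, peak = summarize_oom_kills(sample)
    for name, count in counts.items():
        print(f"{name}: killed {count}x, peak anon-rss {peak[name]} kB")
```

Feeding it the full system log would show whether the kills are spread across processes or concentrated on `mgmtsrvr` and `devsrvr`.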
As far as I understand the PAN-OS documentation, this seems to be a general bug in the 8.x and 9.0 release train.
What I would like to know (as we are getting new appliances within the next 3 months):
-> Could we install more RAM in the devices (we don't care about the device warranty)?
-> Which PAN-OS version is the most stable in 8.1.x (maybe a TAC-recommended version)?
Best regards,
Florian
03-20-2023 06:16 AM
8.1.24 is presently the last maintenance release for 8.1 and is currently the recommended version for devices stuck on that release. The PA-500 and M-100 were the only devices I'm aware of with actual upgrade kits. You'd have to open up the box in question to determine if you have available memory slots to upgrade anything, but I'd be more worried about PAN-OS actually utilizing the extra memory once installed. It's quite possible that you'd see absolutely no benefit.
03-22-2023 07:10 AM
You can try scheduling restarts of the management server and web server processes as a workaround. I have even made automation for this:
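The automation referred to in this reply is not reproduced in the thread. As a rough sketch only, the restart idea could look like the Python below. The two strings are the PAN-OS operational CLI commands for restarting these processes; `send_command` is a hypothetical caller-supplied function (for example a wrapper around an SSH session) and the scheduling itself (cron or similar) is left out.

```python
# PAN-OS operational CLI commands for the two processes.
# The management server goes last, since restarting it may drop the CLI session.
RESTART_COMMANDS = [
    "debug software restart process web-server",
    "debug software restart process management-server",
]


def restart_management_processes(send_command, commands=RESTART_COMMANDS):
    """Run each restart command through `send_command` and collect the results.

    `send_command` is an assumption of this sketch: any callable that takes one
    CLI command string, runs it on the firewall, and returns its output.
    """
    results = {}
    for command in commands:
        results[command] = send_command(command)
    return results


if __name__ == "__main__":
    # Dry run: print the commands instead of contacting a device.
    restart_management_processes(lambda command: print("would run:", command))
```

Keeping the transport as a parameter means the same function can be exercised in a dry run before it is pointed at a production firewall.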