High dataplane CPU


L2 Linker


Over the last few days we have been experiencing high dataplane CPU on all NPCs simultaneously, specifically in flow_ctrl. The flow_ctrl process normally sits at 3-10% CPU, but suddenly every NPC ("DP slot x", dp 0 and 1) climbs to 30, 50, 80, then 100% and stays there for 30-60 minutes, during which the firewall is effectively down. Overall load is very low for a PA-7050 (<10%). Clearing all sessions restores the core #2 CPU to normal; clearing only ICMP, TCP, UDP, or ESP sessions individually doesn't help, and no other protocols are active.
Running "show running resource-monitor ingress-backlogs" and clearing the sessions it reports doesn't restore the CPU either.
The interfaces are full layer-3, directly connected to the network (no L2/ARP/etc. to consider).
PAN-OS is 9.1.13-h3; upgrading to 9.1.14 made no difference.
Any suggestions?
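
For reference, this is roughly the sequence I've been running (PAN-OS 9.1 syntax from memory, so double-check it on your release; protocol numbers 1/6/17/50 map to ICMP/TCP/UDP/ESP):

```
> show running resource-monitor second last 60    # per-core dataplane CPU on each DP
> show running resource-monitor ingress-backlogs  # sessions backing up the ingress queues

> clear session all filter protocol 1             # ICMP only - no effect
> clear session all filter protocol 6             # TCP only  - no effect
> clear session all filter protocol 17            # UDP only  - no effect
> clear session all filter protocol 50            # ESP only  - no effect

> clear session all                               # everything - CPU returns to normal
```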


Accepted Solutions

L2 Linker

This turned out to be an issue with a BGP peer that suddenly started sending us 15K routes, which put us over the PA-7050's 64K-route limit. It was very difficult to figure out, as there was no indication in any logs except a message we eventually discovered about the TCAM being full, which pointed us in the right direction. This could have been resolved easily had there been a single system log message about the route table being full.
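
For anyone who runs into the same thing, these are the kinds of checks that would have surfaced it sooner (9.1 syntax; double-check on your release):

```
> show routing summary                        # total routes per virtual router - compare against platform capacity
> show routing protocol bgp summary           # per-peer prefix counts - look for a sudden jump from one peer
> show log system direction equal backward    # the only clue for us was a TCAM-full message in here
```

As a guard going forward, the Max Prefixes setting on the BGP peer (Network > Virtual Routers > BGP) can cap how many routes a misbehaving peer is allowed to inject.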


3 REPLIES

Cyber Elite

If you run `show running resource-monitor`, what do the packet descriptors look like?

If they're going up to 100%, I may have seen a similar issue where a UDP flood was consuming all the packet buffers. Enabling packet buffer protection and tightening the zone protection profile fixed it.
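
Something along these lines shows the descriptor usage; the output below is illustrative, the zone name "untrust" is a placeholder, and the exact packet-buffer-protection config path can vary by release, so treat it as a sketch:

```
> show running resource-monitor second last 1
...
packet descriptor (average)           98%   <-- sustained near 100% points at buffer exhaustion
packet descriptor (on-chip) (average) 99%

> configure
# set zone untrust network enable-packet-buffer-protection yes
# commit
```

The global alert/activate thresholds for packet buffer protection live under Device > Setup > Session.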

Tom Piens
PANgurus - Strata specialist; config reviews, policy optimization

Cyber Elite

Hello,

This could be a bug; either way, a support case would be in order.

Regards,
