Understanding some counters from pow performance during high CPU troubleshooting


L1 Bithead

Hi all,

 

I was troubleshooting high CPU utilization on one of our customers' PA-5220 firewalls, based on this KB article:

 

https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000CmV2CAK

 

It gives a pretty good explanation of how to interpret these outputs to determine what is utilizing the CPU. I calculated which functions use the most CPU time, and here are the results (each number is count * average us):
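For anyone repeating the calculation, here is a minimal Python sketch of the count * average us arithmetic. It assumes you have already copied each function's count and average microseconds out of the pow performance output by hand; the `FuncSample` type, the function names and the numbers are placeholders, not values from this firewall.

```python
from typing import Iterable, List, NamedTuple, Tuple


class FuncSample(NamedTuple):
    """One function's figures, copied by hand from the output (hypothetical type)."""
    name: str
    count: int
    avg_us: float


def rank_by_total_us(samples: Iterable[FuncSample]) -> List[Tuple[str, float]]:
    """Return (name, count * avg_us) pairs, largest total first."""
    totals = [(s.name, s.count * s.avg_us) for s in samples]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


# Placeholder values for illustration only.
samples = [
    FuncSample("func_a", count=1000, avg_us=2.5),
    FuncSample("func_b", count=200, avg_us=40.0),
]

for name, total in rank_by_total_us(samples):
    print(f"{name:<10} {total:>12.0f}")
```

A function that is called rarely but is slow per call can still come out ahead of one that is called often, which is why the product is more useful than either column alone.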

 

:appid_match        | 255063764
:policy_lookup      | 552187101
:regex_lookup       | 1347856444
:sml_vm             | 2699556048
:zip_deflate        | 56840023
:ctd_token          | 2711657233
session_ager        | 144978680
:pbp_buf_latency    | 222295817
:mi_aho_sw_offload  | 8123266264
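To make the proportions easier to see, here is a rough Python sketch that turns the totals above into each function's share of the listed sum. The shares are relative to these nine entries only, not to everything the dataplane does, so treat them as a ranking aid rather than a CPU percentage.

```python
# Totals (count * average us) exactly as posted above.
totals_us = {
    "appid_match": 255063764,
    "policy_lookup": 552187101,
    "regex_lookup": 1347856444,
    "sml_vm": 2699556048,
    "zip_deflate": 56840023,
    "ctd_token": 2711657233,
    "session_ager": 144978680,
    "pbp_buf_latency": 222295817,
    "mi_aho_sw_offload": 8123266264,
}

listed_sum = sum(totals_us.values())

# Share of the listed sum per function, largest first.
shares = sorted(
    ((name, value / listed_sum) for name, value in totals_us.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, share in shares:
    print(f"{name:<20} {share:6.1%}")
```

With these numbers, mi_aho_sw_offload accounts for roughly half of the listed time, and ctd_token and sml_vm follow at nearly equal shares.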

Some of these functions are explained in that KB article, but I couldn't find any information about the others. :mi_aho_sw_offload seems to consume the most resources, and as I understand from the counter name, it is a software offload. Based on the article below, starting from version 9.1 AHO and DFA are done in software by default, and it is not recommended to disable this unless TAC recommends doing so (as always, there are some design changes).

 

https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000PLsRCAW

 

But what is :sml_vm? Based on the calculations, it is also consuming a lot of CPU resources.

 

I hate to troubleshoot high CPU utilization, but if I had more understanding of these processes it would be much easier to determine the next step.

1 REPLY

Community Team Member

Hi @FarkhanAliyev ,

 

ctd_token and sml_vm are where the content decoders run. A big chunk of the L7 inspection happens there. If you are seeing a proportionately very high total_us for these functions, that's just because these functions have to do quite a bit of work. You will see them at the top on any firewall where most traffic goes through L7 inspection. In other words, an increase in total_us for these functions should directly correlate with an increase in pkt_recv and ctd_pkt_slowpath. More traffic = more inspection.
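One way to check that correlation is to take several snapshots over time and compare how much pkt_recv grew between snapshots with how much total_us grew for ctd_token or sml_vm. The sketch below computes a plain Pearson correlation in Python; the two lists are made-up placeholder deltas, and collecting the real values from the firewall is left to you.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Placeholder deltas between consecutive snapshots (not real firewall data).
pkt_recv_delta = [100, 200, 300, 400]
total_us_delta = [1000, 2100, 2900, 4200]

r = pearson(pkt_recv_delta, total_us_delta)
print(f"correlation: {r:.3f}")
```

A value close to 1 would fit the "more traffic = more inspection" picture. A much lower value would suggest the inspection cost is growing for some reason other than packet volume.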

 

While it's not recommended to keep it in place, I put an App Override in place for the flows that were being classified as SSL and web-browsing, and CPU utilization dropped dramatically from 99% to 50%.

 

Kind regards,

-Kim.
