100% DP CPU utilization!


L4 Transporter

Hi,

 

I have one client whose firewall suddenly started showing high dataplane (DP) CPU utilization; it stays at this level throughout working hours. Most of the traffic passing through the firewall is SOAP, SSL, and web-browsing, and the volume is large: nearly 5 GB per hour for those applications alone.

He didn't make any configuration changes. He is using a DoS policy, but I don't see how it could be driving the DP utilization, because it has been in place from the beginning.

 

Have a look at these crazy rates:

 

> show running resource-monitor minute last 60


Resource monitoring sampling data (per minute):

CPU load (%) during last 60 minutes:
core 0 1 2 3
avg max avg max avg max avg max
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 88 100 88 100 88 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100
* * 100 100 100 100 100 100

Resource utilization (%) during last 60 minutes:
session (average):
7 7 7 7 7 7 7 7 8 7 7 7 7 7 7
7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
7 7 7 7 8 8 7 7 7 7 6 7 7 7 7
7 7 7 7 7 7 7 7 7 7 7 7 7 7 7

session (maximum):
7 7 7 7 8 7 7 8 8 7 7 7 7 8 8
8 8 7 7 7 7 7 7 7 7 7 7 8 7 7
7 7 7 7 9 9 8 7 7 7 7 7 7 7 8
7 7 7 7 7 8 7 7 7 7 8 8 7 7 7

packet buffer (average):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

packet buffer (maximum):
1 1 0 1 1 1 0 0 1 0 1 1 1 1 1
1 1 1 1 1 1 1 0 1 1 1 1 1 2 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 0 1 1 1 1 1 1 1 1 1

packet descriptor (average):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

packet descriptor (maximum):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

packet descriptor (on-chip) (average):
41 41 34 54 41 37 38 32 44 42 50 43 43 48 41
43 59 69 50 37 47 41 42 41 50 54 52 53 40 46
41 42 41 51 50 48 53 53 53 50 49 44 51 48 46
49 63 54 44 59 42 38 43 44 45 47 46 56 39 43

packet descriptor (on-chip) (maximum):
75 75 53 74 66 81 61 52 67 78 85 73 81 77 61
76 81 87 81 68 73 79 66 78 75 84 78 87 87 83
67 83 72 83 85 74 86 81 83 71 84 77 78 73 68
73 86 89 83 86 62 80 89 77 77 85 76 81 79 72

CPU load (%) during last 15 minutes:
core    0       1       2       3   
     avg max avg max avg max avg max
       *   *  99 100  99 100  99 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   *  99 100  99 100  99 100
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100

Resource utilization (%) during last 15 minutes:
session (average):
  7   7   7   7   7   7   7   7   7   7   7   7   7   7   7

session (maximum):
  8   8   8   7   7   7   8   8   7   7   7   7   7   7   7

packet buffer (average):
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

packet buffer (maximum):
  1   0   1   1   1   1   1   1   1   1   1   1   1   1   1

packet descriptor (average):
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

packet descriptor (maximum):
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

packet descriptor (on-chip) (average):
 45  44  52  62  59  62  52  57  49  43  49  60  41  48  50

packet descriptor (on-chip) (maximum):
 83  67  86  86  85  86  83  86  74  81  78  85  83  81  85

 

Resource monitoring sampling data (per hour):

CPU load (%) during last 8 hours:
core    0       1       2       3   
     avg max avg max avg max avg max
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   *  99 100  99 100  99 100
       *   *  51  99  51  99  51  99
       *   *  15  55  14  53  14  55
       *   *   9  47   7  17   7  20
       *   *   7  11   6   9   6   9
       *   *   7  11   6  18   5  19

Resource utilization (%) during last 8 hours:
session (average):
  7   7   6   1   1   1   1   1
session (maximum):
  8   9   7   3   1   1   1   1
packet buffer (average):
  0   0   0   0   0   0   0   0
packet buffer (maximum):
  2   1   2   0   0   0   0   0
packet descriptor (average):
  0   0   0   0   0   0   0   0
packet descriptor (maximum):
  1   0   0   0   0   0   0   0
packet descriptor (on-chip) (average):
 46  46  40   3   2   2   2   2
packet descriptor (on-chip) (maximum):
 88  89  87  11   8   6   3   3
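For anyone who wants to spot the saturated periods without reading the table by eye, here is a minimal Python sketch that parses the per-hour CPU rows above and averages the "avg" column across the cores that report a value. The sample text is copied from the output above; the function names and the 90% cut-off are only illustrative assumptions, not anything PAN-OS defines.

```python
# Rows copied from the "CPU load (%) during last 8 hours" table above.
SAMPLE = """\
       *   * 100 100 100 100 100 100
       *   * 100 100 100 100 100 100
       *   *  99 100  99 100  99 100
       *   *  51  99  51  99  51  99
       *   *  15  55  14  53  14  55
       *   *   9  47   7  17   7  20
       *   *   7  11   6   9   6   9
       *   *   7  11   6  18   5  19
"""


def parse_cpu_rows(text):
    """Return one list of (avg, max) pairs per data row.

    Cores shown as '*' are skipped, and so are header lines or any
    other line that does not consist of avg/max number pairs.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or len(fields) % 2:
            continue
        try:
            pairs = [
                (int(avg), int(peak))
                for avg, peak in zip(fields[0::2], fields[1::2])
                if avg != "*"
            ]
        except ValueError:
            continue  # not a data row (for example the "avg max" header)
        if pairs:
            rows.append(pairs)
    return rows


def mean_avg_load(row):
    """Mean of the 'avg' values across the cores in one row."""
    return sum(avg for avg, _ in row) / len(row)


rows = parse_cpu_rows(SAMPLE)
# 90% is an arbitrary threshold chosen for this illustration.
saturated = [i for i, row in enumerate(rows) if mean_avg_load(row) >= 90]

for i, row in enumerate(rows):
    print(f"row {i}: mean avg load {mean_avg_load(row):.1f}%")
print("saturated rows:", saturated)
```

On this sample the first three rows are flagged, which lines up with the jump in session and on-chip packet descriptor utilization in the same three columns of the hourly output.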

 

Session settings are at their defaults:

> show session info

target-dp:                                       *.dp0
--------------------------------------------------------------------------------
Number of sessions supported:                    65534
Number of active sessions:                       4691
Number of active TCP sessions:                   4628
Number of active UDP sessions:                   63
Number of active ICMP sessions:                  0
Number of active BCAST sessions:                 0
Number of active MCAST sessions:                 0
Number of active predict sessions:               2
Session table utilization:                       7%
Number of sessions created since bootup:         7926762
Packet rate:                                     6284/s
Throughput:                                      15814 kbps
New connection establish rate:                   237 cps
--------------------------------------------------------------------------------
Session timeout
  TCP default timeout:                           3600 secs
  TCP session timeout before SYN-ACK received:      5 secs
  TCP session timeout before 3-way handshaking:    10 secs
  TCP half-closed session timeout:                120 secs
  TCP session timeout in TIME_WAIT:                15 secs
  TCP session timeout for unverified RST:          30 secs
  UDP default timeout:                             30 secs
  ICMP default timeout:                             6 secs
  other IP default timeout:                        30 secs
  Captive Portal session timeout:                  30 secs
  Session timeout in discard state:
    TCP: 90 secs, UDP: 60 secs, other IP protocols: 60 secs
--------------------------------------------------------------------------------
Session accelerated aging:                       True
  Accelerated aging threshold:                   80% of utilization
  Scaling factor:                                2 X
--------------------------------------------------------------------------------
Session setup
  TCP - reject non-SYN first packet:             True
  Hardware session offloading:                   True
  IPv6 firewalling:                              True
  Strict TCP/IP checksum:                        True
  ICMP Unreachable Packet Rate:                  200 pps
--------------------------------------------------------------------------------
Application trickling scan parameters:
  Timeout to determine application trickling:    10 secs
  Resource utilization threshold to start scan:  80%
  Scan scaling factor over regular aging:        8
--------------------------------------------------------------------------------
Session behavior when resource limit is reached: drop
--------------------------------------------------------------------------------
Pcap token bucket rate                         : 10485760
--------------------------------------------------------------------------------
Max pending queued mcast packets per session   : 0
--------------------------------------------------------------------------------

admin@fw-alseef-pa-500> debug dataplane pool statistics

Hardware Pools
[ 0] Packet Buffers            :    57316/57344    0x8000000410000000
[ 1] Work Queue Entries        :   229353/229376   0x8000000417000000
[ 2] Output Buffers            :     1007/1024     0x8000000418c00000
[ 3] DFA Result                :     2045/2048     0x8000000418d00000
[ 4] Timer Buffers             :     4096/4096     0x8000000418f00000
[ 5] PAN_FPA_LWM_POOL          :     1024/1024     0x8000000419300000
[ 6] PAN_FPA_ZIP_POOL          :     1024/1024     0x8000000419340000
[ 7] PAN_FPA_BLAST_PO          :     1024/1024     0x8000000419540000

Software Pools
[ 0] software packet buffer 0  (  512):    16378/16384    0x8000000024821680
[ 1] software packet buffer 1  ( 1024):     8189/8192     0x8000000025031780
[ 2] software packet buffer 2  ( 2048):    16384/16384    0x8000000025839880
[ 3] software packet buffer 3  (33280):     4096/4096     0x8000000027849980
[ 4] software packet buffer 4  (66048):      304/304      0x800000002fa4da80
[ 5] Shared Pool 24            (   24):   166253/170000   0x8000000030d75780
[ 6] Shared Pool 32            (   32):    66917/70000    0x80000000311ffa80
[ 7] Shared Pool 40            (   40):    40000/40000    0x8000000031466f80
[ 8] Shared Pool 192           (  192):   286752/290000   0x8000000031614b80
[ 9] Shared Pool 256           (  256):   139999/140000   0x8000000034c49c00
[10] CTD AV Block              ( 1024):       32/32       0x800000000fef4380
[11] Regex Results             (11544):     2048/2048     0x8000000056d34100
[12] SSH Handshake State       ( 6512):       16/16       0x8000000059f4e680
[13] SSH State                 ( 3200):      128/128      0x8000000059f67f00
[14] TCP host connections      (  176):       15/16       0x8000000059fcc300
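A quick way to check whether any of these pools is running low is to parse the `N/M` column. The sketch below is a rough Python example under one stated assumption: that the left number is the count of free entries and the right number is the pool size. The regular expression, the function names, and the 20% threshold are illustrative choices, and only a few lines of the output above are included as sample data.

```python
import re

# A few lines copied from the pool statistics output above.
SAMPLE = """\
[ 0] Packet Buffers            :    57316/57344    0x8000000410000000
[ 1] Work Queue Entries        :   229353/229376   0x8000000417000000
[ 5] Shared Pool 24            (   24):   166253/170000   0x8000000030d75780
[14] TCP host connections      (  176):       15/16       0x8000000059fcc300
"""

# Matches "[ idx] name  (optional size):  free/total  address".
POOL_RE = re.compile(
    r"^\[\s*\d+\]\s+(?P<name>.+?)\s*(?:\(\s*\d+\))?\s*:"
    r"\s+(?P<free>\d+)/(?P<total>\d+)\s"
)


def parse_pools(text):
    """Return (name, free, total) tuples for every pool line found."""
    pools = []
    for line in text.splitlines():
        m = POOL_RE.match(line)
        if m:
            pools.append((m["name"], int(m["free"]), int(m["total"])))
    return pools


def low_pools(pools, min_free_fraction=0.2):
    """Names of pools whose free fraction is below the threshold.

    Assumes the first number is free entries; 0.2 is an arbitrary cut-off.
    """
    return [
        name
        for name, free, total in pools
        if total and free / total < min_free_fraction
    ]


pools = parse_pools(SAMPLE)
for name, free, total in pools:
    print(f"{name:<24} {free / total:6.1%} free")
print("pools below threshold:", low_pools(pools))
```

If that reading of the columns is right, none of the sampled pools is close to depletion, which fits with the packet buffer and packet descriptor figures staying near zero in the resource monitor output.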

 

I couldn't find any root cause for this from my side.

 

Does anyone have a better way to troubleshoot this?

 

Regards,

Sharief

1 accepted solution

Accepted Solutions

Hi,

 

The issue was escalated to TAC support. After several days of working on it, we created an application override for the SOAP and web-browsing applications that were consuming our bandwidth, and it worked: DP CPU came back down to 15%.

 

Regards,
Sharief


8 REPLIES

L6 Presenter

Did the amount of traffic suddenly increase?

Yes. The client confirmed that traffic has increased due to a business requirement.

Regards,
Sharief

In that case check the actual throughput and compare it to your platform limits. You might need a bigger box.

Hi santonic,

 

Do you mean I should compare the Throughput value shown in the session info with the client's real-life throughput?

 

Regards,

Sharief


I mean get an estimate of all traffic flowing through your FW and compare it to the device specification (for example, the PA-500 has 100 Mbps threat prevention throughput).

Yes, session info will give you the current throughput. But to track it over a longer period of time, it would be a good idea to use a tool like Cacti or whatever monitoring tool the company is already using.
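As a rough illustration of that comparison, the Python sketch below pulls the Throughput line out of the `show session info` output posted earlier and sets it against the 100 Mbps figure quoted in this reply. It also converts the "5 GB per hour" from the original post into an average rate, assuming decimal gigabytes. The function names and the limit constant are assumptions made for the example; the real limit should come from the platform datasheet.

```python
import re

# Lines copied from the "show session info" output earlier in the thread.
SESSION_INFO = """\
Packet rate:                                     6284/s
Throughput:                                      15814 kbps
New connection establish rate:                   237 cps
"""

# Threat prevention figure quoted in the reply above; verify it against
# the datasheet for the actual platform and PAN-OS release.
PLATFORM_LIMIT_MBPS = 100


def throughput_mbps(text):
    """Extract the Throughput value (reported in kbps) and return Mbps."""
    m = re.search(r"^Throughput:\s+(\d+)\s+kbps", text, re.MULTILINE)
    if m is None:
        raise ValueError("no Throughput line found")
    return int(m.group(1)) / 1000


def gb_per_hour_to_mbps(gigabytes):
    """Average rate in Mbps for a volume in decimal GB moved in one hour."""
    return gigabytes * 8 * 1000 / 3600


current = throughput_mbps(SESSION_INFO)
print(f"current: {current:.1f} Mbps "
      f"({current / PLATFORM_LIMIT_MBPS:.0%} of the quoted limit)")
print(f"5 GB/hour is about {gb_per_hour_to_mbps(5):.1f} Mbps on average")
```

Both numbers are a single snapshot and an hourly average, so they can hide short bursts; that is one reason to graph throughput over time with a monitoring tool rather than rely on one reading.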

 


Didn't the app override stop threat prevention for those applications?

Yes, but these are internal servers and we can trust them.

Client-to-server traffic is still inspected, though.

Regards,
Sharief