We recently deployed several PA-5250s running 10.1.3, and there is an issue that randomly comes and goes.
Latency for traffic going through the firewalls spikes to 100-500ms. I was able to capture one thing that looked peculiar: the flow_fpga_ingress_exception_err count was high (8169388322) and so was its rate (12468). But I can't seem to find a good definition of what this would indicate.
I also caught the packet descriptor (on-chip) (average) showing 100 across the first two rows.
I failed to capture the CPU Cores at the same time though.
You may be able to get more context looking at global counters: show counter global | match fpga
I would ask the same question, since you are currently running the preferred/stable version of 10.1.
However, instead of questioning you, I hope they are providing a solution as you seem to already have a case with them.
I am curious to hear what their solution was, if they provide one.
Well, right now they have told us that high flow_fpga_ingress_exception_err counts are expected behavior and not to worry about them. As for the latency, we are just shot-gunning a few changes to see if anything helps, like reducing the port channel down to one link, possibly disabling offloading, and a couple of others. The last resort is a downgrade to the preferred 9 code. I will let you know if I find anything.
The reason we suggest the downgrade is that we have one 5220 running 9 code and it doesn't experience this issue. That's all we've got though.
Any idea if there's any asymmetric routing going on?
A packet capture combined with global counters may shed some light on this. If you manage to narrow this down to a sample source and destination, that would be perfect. Then see if you get a drop pcap, and use the pcap filters against the global counters.
I am sorry you are going through this, I am sure you'll find the solution.
At this point, since you are already at the pcap stage, I would run packet diagnostics with flow basic and look, at a low level, at what the firewall is doing with each packet/session in the flow logic.
I bet you are familiar with that or already tried it, but if not, below is a good read:
My approach for reading these is different: I get a TSF, find the txt file in there, and open it in Notepad++.
We're having a very similar issue on our 5220 (PAN-OS 10.1.4). The latency comes and goes. CPU / memory usage is close to nothing, and the same goes for session utilization. However, every few seconds the flow_fpga_ingress_exception_err counter rises. The delta shows 50 more drops in one second, then 3000 in the next.
There's a strange thing I noticed. We gather metrics with Prometheus (never mind the software) and monitor ifHCOutOctets and ifHCInOctets via SNMP. We monitor both the firewall interfaces and the (Cisco) switch ports they're connected to. We're using the same formula for the bandwidth calculation on both and get massive differences. On the switch port, our graphs show the nightly backups consuming the whole 1 Gbit of bandwidth, while in the same time period the matching firewall interface shows only ~700 Mbit/sec. It's the two ends of the same wire!
I'm not saying it's related to flow_fpga_ingress_exception_err but packet drops (a few thousand per few secs) could explain the difference between the two measured values.
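For anyone who wants to repeat the comparison, here is a minimal sketch of the usual way to turn two ifHCInOctets / ifHCOutOctets samples into a rate. The original formula was not posted, so this is an assumption about what it looks like. The function names, the sample values, and the 1500-byte average packet size are all made up for illustration. The last helper converts the gap between the two ends into a rough packets-per-second figure, so it can be held against the drop counter's delta.

```python
# Sketch only: names and sample values are hypothetical, not from the thread.

COUNTER64_MODULUS = 2 ** 64  # ifHC*Octets are 64-bit counters


def counter_delta(previous, current, modulus=COUNTER64_MODULUS):
    """Difference between two counter samples, tolerating a single wrap."""
    return (current - previous) % modulus


def bits_per_second(prev_octets, curr_octets, interval_seconds):
    """Average rate in bit/s between two octet-counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(prev_octets, curr_octets) * 8 / interval_seconds


def shortfall_as_packets_per_second(switch_bps, firewall_bps, avg_packet_bytes=1500):
    """Packets per second that would account for the gap between both ends.

    avg_packet_bytes is an assumption; backup traffic is often near full MTU.
    """
    return (switch_bps - firewall_bps) / (avg_packet_bytes * 8)


# Hypothetical samples taken 60 seconds apart on each end of the link.
switch_bps = bits_per_second(0, 7_500_000_000, 60)
firewall_bps = bits_per_second(0, 5_250_000_000, 60)
missing_pps = shortfall_as_packets_per_second(switch_bps, firewall_bps)

print(f"switch:   {switch_bps / 1e6:.0f} Mbit/s")
print(f"firewall: {firewall_bps / 1e6:.0f} Mbit/s")
print(f"gap is roughly {missing_pps:.0f} packets/s at 1500 bytes each")
```

With these made-up samples the two ends come out at 1000 and 700 Mbit/s, and the gap works out to about 25,000 full-size packets per second. The real counter deltas and polling interval would need to be plugged in before drawing any conclusion.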
So just a quick update. The issue seemed to be related to the number of sessions we were getting through the firewall. We handle a large number of sessions for one particular protocol, and it is our number one application by session count each day. When we put an App-ID Override on the protocol, it appears to have cleared up the latency.
I am now skeptical of Palo's sessions-per-second capability, but if you look at the datasheets, they always show the max session/s count using 1 byte HTTP traffic with App-ID override. So it is what it is. Good luck out there.