The documentation says to measure CPS to create a baseline. I have done this for the last 20 days, collecting CLI output every 3 seconds, and I have Panorama data to back it up. Below is the last 7 days of data: CPS never peaked beyond 20K and on average is below 4K.
We use SYN cookies with the Max Threshold at 35K and Activate at zero, so it should only start dropping packets once CPS reaches 35K. Yet users randomly had issues because zone flood protection kicked in, although CPS never went beyond 35K.
To stop this from affecting users I increased Max to 65K and Activate to 4.5K, and haven't seen the issue since.
Support points to flow_dos_syncookie_blk_dur and flow_dos_syncookie_max and says SYN Cookie reached the Max Threshold at the time of the outage, and that CPS does not show it reaching the max because during that time the sessions are not being created in the session table. But these are global counters that apply to all zones, not just the single zone I want to protect.
OK, I can tell from the logs which zone might have caused it at that time, but then CPS is not the right counter to measure, since some other counter is increasing and taking us down. Also, what is the point of measuring global counters when we need to configure settings for a single zone?
All of our zones use SYN cookies and have separate profiles. Am I missing something here?
no you're right, cps is a pretty inaccurate counter when applied as an umbrella measuring tool to decide SYN cookie thresholds
first, it's a global counter so your view will be skewed; secondly, it only measures allowed connections per second: all the garbage flooding in from the internet is not going to create sessions, so it is discarded immediately rather than put in the connection queue
a better approach would be to actually enable SYN cookies but set the ceiling incredibly high, then monitor the rate for flow_dos_syncookie_cookie_sent (show counter global filter aspect dos) and you'll know exactly how many cookies you send out globally (if it's just for the one zone, you know what you need to know)
if you have multiple zones already set to send cookies, build a packet-diag filter for the appropriate ingress interface, and filter the global counters
debug dataplane packet-diag set filter match ingress-interface ethernet1/1
debug dataplane packet-diag set filter on
show counter global filter delta yes packet-filter yes aspect dos
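If you capture that output to a file, something like the following could pull the value and rate columns out of it. This is only a rough sketch: the SAMPLE text is made up to resemble the usual name/value/rate/severity/category/aspect layout (the numbers and descriptions are placeholders, not real output), and parse_counters is a hypothetical helper, not anything PAN-OS provides.

```python
import re

# Made-up sample shaped like "show counter global" output.
# Values, severities and descriptions are placeholders, not real data.
SAMPLE = """\
name                                   value     rate severity  category  aspect    description
--------------------------------------------------------------------------------
flow_dos_syncookie_cookie_sent          1200      400 info      flow      dos       placeholder
flow_dos_syncookie_max                    30       10 drop      flow      dos       placeholder
"""

# A counter row starts with the counter name followed by two integers.
# The header and separator rows do not match this pattern and are skipped.
LINE_RE = re.compile(r"^(?P<name>[a-z0-9_]+)\s+(?P<value>\d+)\s+(?P<rate>\d+)\s+\S+")


def parse_counters(text):
    """Return {counter_name: (value, rate)} for every counter row in text."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counters[match.group("name")] = (
                int(match.group("value")),
                int(match.group("rate")),
            )
    return counters


if __name__ == "__main__":
    for name, (value, rate) in parse_counters(SAMPLE).items():
        print(f"{name}: value={value} rate={rate}/s")
```

Check the column layout against what your own firewall prints before relying on the regex.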
@reaper Thanks for the explanation. I did not receive a notification for your response, but I followed your advice when I saw it a few days ago. I started logging this every 2 seconds a few days ago, and had to stop and restart the script as it was creating a large log file.
So we got hit by it again. This time I did not timestamp my logs, but at least 1 of them did get logged.
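For anyone logging the counter the same way, keeping a timestamp with each sample makes it possible to line spikes up with the threat logs afterwards. A minimal sketch, assuming each sample is a (timestamp, cumulative counter value) pair; rates_per_second is a hypothetical helper, and if you log with "delta yes" the values are already deltas, so this conversion would not apply as written.

```python
from datetime import datetime, timedelta


def rates_per_second(samples):
    """Convert time-ordered (datetime, cumulative_count) samples to per-second rates.

    Returns a list of (timestamp, rate) pairs, one per interval, stamped with
    the end of the interval.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = (t1 - t0).total_seconds()
        if elapsed <= 0 or v1 < v0:
            # Skip out-of-order samples and counter resets (e.g. script restart).
            continue
        rates.append((t1, (v1 - v0) / elapsed))
    return rates


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 5, 30, 0)  # arbitrary illustrative start time
    samples = [
        (start, 1_000),
        (start + timedelta(seconds=2), 21_000),
        (start + timedelta(seconds=4), 25_000),
    ]
    for stamp, rate in rates_per_second(samples):
        print(stamp.isoformat(), f"{rate:.0f} cookies/s")
```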
I increased the threshold to 75K from 35K. I was not expecting same issue again but did get hit twice in a single night.
This is what it looks like: the horizontal values are Excel row numbers, and it is about 24 hours of data for flow_dos_syncookie_cookie_sent.
Can I say, based on this, that we have had a DDoS?
As you can see, it rarely breaches 10K. Also, both times there is only a TCP flood drop log and no SYN cookie sent log when it happened, and the alarm threshold is set to 5500. We do not have an IDP/S in front of the firewalls.
Also, should I not see an increase in packets/PPS during those times, 5:30 and 3:09?
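To make the comparison concrete, here is a small sketch that checks a measured rate against the thresholds mentioned in this thread. The alarm (5500) and maximum (75K) values come from the posts above; the activate value of 4500 is an assumption carried over from the earlier change and may no longer match the profile. thresholds_crossed is a hypothetical helper.

```python
# Threshold values taken from this thread; "activate" is assumed to still be 4.5K.
THRESHOLDS = {"activate": 4_500, "alarm": 5_500, "maximum": 75_000}


def thresholds_crossed(rate, thresholds=THRESHOLDS):
    """Return the names of all thresholds that the given rate meets or exceeds."""
    return [name for name, limit in thresholds.items() if rate >= limit]


if __name__ == "__main__":
    for rate in (3_000, 10_000, 80_000):
        print(rate, thresholds_crossed(rate))
```

A rate around 10K crosses activate and alarm but is nowhere near the maximum, which is why the drops look inconsistent with the logged cookie rate.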
Whether you can say you ran into an actual DDoS attack really depends on whether you saw any abnormal traffic. The problem with only having this at a zone level is that you need to really look through the logs and figure out exactly what was going on, and whether there could have been a reason you hit those limits. Without knowing your services or what the traffic looks like, it would be almost impossible for someone to say for sure that what you were seeing was any sort of attack. That big of a random spike could be someone attempting to scan you, or it could be someone forcing a full index through one of the search engines in an SEO attempt. It could just as easily be a process that only kicks off once a month and you simply haven't seen it before.
I'd really recommend building out a DoS Protection profile for any service you provide to the outside and setting up both classified and aggregate protection. When set up and configured properly, that should help prevent actual Zone Protection events from triggering, and it also gives you more information about which public resources were actually being targeted (email services, websites, etc.).
Do you have any feedback on the command:
"show counter interface | match cps"
If run consistently enough does that show you the actual connections per second for that interface? Should you also be dividing that number by "2" as stated in their documentation to see the true number?