03-14-2018 01:40 PM - edited 03-14-2018 02:07 PM
While trying to track down the cause of 3 recent Internet outages at one of our schools (which we still haven't determined), we've noticed that our OSPF adjacencies are flapping up and down across the district. This happens multiple times per day, across multiple sites, going back to the beginning of last month (that's as far back as the logs go on the district core firewall).
Is this normal or something we should be concerned with? Could this be the reason we get hiccups in our connections to the schools (where you can be typing in an SSH session and suddenly all the characters stop appearing for 10 seconds then appear slowly then appear normally again) when network usage for the school is fairly low? Could this be the reason for 5-10 minute outages like we've experienced the past two days (nothing showing in the logs on the fibre switches, no links up/down, no STP outages, etc)? Could this get to the point where our entire WAN goes down?
I'm very new to OSPF and routing protocols in general, coming from a static routing background dealing only with the connections on the "inside" of the telco router at a remote site (each site with their own connection to the Internet). We've since migrated to a proper WAN setup using OSPF internally, with a single connection to the public Internet for the whole district.
Our WAN consists of 3 separate networks that all terminate at the district office: an MPLS link with the local telco for the out-of-town schools, a point-to-point fibre network in town, and a point-to-point wireless network for schools we can't reach with fibre yet. For the MPLS links, the OSPF adjacency is established between an L3 switch in the district office (upstream from the district firewall) and the PA firewall in the school. For the fibre and wireless networks, the OSPF adjacency is established between the PA firewall in the district office and the PA firewall in the school (we use layer 2 VLANs across the fibre/wireless network, terminating on the PA firewall). Other than the Router ID and neighbour config, the OSPF setup on all the firewalls is virtually identical (everything is in Area 0).
We haven't had any issues (that we know of) with the above setup, although we do understand that it's sub-optimal (we're looking at what it would take to have all of the OSPF links terminate on the L3 switch instead, such that the district firewall stops being a router too).
So, should I be worried about the OSPF adjacencies flapping? Should I spend time on figuring those out? Or are they a red herring to some other issue?
Most of the OSPF "outages" are under 10 seconds. The only ones that are longer (3-5 minutes) are for the school that lost connectivity completely 3 times in the last two days (but, not sure if that's the cause or just a symptom).
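To separate the short flaps from the multi-minute ones, the state changes can be paired up per neighbour and turned into outage durations. The following is only a minimal sketch: the event tuples, the neighbour names, and the "down"/"full" labels are hypothetical placeholders, not a real PAN-OS log format, so the parsing step for the actual firewall logs would still need to be written.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-parsed events: (timestamp, neighbour, state).
# The layout and the "down"/"full" labels are assumptions for this sketch,
# not an actual firewall log format.
EVENTS = [
    ("2018-03-13 09:00:00", "school-a", "down"),
    ("2018-03-13 09:00:08", "school-a", "full"),
    ("2018-03-13 10:15:00", "school-b", "down"),
    ("2018-03-13 10:19:30", "school-b", "full"),
    ("2018-03-13 11:00:00", "school-a", "down"),
    ("2018-03-13 11:00:06", "school-a", "full"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def flap_durations(events):
    """Return {neighbour: [seconds_down, ...]} from time-ordered events."""
    down_since = {}
    durations = defaultdict(list)
    for stamp, neighbour, state in events:
        when = datetime.strptime(stamp, TIME_FORMAT)
        if state == "down":
            # Keep the first "down" if several are logged in a row.
            down_since.setdefault(neighbour, when)
        elif state == "full" and neighbour in down_since:
            started = down_since.pop(neighbour)
            durations[neighbour].append((when - started).total_seconds())
    return dict(durations)


def long_outages(durations, threshold_seconds=60):
    """Keep only neighbours with at least one outage at or above the threshold."""
    result = {}
    for neighbour, seconds in durations.items():
        long_ones = [s for s in seconds if s >= threshold_seconds]
        if long_ones:
            result[neighbour] = long_ones
    return result


if __name__ == "__main__":
    per_neighbour = flap_durations(EVENTS)
    print(per_neighbour)
    print(long_outages(per_neighbour))
```

Grouping the durations this way makes it easier to see whether the 3-5 minute outages are confined to one site while the sub-10-second flaps are spread across all of them.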
03-22-2018 05:10 PM - edited 03-22-2018 05:16 PM
03-22-2018 05:24 PM
Can you tell from the logs which side is tearing down the adjacency?
I am still interested in the BFD configuration. Of the models mentioned, the 3020 is the only firewall that supports it and your adjacency flapping mirrors a situation I had where BFD was disabled on one end of a link.
03-22-2018 05:51 PM
03-22-2018 06:45 PM
Agreed - I perused the documentation, and while your central firewall is the only one that supports it (only 3000 and larger or VM models, introduced in 7.1), it appears the default behavior in 7.1 was for it to be disabled.
It's still easy to check under Network -> Virtual Routers -> your VR -> OSPF and Network -> BFD Profiles, just to be sure.
03-26-2018 08:58 AM
BFD Profile is listed as "Inherit-vr-global-setting" on all the Virtual Routers configured on the district firewall.
The lone PA3020 in a school is running PanOS 6.1.10, so it doesn't have the BFD settings.
03-26-2018 03:12 PM
Hello,
Original issue aside, you might want to think about upgrading to a newer release. Many vulnerabilities have been patched and stability improvements introduced since then, including for OSPF.
Regards,
03-26-2018 03:22 PM
It's a case of "you are free to run whatever PanOS release you want, but if it's not one of the recommended releases, we won't support you". 🙂 Being a school district, we have a fair bit of autonomy on how we run our networks. But if we want support from the Ministry of Education, the provincial team that originally designed/implemented things for all districts, and the provincial helpdesk, then we need to run specific versions on the firewalls (plus or minus a bit).
But I'll bring this up as something to look at over the summer months. I think a re-architecting of the network is in order, now that we have more experience with things. 🙂 For example, it would be nice to move the OSPF off the district firewall completely, and put it onto the router in front. That way, only traffic for the district data centre would go through the district firewall, and all school Internet traffic would bypass it completely. Currently, schools on the telco network are "in front" of the district firewall (routed directly to the Internet), while the schools on our fibre/wireless network use the district firewall as a router (no Security Policies affect Internet traffic from those schools). It would also be nice to move to a single subnet for the OSPF endpoints on the fibre/wireless network. And maybe play with the OSPF timeouts a bit. And maybe enable QoS prioritisation of the routing protocol traffic on the switches/Ubiquiti gear on the fibre/wireless network.
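On the idea of playing with the OSPF timeouts: the hello and dead intervals have to match on both ends of a link or the adjacency will not form, so any change has to be made on both sides together. Here is a rough sketch of that check. The `OspfTimers` class and helper names are made up for illustration, and the 10-second hello / 40-second dead values are the commonly used defaults, assumed here rather than read from any of the firewalls in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfTimers:
    """Hypothetical container for one interface's OSPF timers (seconds)."""
    hello: int
    dead: int


def timers_match(a: OspfTimers, b: OspfTimers) -> bool:
    """Both intervals must be equal on the two neighbours."""
    return a.hello == b.hello and a.dead == b.dead


def hello_intervals_per_dead(t: OspfTimers) -> int:
    """How many whole hello intervals fit inside the dead interval."""
    return t.dead // t.hello


# Assumed values: common defaults on one side, a tuned pair on the other.
district = OspfTimers(hello=10, dead=40)
school = OspfTimers(hello=5, dead=20)

if __name__ == "__main__":
    print("match:", timers_match(district, school))
    print("district tolerates", hello_intervals_per_dead(district), "hello intervals")
    print("school tolerates", hello_intervals_per_dead(school), "hello intervals")
```

Shortening both intervals speeds up failure detection but also makes the adjacency more sensitive to brief packet loss, which matters on links that already show short flaps.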
Lots of ideas here to investigate. 🙂