
OSPF adjacency flapping - normal?




While trying to track down the cause of 3 recent Internet outages at one of our schools (which we still haven't determined), we've noticed that our OSPF adjacencies are flapping up and down across the district. This happens multiple times per day, across multiple sites, going back to the beginning of last month (that's as far back as the logs go on the district core firewall).

 

Is this normal or something we should be concerned with?  Could this be the reason we get hiccups in our connections to the schools (where you can be typing in an SSH session and suddenly all the characters stop appearing for 10 seconds then appear slowly then appear normally again) when network usage for the school is fairly low?  Could this be the reason for 5-10 minute outages like we've experienced the past two days (nothing showing in the logs on the fibre switches, no links up/down, no STP outages, etc)?  Could this get to the point where our entire WAN goes down?

 

I'm very new to OSPF and routing protocols in general, coming from a static routing background dealing only with the connections on the "inside" of the telco router at a remote site (each site with their own connection to the Internet).  We've since migrated to a proper WAN setup using OSPF internally, with a single connection to the public Internet for the whole district.

 

Our WAN consists of 3 separate networks that all terminate at the district office: an MPLS link with the local telco for the out-of-town schools, a point-to-point fibre network in town, and a point-to-point wireless network for schools we can't reach with fibre yet. For the MPLS links, OSPF is established between an L3 switch in the district office (upstream from the district firewall) and the PA firewall in the school. For the fibre and wireless networks, OSPF is established between the PA firewall in the district office and the PA firewall in the school (we use layer 2 VLANs across the fibre/wireless network, terminating on the PA firewall). Other than the Router ID and neighbour config, the OSPF setup on all the firewalls is virtually identical (everything is in Area 0).

 

We haven't had any issues (that we know of) with the above setup, although we do understand that it's sub-optimal (we're looking at what it would take to have all of the OSPF links terminate on the L3 switch instead, such that the district firewall stops being a router too).

 

So, should I be worried about the OSPF adjacencies flapping?  Should I spend time on figuring those out?  Or are they a red herring to some other issue?

 

Most of the OSPF "outages" are under 10 seconds.  The only ones that are longer (3-5 minutes) are for the school that lost connectivity completely 3 times in the last two days (but, not sure if that's the cause or just a symptom).
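
To put numbers on "under 10 seconds" versus "3-5 minutes", the down/up events can be paired per neighbour and bucketed by duration. This is only a rough sketch: it assumes the log entries have already been parsed into (timestamp, neighbour, state) tuples, and the sample events, router IDs, and state names below are made up for illustration, not taken from a real firewall log.

```python
from datetime import datetime, timedelta

# Hypothetical, already-parsed events: (timestamp, neighbour router ID, state).
# Real log lines would need their own parsing step first.
events = [
    (datetime(2000, 1, 1, 9, 0, 0), "192.0.2.1", "down"),
    (datetime(2000, 1, 1, 9, 0, 7), "192.0.2.1", "full"),
    (datetime(2000, 1, 1, 10, 15, 0), "192.0.2.2", "down"),
    (datetime(2000, 1, 1, 10, 19, 30), "192.0.2.2", "full"),
]


def flap_durations(events):
    """Pair each 'down' with the next 'full' for the same neighbour."""
    went_down = {}  # neighbour -> time the adjacency dropped
    flaps = []      # (neighbour, outage duration)
    for ts, neighbour, state in sorted(events):
        if state == "down":
            # Keep the earliest unmatched 'down' for this neighbour.
            went_down.setdefault(neighbour, ts)
        elif state == "full" and neighbour in went_down:
            flaps.append((neighbour, ts - went_down.pop(neighbour)))
    return flaps


def split_by_length(flaps, threshold=timedelta(seconds=10)):
    """Separate short blips from longer outages."""
    short = [f for f in flaps if f[1] < threshold]
    longer = [f for f in flaps if f[1] >= threshold]
    return short, longer


if __name__ == "__main__":
    short, longer = split_by_length(flap_durations(events))
    print("short flaps:", short)
    print("longer outages:", longer)
```

Grouping the longer outages by neighbour would show whether they really are limited to the one school that lost connectivity.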

21 REPLIES

Each school has exactly one adjacency, to one neighbor.

28 school firewalls neighbour with the district firewall via our private fibre/wireless network. There are no routers in these schools. The adjacencies for these OSPF connections are flapping.

20-odd school firewalls neighbour with the Telco router in the school, which then neighbours with the Telco router in our datacentre via the Telco MPLS network. These OSPF adjacencies are not flapping.

Oh, PanOS 6.1.19 on the school firewalls, and 7.1.14 on the district firewall. School firewalls are PA200s (elementary), PA500s (secondary), and a single PA3020 in the multi-building high school. The district firewall is a pair of PA3020s in active/passive HA.

Sorry for typos and multiple edits, posting from a phone. 🙂

Can you tell from the logs which side is tearing down the adjacency?

 

I am still interested in the BFD configuration.  Of the models mentioned, the 3020 is the only firewall that supports it and your adjacency flapping mirrors a situation I had where BFD was disabled on one end of a link.

That's something I haven't looked into yet. Won't be back at work until Monday. I can check the logs then to see which side loses adjacency first (district firewall or school firewall).

BFD is not configured anywhere (at least, I don't think it is).
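
For the "which side loses adjacency first" check, one possible approach is to line up the down events logged by each end and compare timestamps. The sketch below assumes the down times have already been extracted from both firewalls' logs and that both clocks are synchronised (e.g. via NTP); the function name, the 60-second matching window, and the sample times are all hypothetical.

```python
from datetime import datetime, timedelta


def first_to_drop(district_downs, school_downs, window=timedelta(seconds=60)):
    """For each district-side 'down' time, find the closest school-side
    'down' within `window` and report which side logged it first."""
    results = []
    for d in sorted(district_downs):
        candidates = [s for s in school_downs if abs(s - d) <= window]
        if not candidates:
            results.append((d, "district only"))
            continue
        s = min(candidates, key=lambda t: abs(t - d))
        if s < d:
            results.append((d, "school first"))
        elif s > d:
            results.append((d, "district first"))
        else:
            results.append((d, "same second"))
    return results


if __name__ == "__main__":
    # Made-up sample timestamps, purely for illustration.
    district = [datetime(2000, 1, 1, 9, 0, 5), datetime(2000, 1, 1, 11, 0, 0)]
    school = [datetime(2000, 1, 1, 9, 0, 2)]
    for when, verdict in first_to_drop(district, school):
        print(when.isoformat(), verdict)
```

If the clocks on the two firewalls drift by more than a second or two, the ordering is not trustworthy, so it is worth confirming time sync before reading anything into the result.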

Agreed - I perused the documentation, and while your central firewall is the only one that supports it (only 3000 and larger or VM models, introduced in 7.1), it appears the default behavior in 7.1 was for it to be disabled. 

Still easy to check under network -> virtual routers -> your VR -> OSPF and network -> BFD Profiles just to be sure

BFD Profile is listed as "Inherit-vr-global-setting" on all the Virtual Routers configured on the district firewall.

 

The lone PA3020 in a school is running PanOS 6.1.10, so it doesn't have the BFD settings.

Hello,

Original issue aside, you might want to think about upgrading to a newer release. Many vulnerabilities have been patched and stability improvements introduced since then, including for OSPF.

 

Regards,

It's a case of "you are free to run whatever PanOS release you want, but if it's not one of the recommended releases, we won't support you".  🙂  Being a school district, we have a fair bit of autonomy on how we run our networks.  But if we want support from the Ministry of Education, the provincial team that originally designed/implemented things for all districts, and the provincial helpdesk, then we need to run specific versions on the firewalls (plus or minus a bit).

 

But, I'll bring this up as something to look at over the summer months.  I think a re-architecting of the network is in order, now that we have more experience with things.  🙂  For example, it would be nice to move the OSPF off the district firewall completely, and put it onto the router in front.  That way, only traffic for the district data centre would go through the district firewall, and all school Internet traffic would by-pass it completely.  Currently, schools on the telco network are "in front" of the district firewall (routed directly to the Internet), while the schools on our fibre/wireless network use the district firewall as a router (no Security Policies affect Internet traffic from those schools).  And to move to a single subnet for the OSPF endpoints on the fibre/wireless network.  And maybe play with the OSPF timeouts a bit.  And maybe enable QoS prioritisation of the routing protocol traffic on the switches/Ubiquiti gear on the fibre/wireless network.
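
On the idea of playing with the OSPF timeouts: the dead interval is the hello interval multiplied by the number of missed hellos tolerated, and both neighbours have to agree on the hello and dead intervals or the adjacency will not form. A small sketch of that arithmetic follows; the class and function names are made up for illustration, and the 10-second hello with a dead count of 4 is used as an assumed common default, not a value read from these firewalls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfTimers:
    hello_interval: int  # seconds between hello packets
    dead_count: int      # missed hellos tolerated before the neighbour is declared down

    @property
    def dead_interval(self) -> int:
        """Seconds of silence before the neighbour is declared down."""
        return self.hello_interval * self.dead_count


def timers_compatible(a: OspfTimers, b: OspfTimers) -> bool:
    """OSPF neighbours must agree on both the hello and dead intervals."""
    return (a.hello_interval == b.hello_interval
            and a.dead_interval == b.dead_interval)


if __name__ == "__main__":
    assumed_default = OspfTimers(hello_interval=10, dead_count=4)
    tuned = OspfTimers(hello_interval=5, dead_count=4)
    print(assumed_default.dead_interval)               # 40
    print(tuned.dead_interval)                         # 20
    print(timers_compatible(assumed_default, tuned))   # False
```

The practical consequence is that any timer change has to be made on both ends of a link, so with 28 schools neighbouring the district firewall it would need to be rolled out one link at a time.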

 

Lots of ideas here to investigate.  🙂
