08-12-2013 02:16 PM
We've got a Cisco 7301 router that forms OSPF adjacencies with an HA pair of 5020 firewalls. Recently I swapped this router out for a different router with the same IPs but a different config to test a new WAN connection. OSPF formed up just fine with the new router. After testing concluded and I swapped back to the old router, OSPF freaked out: the adjacencies get stuck in EXSTART. Cisco says this is an MTU mismatch condition, but that is not true in my case.
Failing the firewalls over did not clear this up (I tried twice). Rebooting them did the trick: after the reboot and a failover, the adjacency was just fine.
Aug 11 00:15:14: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.0.12 on GigabitEthernet0/0.200 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Aug 11 00:16:04: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.1.20 on GigabitEthernet0/0.200 from DOWN to DOWN, Neighbor Down: Ignore timer expired
Aug 11 00:17:14: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.0.12 on GigabitEthernet0/0.200 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Wondering if anyone else has seen this type of issue, or at least has any suggestions on how to get the adjacency to form without having to reboot.
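For anyone comparing a longer capture of these messages, here is a minimal Python sketch that tallies the ADJCHG transitions per neighbor. It assumes the log lines follow the same IOS format as the three lines pasted above; the regex and the `summarize_adjchg` helper are my own illustration, not a Cisco or Palo Alto tool.

```python
import re
from collections import Counter

# Matches IOS %OSPF-5-ADJCHG lines in the format pasted above (assumed format).
ADJCHG = re.compile(
    r"%OSPF-5-ADJCHG: Process (?P<proc>\d+), Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+), (?P<reason>.+)$"
)

def summarize_adjchg(lines):
    """Count occurrences of each (neighbor, old state, new state, reason)."""
    counts = Counter()
    for line in lines:
        m = ADJCHG.search(line)
        if m:
            key = (m["nbr"], m["old"], m["new"], m["reason"].strip())
            counts[key] += 1
    return counts

sample = [
    "Aug 11 00:15:14: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.0.12 on GigabitEthernet0/0.200 from EXSTART to DOWN, Neighbor Down: Too many retransmissions",
    "Aug 11 00:16:04: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.1.20 on GigabitEthernet0/0.200 from DOWN to DOWN, Neighbor Down: Ignore timer expired",
    "Aug 11 00:17:14: %OSPF-5-ADJCHG: Process 200, Nbr 10.16.0.12 on GigabitEthernet0/0.200 from EXSTART to DOWN, Neighbor Down: Too many retransmissions",
]

for key, n in summarize_adjchg(sample).items():
    print(n, *key)
```

Grouping by neighbor and reason makes it easier to see whether one firewall in the HA pair is stuck retransmitting in EXSTART while the other only hits the ignore timer.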
11-27-2013 11:56 PM
We ran into this same case last night: you may find that your OSPF packets are being dropped by the Palo Alto (have a look in your traffic logs). If so, you should add an explicit policy rule to allow OSPF traffic.
Hope it helps, even though I'm sure you've found a solution since August!
08-31-2020 06:46 AM
Sorry for bringing up such an old topic, but I am encountering a similar problem. I have two PA5220s (HA active/standby pair) and 4 Cisco C3850 switch pairs (4x2-way VSSs). PanOS = 9.0.9-h1, Cisco IOS 16.9.4. The entire setup is dual-stack IPv4/IPv6 and I am using OSPF for IPv4 and OSPFv3 for IPv6, due to a PA limitation on dual-stack OSPFv3. I am attaching a diagram with a sample configuration. The core switches host a total of 9 VRFs, each with its own uplink, and all uplinks are transported on the same Po/Ae trunks. Each VRF pair (Core A, Core B) has its own Area (normal), with the firewall as the designated router (DR). The VRF OSPF processes have their priority set to 0, so they won't take part in the election. My failover process is not the "standard" one (i.e. make device inactive); I'd rather lower the standby fw priority and let it preempt the active.
Now, if I force a failover, Core A does everything right. Core B encounters this very same error: Neighbor Down: Too many retransmissions and Neighbor Down: Ignore timer expired. I can fix it by disabling/re-enabling Core B's interface vlans, one at a time, as if they had some kind of "bottleneck" problem (we are talking about 2x10Gbit links and 282 IPv4 routes). OSPF traffic is allowed intra-zone (OSPF Area = firewall Zone = 1 firewall interface vlan + 2 core interface vlans = a bunch of networks on the cores).
I removed the mtu-ignore command on the Cisco side (but I might add it back), and all OSPF routers have graceful restart enabled.
I have two questions:
1) Is there a way I can avoid these errors? Am I doing something wrong?
2) Could LLDP being enabled on both the firewall(s) and the switches interfere in all of this, by enabling a "higher level" negotiation between core and firewall and disabling a "virtual MAC address" failover mechanism that would spare me the entire neighborship calculation?
04-01-2021 03:14 AM
On my setup, this problem was (probably) caused by bug PAN-154899. Upgrading from 9.1.6 to 9.1.8 finally made the issue disappear.
04-01-2021 01:56 PM
My recent OSPF issues came about when some network engineers sent my traffic down a WAN link with different MTUs. Might be worth a look.
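To illustrate why an MTU difference can leave a neighbor stuck in EXSTART, here is a rough Python model of the check from RFC 2328 section 10.6: a Database Description packet is rejected when the Interface MTU it advertises is larger than what the receiving interface can accept. The function name and the MTU values are hypothetical, and the `mtu_ignore` flag only models the effect of skipping that check (as with the Cisco mtu-ignore setting mentioned earlier in the thread); it is not vendor code.

```python
def dbd_accepted(receiver_ip_mtu: int, advertised_mtu: int,
                 mtu_ignore: bool = False) -> bool:
    """Model of the RFC 2328 (10.6) MTU check on a received DBD packet.

    The packet is rejected if the sender advertises an Interface MTU larger
    than the receiver can accept, unless the check is disabled.
    """
    if mtu_ignore:
        return True
    return advertised_mtu <= receiver_ip_mtu

# Hypothetical values: router at 1500 bytes, peer at 9000 bytes.
router_mtu, peer_mtu = 1500, 9000

# The router rejects the peer's DBD, so the exchange never leaves EXSTART.
print(dbd_accepted(router_mtu, peer_mtu))
# The peer would accept the router's DBD; the check is asymmetric.
print(dbd_accepted(peer_mtu, router_mtu))
# With the check disabled on the router, the DBD is accepted.
print(dbd_accepted(router_mtu, peer_mtu, mtu_ignore=True))
```

The asymmetry is the useful part: only the side with the smaller MTU drops packets, which is why the symptom can show up on just one end of the link.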
04-04-2021 01:20 AM - edited 04-04-2021 01:21 AM
The firewall was (and still is) directly connected to its L3 peers, which I manage as well. I tried the MTU fix, but it did not help. Reducing the amount of exchanged information (splitting areas and using NSSAs) helped a bit, making the issue less frequent, but it still occurred from time to time. After the upgrade to 9.1.8, I had the first two consecutive forced failovers without any issue at all.