01-21-2019 09:43 PM
Hello,
We are using a PA-3020 HA pair and are currently having two issues regarding fail-over:
Both issues have been observed since PAN-OS 7.1.10. We have gone through several firmware upgrades and are currently on 8.1.3, but have noticed no change.
The next device on the network after the firewalls is a Cisco Nexus stack.
Monitor Fail Hold Down Time (min)=1
Monitor Hold Time (ms)=3000
Any idea what is going on here?
02-07-2019 01:49 PM
Hi All,
Just wanted to let you all know that the TAC team has assisted with this issue. The files and information below were collected, and after analyzing them the conclusion was that this seems to be an external issue: the PA's ARP requests/replies are not being delivered to the end host.
-Packet capture of non-IP traffic on both firewalls, taken while performing the failover. We want to see whether the new primary firewall sends GARP immediately or not.
-Keep a continuous ping running through the HA pair and include it in the packet filter for the capture above. One filter captures all non-IP traffic and the other captures the ping.
-Perform a failover and write down the timestamp, the time required to recover, and the minutes of outage.
-Collect the packet captures, the session output for the ping (from the host machine), and the global counters.
-Tech Support files
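For anyone repeating the timing step, the outage length can be worked out from the continuous ping rather than by stopwatch. Below is a minimal Python sketch, assuming the ping results have already been turned into `(timestamp, ok)` pairs. The function name `find_outages` and the sample data are made up for illustration and are not something TAC provided.

```python
from datetime import datetime, timedelta


def find_outages(samples):
    """Return one (start, end, duration) tuple per run of failed pings.

    samples: chronological list of (timestamp, ok) pairs, where ok is True
    when the ping was answered.
    start is the first failed ping of a run and end is the first answered
    ping after it. A run that is still failing when the samples end is
    reported with end and duration set to None.
    """
    outages = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # outage begins
        elif ok and start is not None:
            outages.append((start, ts, ts - start))  # outage ends
            start = None
    if start is not None:
        outages.append((start, None, None))
    return outages


# Hypothetical data: one ping per second, with pings 3..7 lost.
t0 = datetime(2019, 1, 1, 12, 0, 0)
samples = [(t0 + timedelta(seconds=i), not (3 <= i < 8)) for i in range(12)]
for start, end, duration in find_outages(samples):
    print(start, end, duration)
```

The resolution is only as good as the ping interval, so a shorter interval gives a tighter bound on the real outage.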
Closing this post now. The client will check the connected switches and devices to understand why the ARP requests/replies are not being delivered to the end host.
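When going through the non-IP capture, a gratuitous ARP is recognizable because the sender and target IP fields of the ARP payload carry the same address. A rough Python sketch of that check on a raw Ethernet frame follows. It assumes untagged frames (no 802.1Q header) and IPv4 over Ethernet, and the MAC and IP values are placeholders, not taken from this setup. `build_garp` only produces a test frame in the request form, while the check accepts either opcode.

```python
import struct

ETHERTYPE_ARP = 0x0806
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FMT = "!HHBBH6s4s6s4s"  # 28 bytes


def build_garp(mac: bytes, ip: bytes) -> bytes:
    """Build a broadcast gratuitous ARP request announcing ip at mac."""
    eth = struct.pack("!6s6sH", b"\xff" * 6, mac, ETHERTYPE_ARP)
    arp = struct.pack(ARP_FMT, 1, 0x0800, 6, 4, 1, mac, ip, b"\x00" * 6, ip)
    return eth + arp


def is_gratuitous_arp(frame: bytes) -> bool:
    """True if frame is an untagged Ethernet/IPv4 ARP with sender IP == target IP."""
    if len(frame) < 14 + 28:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return False
    htype, ptype, hlen, plen, _oper, _sha, spa, _tha, tpa = struct.unpack(
        ARP_FMT, frame[14:42]
    )
    return (htype, ptype, hlen, plen) == (1, 0x0800, 6, 4) and spa == tpa


# Placeholder values: locally administered MAC, documentation-range IP.
frame = build_garp(bytes.fromhex("020000000001"), bytes([192, 0, 2, 1]))
print(is_gratuitous_arp(frame))
```

Running the same check on captures taken at both the firewall and the downstream switch or host would show where the announcement stops being delivered.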
01-22-2019 06:39 AM
@FarzanaMustafa wrote:
The next device on the network after the firewalls is a Cisco Nexus stack.
Monitor Fail Hold Down Time (min)=1
Monitor Hold Time (ms)=3000
Any idea what is going on here?
Nexus "stack?" Can you elaborate on the network architecture and how your HA interfaces are incorporated into the network?
The HA interfaces should be in an L2 VLAN, with no other ports anywhere on your network in that VLAN. The HA interfaces themselves should just be normal access ports.
01-22-2019 08:24 AM
Hello,
What are you using as your test? Are you putting the active into suspend? Are you using ACI?
Please advise,
01-23-2019 05:55 PM
Hi @OtakarKlier and @Brandon_Wertz
HA is configured directly from one firewall to another without any network devices in between. We have four cables between the firewalls - two of them are used as primary HA links (Control + Data), and two ethernet interfaces are configured as backup HA interfaces (one for Control backup, and one for Data link backup, interfaces are not tagged).
We manually suspend the primary firewall to fail over to the secondary, then make the first one active again, and suspend the secondary to fail back over to the primary.
01-24-2019 09:17 AM
Hello,
Are your Nexus switches in vPC? I have a similar setup and my failover is almost instantaneous. Maybe open a TAC case to make sure everything is running as it should?
Regards,
01-24-2019 10:44 AM
@FarzanaMustafa wrote:
Hi @OtakarKlier and @Brandon_Wertz
HA is configured directly from one firewall to another without any network devices in between. We have four cables between the firewalls - two of them are used as primary HA links (Control + Data), and two ethernet interfaces are configured as backup HA interfaces (one for Control backup, and one for Data link backup, interfaces are not tagged).
We manually suspend the primary firewall to fail over to the secondary, then make the first one active again, and suspend the secondary to fail back over to the primary.
It makes no sense whatsoever that you would have anything other than a millisecond failover on firewalls that are directly connected to each other, let alone multi-minute outages.
My deployment is across an OTV WAN link hundreds of miles away and our failover is instantaneous.