07-02-2024 01:37 AM
Hi,
We have two PA460 FWs in HA, one active and the other passive. We have several issues related to configuration synchronization and HA:
1- Synchronization before a commit can take up to 8 minutes. With the old FW a commit took less than a minute, so these newer models are worse for us. It wouldn't matter much, except that in cases like FW OS updates we are out of service, and we think this model should be able to do better.
2- When the roles change (passive to active and active to passive) we have a network outage of 3 to 4 minutes. We have verified that it is not the LACP negotiation of the interfaces; it is the FWs that take all this time to detect the failure or to make the change. HA is not useful to us if failover takes 4 minutes. We bought two PA460s instead of just one so we could have HA, and we're not getting the benefit of it. We have been advised that it appears to be a bug on the PA460/400 series, but after a year (we installed the FWs in June/July 2023) we still have the problem despite receiving updates.
3- We have "ghost" certificates installed on the FWs that we don't see in the GUI. We created them a few months ago and they never showed up once generated, but they are present in the config XML. We see it in two places:
A) When we commit we get warnings about 3 duplicate certificates, but we only see one in the GUI under "Device > Certificates".
B) Listing the certificates from the CLI shows several distinct certificates that do not appear in the GUI.
Any ideas? I have already noticed that PAN-OS versions above 9.1 are much slower than previous versions 😞
07-02-2024 01:58 AM
1. 8 minutes is extremely long. What PAN-OS version are you using, and how big is your configuration file (it is displayed in the commit completed popup)? Are many admins connected at the same time, or do you have many scheduled reports set for a short period of time?
2. This is also suspiciously long; failover normally takes milliseconds. What parameters are you using during the failover (are you manually forcing a failure, or are these real failures)? What is being monitored (path, interface, ...) to trigger failovers?
Is LACP set to prenegotiate on the passive device? (Make sure to set the passive link state to auto.)
3. How did you generate these certificates? Did they use to be visible and become invisible after an upgrade? You could try removing them from an exported XML and then reimport them.
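Before editing an exported config by hand, it can help to list every certificate entry it contains and compare that with what the GUI shows. Below is a minimal Python sketch of that idea. It assumes certificates appear as `<entry name="...">` children of a `<certificate>` element somewhere in the export; the sample XML and the names `web-cert` and `ghost-cert` are made up for illustration, so check the paths against your own export.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed export; a real config has far more content.
SAMPLE_XML = """
<config>
  <shared>
    <certificate>
      <entry name="web-cert"/>
      <entry name="ghost-cert"/>
    </certificate>
  </shared>
  <devices>
    <entry name="localhost.localdomain">
      <vsys>
        <entry name="vsys1">
          <certificate>
            <entry name="web-cert"/>
          </certificate>
        </entry>
      </vsys>
    </entry>
  </devices>
</config>
"""

def list_certificates(xml_text):
    """Return (location, name) for each <entry> directly under a <certificate>."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(node, path):
        for child in node:
            label = child.tag
            if child.tag == "entry" and child.get("name"):
                label = f"entry[{child.get('name')}]"
            if node.tag == "certificate" and child.tag == "entry":
                found.append((path, child.get("name")))
            walk(child, f"{path}/{label}")

    walk(root, root.tag)
    return found

def duplicate_names(certs):
    """Names that occur in more than one place in the export."""
    counts = Counter(name for _, name in certs)
    return sorted(name for name, n in counts.items() if n > 1)

certs = list_certificates(SAMPLE_XML)
for location, name in certs:
    print(f"{name:12} {location}")
print("duplicates:", duplicate_names(certs))
```

Any name that shows up here but not under "Device > Certificates" is a candidate for the cleanup described above. Keep a backup of the original export before changing anything.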
07-02-2024 02:58 AM
1) Have you tried 'revert to running config' to refresh the candidate config? I can't say any of my 400s are super slow...
2) Those are the default timers. You could set the timers a little more aggressively, but that would not explain taking minutes to fail over...
Did you set up any Link and Path monitoring?
Do change the passive link state to 'auto' so LACP can be prenegotiated.
3) I've had this issue appear at another customer. We're considering opening a case, as it appears to be a bug (we tried several things to fix it, but to no avail).
07-09-2024 02:00 AM - edited 07-29-2024 11:54 PM
Issues were solved
07-10-2024 01:38 AM
The HA sync is a second commit that takes place after the commit has completed locally; this is where the config is synced to the peer.
This adds some latency to commits, as you need to wait for the peer to complete its commit before your entire operation is deemed successful.
So this is sort of normal, but you can track it on both sides to see how long it takes for the config to be transferred and how quickly the peer commit starts and finishes.
The EDL refresh may add a little latency, as that also happens on every commit to update all the EDLs you're currently using.
The device certificate fetch, however, is a little odd... Does it appear after every commit? It should not happen frequently: it is the certificate used to communicate with Palo Alto cloud services and is usually valid for 3 months (at which point it refreshes automatically).
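To make "track it on both sides" concrete, here is a rough Python sketch that splits one commit into local commit, transfer/queue gap, and peer commit. It assumes you have noted the four timestamps by hand from the job lists on each node and that both clocks are in sync (NTP). The timestamp format and the sample values are invented for illustration; adjust `FMT` to whatever your firewalls print.

```python
from datetime import datetime

# Assumed timestamp layout; change it to match what you copied from each node.
FMT = "%Y/%m/%d %H:%M:%S"

def commit_phases(local_start, local_end, peer_start, peer_end):
    """Break one HA commit into phases, in seconds."""
    t0, t1, t2, t3 = (
        datetime.strptime(s, FMT)
        for s in (local_start, local_end, peer_start, peer_end)
    )
    return {
        "local_commit_s": (t1 - t0).total_seconds(),
        # Time between the local commit finishing and the peer commit starting.
        "transfer_and_queue_s": (t2 - t1).total_seconds(),
        "peer_commit_s": (t3 - t2).total_seconds(),
        "total_s": (t3 - t0).total_seconds(),
    }

# Made-up example values, not taken from a real firewall.
phases = commit_phases(
    "2024/07/10 01:00:00",
    "2024/07/10 01:02:30",
    "2024/07/10 01:03:10",
    "2024/07/10 01:06:40",
)
for name, seconds in phases.items():
    print(f"{name:22} {seconds:6.0f}")
```

If most of the time lands in the peer commit, the peer is the slow part; if it lands in the gap, look at the transfer over the HA link or at jobs queued ahead of the sync on the peer.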
07-10-2024 11:49 PM - edited 07-29-2024 11:53 PM
Thanks as usual Reaper for your answer.
07-11-2024 03:02 AM
If you really, really want to know, you can set both the mgmt and devsrv to debugging mode on both nodes and push a commit (turn off debug mode once it's done), then go trawling through the debug logs (> less mp-log ms.log and devsrv.log, or collect a techsupport file from both) to see where there are delays during the commit process.
Or open a support case and have someone else dig through the logs 😉
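If you do go through the logs yourself, one way to find the delays is to look for large jumps between consecutive timestamps. The Python sketch below does that on lines copied out of a log. It assumes each log line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp; the sample lines and their messages are invented, so adapt the pattern to what your ms.log and devsrv.log actually contain.

```python
import re
from datetime import datetime

# Assumed line prefix: date, time, milliseconds. Lines without it are skipped.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")

def find_gaps(lines, threshold_s=10.0):
    """Return (gap_seconds, line_before, line_after) for every quiet period
    of at least threshold_s seconds between two timestamped lines."""
    gaps = []
    prev_time = prev_line = None
    for line in lines:
        match = TS.match(line)
        if not match:
            continue  # continuation lines, stack traces, blank lines
        now = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_time is not None:
            delta = (now - prev_time).total_seconds()
            if delta >= threshold_s:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = now, line
    return gaps

# Invented sample lines; real messages will look different.
SAMPLE_LOG = [
    "2024-07-11 03:02:00.000 +0200 commit job started",
    "2024-07-11 03:02:01.500 +0200 sending config to peer",
    "    continuation line without a timestamp",
    "2024-07-11 03:04:01.500 +0200 peer acknowledged config",
    "2024-07-11 03:04:02.000 +0200 commit job finished",
]

for seconds, before, after in find_gaps(SAMPLE_LOG):
    print(f"{seconds:.1f}s gap\n  before: {before}\n  after:  {after}")
```

To run it against a real file, pass an open file object instead of `SAMPLE_LOG`, for example `find_gaps(open("ms.log", errors="replace"))`, and raise or lower `threshold_s` until only the interesting pauses remain.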