HA Active‑Passive 3420 Both Nodes Stuck – Suspecting LACP Issue

unibg_it · 05-13-2025

Hello,

Yesterday our HA infrastructure on a pair of Palo Alto PA‑3420 (Active‑Passive) firewalls completely froze. Both units continued to believe they were the active peer, and automatic failover never occurred. We had to manually reboot the actual active node to restore service. We suspect the root cause is related to LACP on our aggregated interfaces.

1. Configuration Details
Model: PA‑3420

PAN‑OS Version: 11.1.6‑h3

HA Mode: Active‑Passive

HA Group: 25

Network Interfaces:

4 physical ports aggregated into AE‑group ae1 (ethernet1/13 through ethernet1/16)

LACP enabled, “fast” rate

Preemptive: Disabled
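
For reference, the "fast" rate above maps to the LACPDU timers defined in IEEE 802.1AX: one LACPDU per second, with the partner expired after three missed intervals. The short sketch below only works out that arithmetic; whether PAN-OS uses exactly these values is an assumption that should be checked against the vendor documentation, and the function name is hypothetical.

```python
# LACPDU timing constants from IEEE 802.1AX, in seconds.
# Assumption: the firewall and switch follow the standard values.
FAST_PERIODIC_TIME = 1
SLOW_PERIODIC_TIME = 30
TIMEOUT_MULTIPLIER = 3  # partner is expired after three missed intervals


def lacp_timeout(rate: str) -> int:
    """Return the seconds a port waits without LACPDUs before expiring its partner."""
    period = {"fast": FAST_PERIODIC_TIME, "slow": SLOW_PERIODIC_TIME}[rate]
    return period * TIMEOUT_MULTIPLIER


print(lacp_timeout("fast"))  # 3
print(lacp_timeout("slow"))  # 90
```

With the fast rate, a member port can be dropped from ae1 after roughly 3 seconds without LACPDUs, which leaves little margin during an HA role change.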

2. Problem Description
Both peers continuously report themselves as “active.”

Traffic halts because no clean role transition occurs.

Manual reboot of the true active node is required to recover.

We have already disabled preemption, but the behavior persists.

3. Relevant Log Messages
critical - link down description contains 'LACP interface ethernet1/13 moved out of AE-group ae1. Selection state Unselected (Link down)'
critical - description contains 'HA Group 25: Can't synchronize control plane data; some state may be lost on switchover'
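
To correlate these events with the time of the outage, the LACP message can be split into its fields. The following is a minimal sketch, assuming exported log lines look like the one quoted above; the regular expression is modeled on that single line only, and the function name is hypothetical.

```python
import re
from typing import Optional

# Pattern modeled on the one log line quoted in this post; other PAN-OS
# releases may word the message differently (assumption).
LACP_EVENT = re.compile(
    r"LACP interface (?P<interface>\S+) moved out of AE-group (?P<group>\S+)\. "
    r"Selection state (?P<state>\w+) \((?P<reason>[^)]*)\)"
)


def parse_lacp_event(line: str) -> Optional[dict]:
    """Return the fields of an LACP 'moved out of AE-group' message, or None."""
    match = LACP_EVENT.search(line)
    return match.groupdict() if match else None


sample = (
    "critical - link down description contains 'LACP interface "
    "ethernet1/13 moved out of AE-group ae1. Selection state "
    "Unselected (Link down)'"
)
print(parse_lacp_event(sample))
```

Lines that do not match, such as the HA Group 25 synchronization message, return None, so the same function can be used to filter a mixed log export.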
4. What We’ve Checked So Far
LACP Status: All ports show “collecting/distributing” when up

Cabling & Switches: Verified with loopback tests; no physical-layer errors found

Software Version: Running 11.1.6‑h3, the latest hotfix for 11.1.6

HA Config: PSK, HA IPs, and settings matched on both peers

5. Questions
Which LACP or HA CLI parameters can we tweak to prevent AE‑group flapping during failover?

Are there any known bugs in 11.1.6‑h3 affecting HA synchronization or LACP?

Any recommended workarounds or best practices for stabilizing AE‑groups in HA setups?

Thanks in advance for any guidance or suggestions!

reaper · 05-14-2025

are the firewalls "flapping" or are they both active at the same time?

in the latter case, this is due to an HA1 problem: you should ensure the HA1 link is up and reliable, and that you either have an HA1 backup interface or have enabled "Heartbeat Backup" on both peers

also make sure there are no other clusters in the same management network with the same Group ID
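
the distinction between the two cases can be sketched in a few lines. this is only an illustration, assuming you have noted the HA state each peer reported at several points in time; the function name, the sample data, and the "more than one role change" threshold are all hypothetical.

```python
from typing import List, Tuple


def classify(observations: List[Tuple[str, str]]) -> str:
    """Classify a series of (peer_a_state, peer_b_state) samples.

    'split-brain': at least one sample has both peers active at once.
    'flapping':    never both active, but the active role changes sides
                   more than once (threshold is an arbitrary assumption).
    'stable':      anything else, including a single failover.
    """
    if any(a == "active" and b == "active" for a, b in observations):
        return "split-brain"
    # Which peer held the active role in each sample where exactly one did.
    holders = [
        "A" if a == "active" else "B"
        for a, b in observations
        if (a == "active") != (b == "active")
    ]
    changes = sum(1 for prev, cur in zip(holders, holders[1:]) if prev != cur)
    return "flapping" if changes > 1 else "stable"


# Hypothetical samples, not taken from the original poster's firewalls.
print(classify([("active", "passive"), ("active", "active")]))
print(classify([("active", "passive"), ("passive", "active"), ("active", "passive")]))
```

the first sample set is the both-active case that points at HA1; the second is flapping, where the LACP events are the more likely place to look.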

Tom Piens
PANgurus - Strata & Prisma Access specialist
