02-03-2019 01:01 AM
Hi Team,
I have an issue on a Palo Alto PA-3260 firewall where the HA1 backup link went down. Even though there is no production impact, this happened without any cable change or other activity.
It is due to a heartbeat ping failure, but I want to know what caused the ping failure.
I am already running PAN-OS 8.1.4-h2, whose release notes say the unexpected behaviour of the HA1 backup port was fixed.
Below is the error message:
Error Msg
---------
flags : 0x2 (close:)
err code : Heartbeat ping failure (16)
num tlvs : 1
Printing out 1 tlvs
TLV[1]: type 5 (ERR_STRING); len 23; value:
48656172 74626561 74207069 6e672066 61696c75 726500
Regards
Venky
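For anyone reading the error message above: the ERR_STRING TLV value is just a hex dump of a NUL-terminated ASCII string. A minimal Python sketch to decode it is below. The helper name `decode_tlv_value` is made up for illustration and is not part of any PAN-OS tooling.

```python
def decode_tlv_value(hex_dump: str) -> str:
    """Decode a hex-dumped TLV value into text, dropping the trailing NUL byte."""
    # Remove the spaces between the 4-byte groups before converting.
    raw = bytes.fromhex("".join(hex_dump.split()))
    return raw.rstrip(b"\x00").decode("ascii")


# Value copied from the TLV[1] line of the error message in this post.
err_string = "48656172 74626561 74207069 6e672066 61696c75 726500"
print(decode_tlv_value(err_string))  # Heartbeat ping failure
```

The decoded text only repeats the error code name, so the TLV itself carries no extra detail about the cause.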
02-05-2019 05:53 AM
your issue starts at the first line
2019-01-29 11:51:33.447 +0400 debug: ha_sysd_haX_link_change(src/ha_sysd.c:2221): Seeing HA1-Backup peer link down, waiting hold
'this' peer reports the remote end is down
so you now need to check the corresponding timeframe at the remote end
02-05-2019 06:02 AM
This is from the passive firewall:
2019-01-29 11:51:33.229 +0400 Error: ha_ping_peer_miss(src/ha_ping.c:756): Missed 1 ping timeouts out of 3 (ha1-backup)
2019-01-29 11:51:33.257 +0400 debug: ha_peer_recv_hello(src/ha_peer.c:5119): Group 1 (HA1-MAIN): Receiving hello message
Msg Hdr
-------
version : 1
groupID : 1
type : Hello (2)
token : 0xb32d
flags : 0x1 (req:)
length : 122
Hello Msg
---------
flags : 0x0 ()
state : Active (5)
priority : 100
cookie : 55493
num tlvs : 3
Printing out 3 tlvs
TLV[1]: type 62 (CONFIG_MD5_PRE); len 33; value:
65313361 38313135 34623561 32633139 64353536 33313363
32383039 37616236 00
TLV[2]: type 2 (CONFIG_MD5SUM); len 33; value:
35373338 35623065 36663138 38313537 39616161 66326530
65396232 33376561 00
TLV[3]: type 11 (SYSD_PEER_DOWN); len 4; value:
00000000
2019-01-29 11:51:33.257 +0400 debug: ha_state_cfg_md5_set(src/ha_state_cfg.c:465): We were in sync and now we are out of sync; autocommit no; ha-sync no; panorama no; cfg-sync-off no; pre-old-insync yes; pre-new-insync no
2019-01-29 11:51:33.257 +0400 debug: ha_sysd_dev_cfgsync_update(src/ha_sysd.c:1415): Set dev cfgsync to Committing
2019-01-29 11:51:33.257 +0400 debug: ha_state_cfg_from_insync_to_outsync(src/ha_state_cfg.c:673): peer group 1 has changed the md5, waiting for an update
2019-01-29 11:51:33.447 +0400 debug: ha_peer_recv_hello(src/ha_peer.c:5119): Group 1 (HA1-MAIN): Receiving hello message
The passive firewall is the one that reported the missed ping timeout; I don't see any ping attempt on the primary firewall. Once the passive firewall missed 4 timeouts, the link went down and stayed down.
I then re-selected the same HA1-B interface and committed, after which the link came up and has been stable since.
02-05-2019 06:07 AM
ok so while one peer sees the interface down, the other sees a missed ping but is ALSO in the process of committing a config:
2019-01-29 11:51:33.257 +0400 debug: ha_sysd_dev_cfgsync_update(src/ha_sysd.c:1415): Set dev cfgsync to Committing
is it possible a config change related to the HA1-b interface was being pushed? there is always a small config sync gap in between the active member committing and the passive unit receiving and committing
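To compare what each peer saw, the log lines quoted in this thread can be merged into a single timeline. The following is a rough Python sketch, assuming every line starts with a timestamp in the `YYYY-MM-DD HH:MM:SS.mmm +ZZZZ` form shown above. The function names and the peer labels are illustrative, and labelling the first peer as the active unit is an assumption.

```python
import re
from datetime import datetime

# Timestamp, UTC offset, then the rest of the log line.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ([+-]\d{4}) (.*)$"
)


def parse_line(peer, line):
    """Return (timestamp, peer, message), or None for lines without a timestamp."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S.%f %z")
    return ts, peer, m.group(3)


def merge_timeline(logs):
    """Merge {peer: [lines]} into one list of events sorted by timestamp."""
    events = []
    for peer, lines in logs.items():
        for line in lines:
            parsed = parse_line(peer, line)
            if parsed:
                events.append(parsed)
    return sorted(events, key=lambda e: e[0])


# Lines copied from the posts above; the peer labels are assumptions.
logs = {
    "peer A (assumed active)": [
        "2019-01-29 11:51:33.447 +0400 debug: ha_sysd_haX_link_change(src/ha_sysd.c:2221): Seeing HA1-Backup peer link down, waiting hold",
    ],
    "passive": [
        "2019-01-29 11:51:33.229 +0400 Error: ha_ping_peer_miss(src/ha_ping.c:756): Missed 1 ping timeouts out of 3 (ha1-backup)",
        "2019-01-29 11:51:33.257 +0400 debug: ha_sysd_dev_cfgsync_update(src/ha_sysd.c:1415): Set dev cfgsync to Committing",
    ],
}

timeline = merge_timeline(logs)
for ts, peer, message in timeline:
    print(ts.strftime("%H:%M:%S.%f")[:-3], f"[{peer}]", message)
```

With these three lines, the missed ping and the "Committing" state on the passive unit both sort ahead of the link-down message from the other peer, which is the ordering discussed above.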
02-05-2019 06:14 AM
No dude, I have seen config changes, but none related to that interface.
The change was made only after the port issue.
02-05-2019 06:30 AM
something was being committed, so that could have caused the interface to bounce or possibly resources were drained somehow
02-09-2019 11:07 PM
Hi @reaper,
Yes, you are correct: a commit was done at 11:53 AM on the 29th, but it only involved mapping an address object to an address group and then referencing that group as the source address of a policy.
I'm curious how this could affect an HA port configured elsewhere on my firewall.
Regards
Venky
02-10-2019 05:42 AM
Hi @reaper
I have seen one more interesting thing: the HA1-B port was dropping packets, and this happened on the primary firewall.
So my issue is on the active firewall, which dropped the packets, so HA1-B went down. Since the tech support file was generated after the issue was cleared, I'm not able to see the memory usage at the time of the issue.
Interface: ha1-b
-------------------------------------------------------------------------------
Logical interface counters:
-------------------------------------------------------------------------------
bytes received 207647488
bytes transmitted 214917298
packets received 4254056
packets transmitted 4261401
receive errors 0
transmit errors 0
receive packets dropped 10769
transmit packets dropped 0
multicast packets received 0
-----------------------------------------
02-11-2019 02:49 AM
Don't focus too much on these numbers until you can directly correlate them to the actual event. Some packets may get dropped naturally, or they could be from a previous issue (possibly during initial config).
Since the connection was impacted during the commit, you'll need to look at both techsupport files side by side, starting seconds before the commit starts, and see if there are unusual spikes in MP or DP CPU. Those drop counters should be correlated for their delta during the commit (does the number increase gradually over time, or does it spike during the commit?).
The content of the commit may not necessarily be related to the interface itself; it's possible something during the commit chokes the interfaces for some reason.
Do you have a support case open already? If not, this may be a good time to open one.
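As a rough illustration of the delta check described above, here is a small Python sketch that parses two snapshots of the interface counter output and reports what changed between them. The "after" values are the ones posted earlier in this thread; the "before" values are hypothetical placeholders, since no earlier snapshot was shared. The function names are made up for this example.

```python
import re

# A counter line is a name followed by whitespace and an integer value.
COUNTER_RE = re.compile(r"^(.+?)\s+(\d+)$")


def parse_counters(text):
    """Parse 'name   value' lines into a dict, skipping headers and separators."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_RE.match(line.strip())
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def counter_deltas(before, after):
    """Return only the counters whose value changed between the two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# HYPOTHETICAL earlier snapshot, values invented purely for illustration.
before_text = """
packets received 4250000
receive errors 0
receive packets dropped 10700
"""

# Values taken from the ha1-b counters posted in this thread.
after_text = """
Interface: ha1-b
-------------------------------------------------------------------------------
packets received 4254056
receive errors 0
receive packets dropped 10769
"""

deltas = counter_deltas(parse_counters(before_text), parse_counters(after_text))
print(deltas)
```

Taking snapshots at short intervals around a commit would show whether the drop counter climbs steadily or jumps in one step.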
02-13-2019 09:34 PM
Hi @reaper
I have a case open with TAC and they are researching the root cause. I will keep you posted once I get an update.
Thank you so much for all your analysis; it has helped the investigation.
Regards
Venky
06-06-2019 02:50 PM
Any updates on this case? I've got the same issue on PA-3220 with PAN-OS 8.1.8.
I see exactly the same symptom: 'receive packets dropped' increased on the active firewall. I'm going to open a case with TAC.
06-06-2019 02:53 PM
Hi
This is a known bug that is going to be fixed in 8.1.9 or 9.0. You can wait for 8.1.9 or upgrade to 9.0.
06-06-2019 02:59 PM
Thanks for your reply!
Do you have the bug/issue ID? Or is it a non-public one?
06-06-2019 06:57 PM - edited 06-06-2019 06:58 PM
FYI -
Here is a workaround for anyone who wants to bring up the HA1 Backup link before upgrading PAN-OS.
Step 1. On the active firewall, change the Port type from ha1-b to management and Commit (Device > High Availability > General > Control Link (HA1 Backup)).
Step 2. Revert to the previous configuration with Port type ha1-b, along with the IP address, and Commit.
This workaround should bring up the HA1 Backup.
Hope this helps!