Dataplane restarts after plugging in an SFP



L4 Transporter

I have a Palo Alto PA-3220. After plugging in a new SFP, all of the interface ports go down and the dataplane restarts.

Once I unplug the SFP, the dataplane restarts again.

All of the interfaces go down.

HA Logs:

 

2021-01-27 16:28:07.512 +0500 debug: ha_sysd_general_vers_string(src/ha_sysd_version.c:1835): Got new Threat Content: 8369-6522; for peer value
2021-01-27 16:28:07.512 +0500 HA Group 10: Threat Content version now matches
2021-01-27 16:28:07.513 +0500 HA peer Anti-Virus set to Match
2021-01-27 16:28:07.513 +0500 HA peer Application Content set to Match
2021-01-27 16:28:07.513 +0500 HA peer Global Protect Client Software set to Match
2021-01-27 16:28:07.513 +0500 HA Group 10: IOT Content version mismatch due to device update
2021-01-27 16:28:07.513 +0500 HA peer Build Release set to Match
2021-01-27 16:28:07.513 +0500 HA peer Threat Content set to Match
2021-01-27 16:28:07.513 +0500 HA peer URL Database set to Mismatch
2021-01-27 16:28:07.513 +0500 HA peer URL Vendor set to Match
2021-01-27 16:28:07.513 +0500 HA peer VM License Type set to Mismatch
2021-01-27 16:28:07.513 +0500 HA peer VPN Client Software set to Match
2021-01-27 16:28:07.514 +0500 HA peer DLP set to Match
2021-01-27 16:28:07.514 +0500 debug: ha_sysd_general_vers_string(src/ha_sysd_version.c:1835): Got new IOT Content: 16-253; for peer value
2021-01-27 16:28:07.514 +0500 HA Group 10: IOT Content version now matches
2021-01-27 16:30:04.958 +0500 debug: ha_slot_sysd_dp_down_notify_cb(src/ha_slot.c:1061): Got initial dataplane down (slot 1; reason brdagent exiting)
2021-01-27 16:30:04.959 +0500 The dataplane is going down
2021-01-27 16:30:04.959 +0500 debug: ha_rts_dp_ready_update(src/ha_rts.c:1122): RTS slot 1 set to NOT ready
2021-01-27 16:30:04.959 +0500 debug: ha_rts_dp_ready(src/ha_rts.c:790): Update dp ready bitmask for slots ; changed slots 1 for local device
2021-01-27 16:30:04.959 +0500 debug: ha_peer_send_hello(src/ha_peer.c:5483): Group 10 (HA1-MAIN): Sending hello message

Hello Msg
---------
flags : 0x1 (preempt:)
state : Active (5)
priority : 100
cookie : 45077
num tlvs : 2
Printing out 2 tlvs
TLV[1]: type 67 (DP_RTS_READY); len 4; value:
00000000
TLV[2]: type 11 (SYSD_PEER_DOWN); len 4; value:
00000000

2021-01-27 16:30:04.960 +0500 Warning: ha_event_log(src/ha_event.c:59): HA Group 10: Dataplane is down: brdagent exiting
2021-01-27 16:30:04.960 +0500 Going to non-functional for reason Dataplane down: brdagent exiting
2021-01-27 16:30:04.960 +0500 debug: ha_state_transition(src/ha_state.c:1430): Group 10: transition to state Non-Functional
2021-01-27 16:30:04.960 +0500 debug: ha_state_start_monitor_holdup(src/ha_state.c:2735): Skipping monitor holdup for group 10
2021-01-27 16:30:04.960 +0500 debug: ha_state_monitor_holdup_callback(src/ha_state.c:2833): Going to Non-Functional state state
2021-01-27 16:30:04.961 +0500 debug: ha_state_move(src/ha_state.c:1532): Group 10: moving from state Active to Non-Functional
2021-01-27 16:30:04.961 +0500 Warning: ha_event_log(src/ha_event.c:59): HA Group 10: Moved from state Active to state Non-Functional
2021-01-27 16:30:04.961 +0500 debug: ha_sysd_dev_state_update(src/ha_sysd.c:1549): Set dev state to Non-Functional
2021-01-27 16:30:04.961 +0500 debug: ha_state_move_action(src/ha_state.c:1335): No state transition script available on current platform
2021-01-27 16:30:04.961 +0500 debug: ha_sysd_dev_alarm_update(src/ha_sysd.c:1515): Set dev alarm to on

 

 

masterd.log:

 

2021-01-27 16:36:18.792 +0500 INFO: all: running
2021-01-27 17:06:05.280 +0500 INFO: brdagent: exited, Core: True, Exit signal: SIGBUS
2021-01-27 17:08:18.191 +0500 INFO: brdagent: saved core file brdagent_10.0.2_1.core
2021-01-27 17:08:20.130 +0500 CRITICAL: brdagent: Exited 1 times, must be manually recovered.
2021-01-27 17:08:20.144 +0500 INFO: platform: group exiting because child brdagent exited
2021-01-27 17:08:20.318 +0500 INFO: sysdagent: exited, Core: False, Exit code: 0
2021-01-27 17:08:20.476 +0500 INFO: ehmon: exited, Core: False, Exit code: 0
2021-01-27 17:08:20.663 +0500 INFO: platform: exited
2021-01-27 17:08:20.790 +0500 CRITICAL: platform: Exited 1 times, must be manually recovered.
2021-01-27 17:08:20.809 +0500 INFO: all: group exiting because child platform exited
2021-01-27 17:08:20.854 +0500 INFO: gdb: exited
2021-01-27 17:08:21.453 +0500 INFO: sdwand: exited, Core: False, Exit code: 0
2021-01-27 17:08:21.691 +0500 INFO: bfd: exited, Core: False, Exit code: 0
2021-01-27 17:08:21.840 +0500 INFO: dha: exited, Core: False, Exit code: 0
2021-01-27 17:08:22.022 +0500 INFO: mprelay: exited, Core: False, Exit code: 0
2021-01-27 17:08:22.168 +0500 INFO: tund: exited, Core: False, Exit code: 0
2021-01-27 17:08:22.543 +0500 INFO: pktlog_forwarding: exited, Core: False, Exit code: 0
2021-01-27 17:08:22.701 +0500 INFO: all_pktproc_3: exited, Core: False, Exit code: 0
2021-01-27 17:08:22.883 +0500 INFO: all_pktproc_5: exited, Core: False, Exit code: 0
2021-01-27 17:08:23.043 +0500 INFO: all_pktproc_4: exited, Core: False, Exit code: 0
2021-01-27 17:08:23.198 +0500 INFO: all_pktproc_7: exited, Core: False, Exit code: 0
2021-01-27 17:08:23.350 +0500 INFO: wifclient: exited, Core: False, Exit signal: SIGTERM
2021-01-27 17:08:23.533 +0500 INFO: monitor: exited, Core: False, Exit signal: SIGTERM
2021-01-27 17:08:23.687 +0500 INFO: all_pktproc_6: exited, Core: False, Exit code: 0
2021-01-27 17:08:23.880 +0500 INFO: all_pktproc_8: exited, Core: False, Exit code: 0
2021-01-27 17:08:24.028 +0500 INFO: flow_mgmt: exited, Core: False, Exit code: 0
2021-01-27 17:08:24.173 +0500 INFO: flow_ctrl: exited, Core: False, Exit code: 0
2021-01-27 17:08:24.392 +0500 INFO: tasks: exited
2021-01-27 17:08:24.435 +0500 INFO: comm: exited, Core: False, Exit code: 0
2021-01-27 17:08:25.315 +0500 INFO: dssd: exited, Core: False, Exit code: 0
2021-01-27 17:08:25.476 +0500 INFO: supervisor: exited
2021-01-27 17:08:25.506 +0500 INFO: all: exited

 


L5 Sessionator

Hi all,

 

After the restart, please follow this doc to check your SFP configuration: https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClaMCAS

Test them one by one.

If they are not recognized, open a tech support case, then RMA.

 

Rgds

 

V.

Dear @VinceM

 

Thanks for your reply.

 

When we plug an SFP into ethernet1/19 or ethernet1/20, all of the other interfaces go down and the dataplane restarts. After unplugging the same SFP, the dataplane goes down again.

We can also see this in the system log:

HA Group 10: Dataplane is down: brdagent exiting
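The repeated trigger in these logs is the brdagent (board agent) process crashing. As a sketch, a few PAN-OS CLI commands can help confirm this from the management plane (log file name taken from the masterd output above):

> show system software status
> less mp-log masterd.log
> tail lines 50 mp-log masterd.log

show system software status lists the running management and dataplane processes, and masterd.log is where the brdagent exit and core-file messages above were recorded.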

 

27/01/2021 17:13  general          critical  Chassis Master Alarm: Cleared
27/01/2021 17:12  general          critical  WildFire update job failed for user Auto update agent
27/01/2021 17:08  ha2-keep-alive   critical  HA Group 10: All HA2 keep-alives are down
27/01/2021 17:08  ha2-keep-alive   critical  HA Group 10: Peer HA2 keep-alive down
27/01/2021 17:08  general          critical  The dataplane is restarting.
27/01/2021 17:08  general          critical  all: Exited 1 times, must be manually recovered.
27/01/2021 17:08  general          critical  platform: Exited 1 times, must be manually recovered.
27/01/2021 17:08  general          critical  brdagent: Exited 1 times, must be manually recovered.
27/01/2021 17:06  ha2-link-change  critical  All HA2 links down
27/01/2021 17:06  ha2-link-change  critical  HA2 link down
27/01/2021 17:06  general          critical  Chassis Master Alarm: HA-event
27/01/2021 17:06  state-change     critical  HA Group 10: Moved from state Active to state Non-Functional
27/01/2021 17:06  dataplane-down   critical  HA Group 10: Dataplane is down: brdagent exiting
27/01/2021 16:37  general          critical  Chassis Master Alarm: Cleared
27/01/2021 16:33  general          critical  WildFire update job failed for user Auto update agent
27/01/2021 16:32  general          critical  WildFire update job failed for user Auto update agent
27/01/2021 16:32  ha2-keep-alive   critical  HA Group 10: All HA2 keep-alives are down
27/01/2021 16:32  ha2-keep-alive   critical  HA Group 10: Peer HA2 keep-alive down
27/01/2021 16:32  general          critical  The dataplane is restarting.
27/01/2021 16:32  general          critical  all: Exited 1 times, must be manually recovered.
27/01/2021 16:32  general          critical  platform: Exited 1 times, must be manually recovered.
27/01/2021 16:32  general          critical  brdagent: Exited 1 times, must be manually recovered.
27/01/2021 16:30  ha2-link-change  critical  All HA2 links down
27/01/2021 16:30  general          critical  Chassis Master Alarm: HA-event
27/01/2021 16:30  ha2-link-change  critical  HA2 link down
27/01/2021 16:30  state-change     critical  HA Group 10: Moved from state Active to state Non-Functional
27/01/2021 16:30  dataplane-down   critical  HA Group 10: Dataplane is down: brdagent exiting


Hi,

 

What is your PAN-OS version?

Are the SFPs from Palo Alto, or are they third-party "compatibles"?

rgds

 

v.

@VinceM 

PAN-OS 10.0.2,

and yes, both SFPs are third-party compatibles.

SFP modules: Finisar FTLF8519P3BNL and Avago AFBR-709SMZ

 

 

Hi,

We run a PA-850 in production with SFPs. We were on the same version as you a couple of weeks ago and everything works well.

However, we had the same issue as you in the past with Arista fiber SFPs.

Can you try another SFP?

Can you try a Palo Alto one?

Over the years, Palo Alto has become stricter and stricter with third-party "compatible" optics.

 

V.

Cyber Elite

@Joshan_Lakhani,

Relatively easy to troubleshoot: take one of your known-working SFPs and try it in ethernet1/19 or ethernet1/20. If the known-working SFP works in those ports without triggering a dataplane reload, you have an SFP compatibility issue; if it also causes the dataplane to crash, you have a hardware issue and will need to process an RMA.

I've never had anyone use an Avago in a PA product, so I couldn't tell you whether that actually works at all. The Finisar you should be perfectly fine with.
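As a sketch of that test, the thread's own show system state command can confirm whether the swapped-in optic is detected at all (the p19 index here is an assumption for ethernet1/19; adjust it to the port under test):

> show system state filter sys.s1.p19.phy
> show interface ethernet1/19

If the phy entry shows no vendor or module details even for the known-working SFP, that points at the port hardware rather than the optic, supporting the RMA path.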

L0 Member

I had the same issue with a PA-3220, using both Finisar and Cisco (Avago) modules. I upgraded to PAN-OS 10.0.3 and it didn't work; I then upgraded to PAN-OS 10.0.4, and both SFP modules were able to come up and the dataplane is no longer restarting.

Note that previously the SFPs were not even blinking or coming up; after the upgrade they came up. When I ran the command

> show system state filter sys.s1.p13.phy

I could see all the details captured, but it still did not show the module as up.
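If the raw filtered output is hard to read, PAN-OS also accepts filter-pretty, which prints the same state subtree in indented form (same filter string as above):

> show system state filter-pretty sys.s1.p13.phy

The media and vendor fields in this output should populate once the module is recognized, even if the link itself is not yet up.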

 
