HA queue full

SOC_CSG · ‎08-10-2014

Hi, I'm receiving this SNMP trap on my Palo Alto firewall (PA-3020, PAN-OS 6.0.3). Checking the system logs, I see the log message "HA-queue-full" every 15 minutes. Why is this happening?

HULK · ‎08-10-2014

Hello COS,

This message usually indicates that the HA buffer is full due to a communication breakdown on the HA1 link. However, I have previously seen exactly the same messages caused by a User-ID group mapping problem in an HA environment.

So, could you please confirm whether the PAN firewall is configured with group mapping? If so, please verify that the group mapping has a valid include list that limits it to the groups you need.

The PAN firewall should be configured with a valid include list.

Thanks

SOC_CSG · ‎08-11-2014

Yes, we have group mapping, but nobody in IT Systems has changed anything in AD. Is there something I can do to the buffer, or anything else, to fix it?

HULK · ‎08-11-2014

Hello COS,

Could you please run "show system resources" on this PAN firewall, just to make sure resources are available on the management plane (MP)?

Thanks

SOC_CSG · ‎08-11-2014

I don't see anything strange. I can see useridd at the end, and the CPU usage is not very high.

SOC_CSG · ‎08-14-2014

Any new ideas?

HULK · ‎08-14-2014

Could you please check useridd.log on the management plane (less mp-log useridd.log) around the time you received the "HA queue is full" messages?

SOC_CSG · ‎08-14-2014

The alarm was reported by SNMP at 10:12:23. dc3 is not active, but I don't think this is connected to the problem.

This is the useridd.log

2014-08-14 10:11:38.531 +0200 Error: pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:43.610 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:49.792 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:55.867 +0200 Error: pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:00.956 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:06.825 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:16.311 +0200 Error: pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:21.405 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:23.783 +0200 Error: pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left

2014-08-14 10:12:27.400 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:33.392 +0200 Error: pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:38.473 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()]

Thanks
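For anyone who wants to summarize a log like the one above offline, here is a minimal Python sketch that counts failed session and log queries per server. It assumes the log has been exported to text and that failure lines contain the phrase "query for <host> failed", as they do in this paste; the function name `count_failed_queries` is made up for illustration and is not part of PAN-OS.

```python
import re
from collections import Counter

# Matches "session query for <host> failed" and "log query for <host> failed",
# the wording seen in the useridd.log paste in this thread.
QUERY_FAILED = re.compile(r"(session|log) query for (\S+) failed")

def count_failed_queries(lines):
    """Return a Counter keyed by (query_type, host) for failed queries."""
    counts = Counter()
    for line in lines:
        match = QUERY_FAILED.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# Two lines copied from the log in this thread.
sample = [
    "2014-08-14 10:11:38.531 +0200 Error: pan_user_id_win_sess_query(pan_user_id_win.c:1471): "
    "session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.",
    "2014-08-14 10:11:43.610 +0200 Error: pan_user_id_win_log_query(pan_user_id_win.c:1323): "
    "log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.",
]

for (query_type, host), n in count_failed_queries(sample).items():
    print(f"{host}: {n} failed {query_type} query(ies)")
```

To run it against a saved file, pass an open file handle instead of `sample`. A count like this only shows how often each server fails; it does not by itself prove a link to the HA queue message.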

HULK · ‎08-14-2014

2014-08-14 10:12:23.783 +0200 Error: pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left

This is the problem.

Is there any core file on this firewall..?

Thanks
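As a rough way to find every occurrence of that HA queue line in a saved copy of useridd.log, the following Python sketch extracts the two numbers from the message. Reading them as "slots left" and "queue capacity" is an assumption based only on the wording of the message, and `ha_queue_left` is a hypothetical helper name, not a PAN-OS tool.

```python
import re

# Matches the message format seen in this thread: "HA queue (0/50000) left".
HA_QUEUE = re.compile(r"HA queue \((\d+)/(\d+)\) left")

def ha_queue_left(line):
    """Return (left, capacity) as integers, or None if the line does not match.

    Interpreting the first number as remaining slots is an assumption.
    """
    match = HA_QUEUE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

line = ("2014-08-14 10:12:23.783 +0200 Error: "
        "pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left")
print(ha_queue_left(line))
```

Collecting the timestamps of the matching lines would make it easy to compare them against the 15-minute SNMP trap interval reported earlier in the thread.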

HULK · ‎08-14-2014

Is this a HA Active/Active setup...?

Thanks

SOC_CSG · ‎08-14-2014

No, it's Active/Passive. These are the core files, but I don't know exactly when the problem started.

admin@FW1(active)> show system files

/opt/dpfs/var/cores/:

total 4.0K

drwxrwxrwx 2 root root 4.0K Aug 6 10:11 crashinfo

/opt/dpfs/var/cores/crashinfo:

total 0

/var/cores/:

total 11M

drwxrwxrwx 2 root root 4.0K Aug 6 23:55 crashinfo

-rw-r--r-- 1 root root 11M Aug 7 00:00 useridd_6.0.3_0.tar.gz

/var/cores/crashinfo:

total 60K

-rw-rw-rw- 1 root root 54K Aug 6 23:55 useridd_6.0.3_0.info

HULK · ‎08-14-2014

Since this issue is related to the User-ID daemon, I would ask you to open a case with PAN support. We need to take a look at the User-ID core file.

Thanks

SOC_CSG · ‎08-14-2014

I checked the system log under Monitor around the core file date (Aug 6 23:55 crashinfo) and found this entry:

useridd:exiting because missed too many heartbeats

I think this was the origin of the error. What can I do?

SOC_CSG · ‎08-14-2014

Apparently we aren't noticing anything unusual, just SNMP traps about the problem every 15 minutes.
