HA queue full

L4 Transporter

Hi, I'm receiving this SNMP trap on my Palo Alto firewall (PA-3020, PAN-OS 6.0.3). Checking the system logs, I see the log message "HA-queue-full" every 15 minutes. Why is this happening?

HA.jpg

13 REPLIES

L7 Applicator

Hello COS,

This message usually indicates that the HA buffer is full due to a communication breakdown on the HA1 link. However, I have previously seen exactly the same messages caused by a User-ID group-mapping problem in an HA environment.

So, could you please confirm whether the PAN firewall is configured with group mapping? If so, please verify that it uses a valid filter that limits the groups in your include list.

The PAN firewall should be configured with a valid include list.


Thanks

Yes, we have group mapping, but nobody in ITSystems has changed anything in AD. How could I adjust the buffer or do something else to fix it?

cap2.jpg

cap3.jpg

Hello COS,

Could you please run "show system resources" on this PAN firewall, just to ensure that resources are available on the MP?

Thanks

capp11.jpg

I don't see anything strange. I can see useridd at the end, and CPU usage is not high.

L4 Transporter

Any new ideas?

Could you please check the useridd.log in mp-log (less mp-log useridd.log) for the time when you received the "HA queue is full" messages?

The alarm was reported by SNMP at 10:12:23. dc3 is not active, but I don't think this is connected with the problem.

This is the useridd.log

2014-08-14 10:11:38.531 +0200 Error:  pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:43.610 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:49.792 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:11:55.867 +0200 Error:  pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:00.956 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:06.825 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:16.311 +0200 Error:  pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:21.405 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:23.783 +0200 Error:  pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left

2014-08-14 10:12:27.400 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:33.392 +0200 Error:  pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.

2014-08-14 10:12:38.473 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()]

Thanks

2014-08-14 10:12:23.783 +0200 Error:  pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left

This is the problem.
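For anyone who wants to check a longer useridd.log for the same pattern, here is a rough Python sketch, not an official tool. It assumes the log has been exported to text with wrapped lines rejoined, and it reads "(0/50000) left" as "0 of 50000 queue slots remaining", which is my interpretation of the message. The names summarize, HA_QUEUE and WMI_FAIL are made up for this example, and the sample lines are copied from the log posted above.

```python
import re
from collections import Counter

# Sample lines copied from the useridd.log excerpt in this thread
# (wrapped lines rejoined by hand).
SAMPLE = """\
2014-08-14 10:12:21.405 +0200 Error:  pan_user_id_win_log_query(pan_user_id_win.c:1323): log query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.
2014-08-14 10:12:23.783 +0200 Error:  pan_user_id_ha_pre_send(pan_user_id_ha.c:549): HA queue (0/50000) left
2014-08-14 10:12:33.392 +0200 Error:  pan_user_id_win_sess_query(pan_user_id_win.c:1471): session query for dc3.cuatrocruces.com failed: [wmi/wmic.c:200:main()] ERROR: Login to remote object.
"""

# Assumed line shapes, based only on the excerpt in this thread.
HA_QUEUE = re.compile(r"^(\S+ \S+) .*HA queue \((\d+)/(\d+)\) left")
WMI_FAIL = re.compile(r"(?:log|session) query for (\S+) failed")


def summarize(text):
    """Return (ha_events, wmi_failures) found in useridd.log text."""
    ha_events = []            # (timestamp, slots_left, queue_size)
    wmi_failures = Counter()  # host -> number of failed queries
    for raw in text.splitlines():
        line = raw.strip()
        m = HA_QUEUE.search(line)
        if m:
            ha_events.append((m.group(1), int(m.group(2)), int(m.group(3))))
            continue
        m = WMI_FAIL.search(line)
        if m:
            wmi_failures[m.group(1)] += 1
    return ha_events, wmi_failures


ha_events, wmi_failures = summarize(SAMPLE)
for ts, left, size in ha_events:
    print(f"{ts}: HA queue has {left} of {size} slots left")
for host, count in wmi_failures.items():
    print(f"{host}: {count} failed WMI queries")
```

Running it over a full day of logs should show whether the queue-full entries line up with the 15-minute SNMP traps, and which hosts are producing the failed WMI queries around them.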

Is there any core file on this firewall..?

Thanks

Is this a HA Active/Active setup...?

Thanks

No, it's Active/Passive. These are the core files, but I don't know exactly when the problem started.

admin@FW1(active)> show system files

/opt/dpfs/var/cores/:

total 4.0K

drwxrwxrwx 2 root root 4.0K Aug  6 10:11 crashinfo

/opt/dpfs/var/cores/crashinfo:

total 0

/var/cores/:

total 11M

drwxrwxrwx 2 root root 4.0K Aug  6 23:55 crashinfo

-rw-r--r-- 1 root root  11M Aug  7 00:00 useridd_6.0.3_0.tar.gz

/var/cores/crashinfo:

total 60K

-rw-rw-rw- 1 root root 54K Aug  6 23:55 useridd_6.0.3_0.info
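If it helps to see at a glance which daemon left core files and when, the listing above can be reduced with a small Python sketch. This is only an illustration: it assumes the output keeps the ls-style layout shown here and that the daemon name is the part of the file name before the first underscore. The names core_files and ENTRY are hypothetical, and the sample text is copied from the output above with the blank lines removed.

```python
import re

# Listing copied from the "show system files" output in this thread.
LISTING = """\
/var/cores/:
total 11M
drwxrwxrwx 2 root root 4.0K Aug  6 23:55 crashinfo
-rw-r--r-- 1 root root  11M Aug  7 00:00 useridd_6.0.3_0.tar.gz
/var/cores/crashinfo:
total 60K
-rw-rw-rw- 1 root root 54K Aug  6 23:55 useridd_6.0.3_0.info
"""

# Regular files only: mode, link count, owner, group, size, timestamp, name.
ENTRY = re.compile(
    r"^-\S+\s+\d+\s+\S+\s+\S+\s+(\S+)\s+(\w{3}\s+\d+\s+\d{2}:\d{2})\s+(\S+)$"
)


def core_files(text):
    """Return a list of (directory, daemon, file name, size, timestamp)."""
    files = []
    directory = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/") and line.endswith(":"):
            directory = line[:-1]  # remember the directory header
            continue
        m = ENTRY.match(line)
        if m:
            size, stamp, name = m.groups()
            daemon = name.split("_")[0]  # assumption: name starts with daemon
            files.append((directory, daemon, name, size, " ".join(stamp.split())))
    return files


for entry in core_files(LISTING):
    print(entry)
```

Note that the listing carries no year, so the timestamps can only be matched against the system log by month, day and time.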

Since this issue is related to the User-ID daemon, I would ask you to open a case with PAN support. We need to take a look at the User-ID core file.

Thanks

L4 Transporter

I have checked the Monitor system log for the core-file date (Aug  6 23:55 crashinfo) and found this log entry:

"useridd:exiting because missed too many heartbeats". I think this was the origin of the error. What can I do?

error.jpg

Apparently we aren't noticing anything else wrong, just SNMP traps about the problem every 15 minutes.
