FQDN refresh problems

cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Announcements
Please sign in to see details of an important advisory in our Customer Advisories area.

FQDN refresh problems

L1 Bithead

Hell guys,

We have a problem that the FQDN refresh fails nearly everytime. What I mean with "nearly" everytime is, that there are periods in which the FQDN refresh is running smoothly, and then suddenly it fails again.

Example: A few days ago the FQDN refresh failed for the 365th time. We then disabled the DNS Proxy because there was something in the logs regarding DNS proxy and then it worked again. Two days later, out of nothing, the FQDN refreshes started failing again. Today early morning at 00:37 the FQDN refresh worked again, and later at 10:12 it started to fail again. No changes in between. (except dynamic updates from PAN)

Below you will find the MP logs (manual FQDN refresh). It can resolve all FQDN, but then somehow fails.

Does anyone have the same problems? Any tips?

 

Thanks in advance and best regards

 

2018-10-29 09:15:08.416 +0100 debug: pan_dnscfg_recv_resp(pan_cfg_dnscfg.c:550): dnscfgmod: response received 12, to resolve 12
2018-10-29 09:15:08.416 +0100 debug: pan_dnscfg_recv_resp(pan_cfg_dnscfg.c:559): dnscfgmod: All Fqdns responses received
2018-10-29 09:15:08.416 +0100 debug: pan_dnscfg_resolve_now(pan_cfg_dnscfg.c:3417): dnscfgmod: Done timedwait
2018-10-29 09:15:08.416 +0100 dnscfgmod: Resolving fqdns took 1 secs
2018-10-29 09:15:08.416 +0100 Fqdn refresher thread device requested last config
2018-10-29 09:15:08.429 +0100 debug: pan_cfg_populate_candidate_ids(pan_cfg_commit_jobs.c:338): id population started
2018-10-29 09:15:11.904 +0100 debug: pan_comm_lcs_get_next_addr(cs_conn.c:4770): connmgr:  >>> pan_comm_lcs_get_next_addr()
2018-10-29 09:15:12.024 +0100 debug: pan_lcsa_tcp_connect_pref_list(src_panos/lcs_agent.c:1488): pan_fd_watcher_wait, time elapsed = 30000
2018-10-29 09:15:12.024 +0100 debug: pan_lcsa_tcp_connect_pref_list(src_panos/lcs_agent.c:1497): timedout from watcher
2018-10-29 09:15:12.024 +0100 debug: pan_lcsa_tcp_connect_pref_list(src_panos/lcs_agent.c:1458): get MIN earliest timeout from conn & pref list = 30000
2018-10-29 09:15:12.179 +0100 Populating id readonly: size: 16777216 nofree_chunk, allocated 3 chunks, chunk size 16777216
alloc usage 46829131 max. usage 46829131
2018-10-29 09:15:12.189 +0100 shm alloc(read-only) 'pan_shm_base' size 104172048
2018-10-29 09:15:12.189 +0100 debug: pan_config_populate_vrouter(pan_config_handler.c:342): populate vrouter name = default found with id 1
2018-10-29 09:15:12.196 +0100 Warning:  pan_config_populate_global_app(pan_config_handler.c:631): application active-directory has the same id 4 as active-directory-base.
2018-10-29 09:15:12.197 +0100 Error:  pan_shmgr_add_mmap(pan_shmgr.c:1467): readonly shmgr, can't create id
2018-10-29 09:15:12.197 +0100 Error:  __pan_shmgr_preprocess_obj_type(pan_shmgr.c:1630): func(type 9) failed: Error: Error inserting shared-application id for 'active-directory' (4)

2018-10-29 09:15:12.197 +0100 Error:  __pan_shmgr_preprocess_obj_type(pan_shmgr.c:1673): __pan_shmgr_preprocess_obj_type(entry:i) failed
2018-10-29 09:15:12.197 +0100 Error:  __pan_shmgr_preprocess_obj_type(pan_shmgr.c:1673): __pan_shmgr_preprocess_obj_type(application) failed
2018-10-29 09:15:12.197 +0100 Error:  pan_config_populate_id_readonly(pan_config_handler.c:794): pan_config_preprocess_obj() failed
2018-10-29 09:15:12.221 +0100 Error:  pan_cfg_populate_candidate_ids(pan_cfg_commit_jobs.c:342): Unable to populate ids into the config:
Error: Error inserting shared-application id for 'active-directory' (4)

2018-10-29 09:15:12.224 +0100 Error:  pan_cfg_get_lastcfg_fqdnrefresh(pan_cfg_commit_jobs.c:753): Unable to populate ids into candidate config:
Error: Error inserting shared-application id for 'active-directory' (4)

2018-10-29 09:15:12.224 +0100 Error:  pan_cfg_dnscfg_refresh_fqdns(pan_cfg_dnscfg.c:4416): dnscfgmod: Failed to get valid last config for fqdn refresh
2018-10-29 09:15:12.225 +0100 debug: _pan_cfg_run_query_match(pan_log_handler.c:4776): query match using fsm at index: 0
2018-10-29 09:15:12.225 +0100 debug: pan_logfwd_syslog_handler(pan_logforward.c:1181): Running pan_logfwd_syslog_handler() ...
2018-10-29 09:15:12.225 +0100 debug: pan_logfwd_syslog_handler(pan_logforward.c:1209): subtype: 0
2018-10-29 09:15:12.225 +0100 debug: pan_logfwd_syslog(pan_syslog.c:866): Running pan_logfwd_syslog() ...
2018-10-29 09:15:12.225 +0100 debug: pan_logfwd_email_handler(pan_logforward.c:1669): Running pan_logfwd_email_handler() ...
2018-10-29 09:15:12.225 +0100 debug: pan_logfwd_email_handler(pan_logforward.c:1681): subtype: 0
2018-10-29 09:15:12.225 +0100 debug: pan_get_vsys_dns_from_sysd_obj(pan_dnsproxyd_sysd_api.c:778): DNS_API - sysd 0x1088a700 - vsysname 0x1932182c
2018-10-29 09:15:12.226 +0100 Error:  pan_jobmgr_process_job(pan_job_mgr.c:3274): Fqdn Refresh job failed
2018-10-29 09:15:12.226 +0100 debug: _pan_cfg_run_query_match(pan_log_handler.c:4776): query match using fsm at index: 0
2018-10-29 09:15:12.226 +0100 debug: pan_logfwd_syslog_handler(pan_logforward.c:1181): Running pan_logfwd_syslog_handler() ...
2018-10-29 09:15:12.226 +0100 debug: pan_logfwd_syslog_handler(pan_logforward.c:1209): subtype: 0
2018-10-29 09:15:12.226 +0100 debug: pan_logfwd_syslog(pan_syslog.c:866): Running pan_logfwd_syslog() ...
2018-10-29 09:15:12.462 +0100 debug: pan_logfwd_email_handler(pan_logforward.c:1669): Running pan_logfwd_email_handler() ...
2018-10-29 09:15:12.462 +0100 debug: pan_logfwd_email_handler(pan_logforward.c:1681): subtype: 0
2018-10-29 09:15:12.462 +0100 debug: pan_get_vsys_dns_from_sysd_obj(pan_dnsproxyd_sysd_api.c:778): DNS_API - sysd 0x1088a700 - vsysname 0x110d112c
2018-10-29 09:15:15.463 +0100 debug: pan_cfg_handle_op(pan_cfg_op_handler.c:1621): operational cmd target:mgmt complete?:no handler:show_session_timeout_handler
2018-10-29 09:15:15.463 +0100 debug: pan_cfg_execute_mgmtop(pan_ops_common.c:38401): OPCMD show_session_timeout_handler
2018-10-29 09:15:15.629 +0100 debug: _pan_cfg_exec_ctxt_init(pan_cfg_engine.c:2177): effective xpath is:/config/devices/entry[@name='localhost.localdomain']/deviceconfig/system/update-schedule/statistics-service/threat-prevention-information

2 REPLIES 2

Cyber Elite
Cyber Elite

@ErrantOsi,

The refresh does a bit of work in the background, since it actually has to input the IPs that resolved to the FQDN objects into the functioning rulebase of the device. Nothing really looks alarming in your output until you get to the following line:

2018-10-29 09:15:12.196 +0100 Warning:  pan_config_populate_global_app(pan_config_handler.c:631): application active-directory has the same id 4 as active-directory-base.

Since that's failing the system can't actually build the functional config of the device, as further evident in the below error.

2018-10-29 09:15:12.221 +0100 Error:  pan_cfg_populate_candidate_ids(pan_cfg_commit_jobs.c:342): Unable to populate ids into the config:
Error: Error inserting shared-application id for 'active-directory' (4)

Since it can't build a functional configuration it can't very well populate the fqdn resolved IPs into said cconfiguration. 

2018-10-29 09:15:12.224 +0100 Error:  pan_cfg_dnscfg_refresh_fqdns(pan_cfg_dnscfg.c:4416): dnscfgmod: Failed to get valid last config for fqdn refresh

 

For something like this I would highly recommend reaching out to TAC and opening a case with them, so they can see exactly what's going on. The bigger question for me would actually be why it's working at all; as if you have an issue with the application base you wouldn't expect that to 'fix' itself. 

Thanks for your answer.

I searched for the active-directory application and saw that we have one rule with it.

For testing I deleted the application in the rule and then the FQDN resolve started to work. I thought "ok, weird that this solved it". For testing again, I added the active-directory application to the exact same rule and to my surprise, the FQDN refresh is still working.... So in the end it solved the problem for now but at the same time it was not the trigger. I will monitor it if the refresh will start to fail again.

  • 2695 Views
  • 2 replies
  • 0 Likes
Like what you see?

Show your appreciation!

Click Like if a post is helpful to you or if you just want to show your support.

Click Accept as Solution to acknowledge that the answer to your question has been provided.

The button appears next to the replies on topics you’ve started. The member who gave the solution and all future visitors to this topic will appreciate it!

These simple actions take just seconds of your time, but go a long way in showing appreciation for community members and the LIVEcommunity as a whole!

The LIVEcommunity thanks you for your participation!