Azure PA HA DNS

L1 Bithead

Hello community 🙂

I'm running a v9.1.x Active/Passive cluster on Azure, and we have had several problems with the "quick" failover.

Because I need the firewalls to resolve internal FQDN objects, I configured them with private DNS servers running on Azure VMs. While digging into the failover issue, I observed that the new active firewall was not able to contact the Azure cloud to request the floating IPs.

"2021-12-20 17:53:01.916 +0200 vm_ha_state_trans INFO: : Getting Azure token failed with exception <urlopen error [Errno -3] Temporary failure in name resolution>
2021-12-20 17:53:01.917 +0200 vm_ha_state_trans INFO: : Failed to get Azure Access Token
2021-12-20 17:53:04.764 +0200 vm_ha_state_trans INFO: : vm_mode: 6
2021-12-20 17:53:04.926 +0200 vm_ha_state_trans INFO: : Platform Identified as AZR
2021-12-20 17:53:05.083 +0200 vm_ha_state_trans INFO: : AZR cloud_setting called
2021-12-20 17:53:05.318 +0200 vm_ha_state_trans INFO: : AZR vm_ha_trans called"

The new active firewall was sending its DNS requests to the internal DNS servers, and those servers couldn't reach the internet because the floating IP hadn't been moved to the new firewall yet. A classic chicken-and-egg problem 🙂

To solve it, I configured a public DNS server as the primary on the firewall and kept the private one as the secondary DNS server. Failover started working, but I discovered that my internal FQDN object was no longer resolved.

It seems that the PA firewall sends DNS queries to the second defined server only if the primary is "unreachable". I was expecting different behavior: the query being sent to the secondary server even when the first one returns "no record found". I have swapped the priority of the DNS servers and will test the failover again this Friday.
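
To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is only a simulation of what I observed versus what I expected, not how PAN-OS is actually implemented. The function names, hostnames, and IP addresses are all made-up placeholders.

```python
from typing import Callable, Dict, List, Optional


class Unreachable(Exception):
    """Hypothetical signal: the DNS server did not answer at all."""


# A simulated server is a function: name -> IP, or None for "no record found".
Server = Callable[[str], Optional[str]]


def make_server(records: Dict[str, str], reachable: bool = True) -> Server:
    """Build a fake DNS server from a dict of records (illustration only)."""
    def query(name: str) -> Optional[str]:
        if not reachable:
            raise Unreachable(name)
        return records.get(name)
    return query


def resolve_failover_only(name: str, servers: List[Server]) -> Optional[str]:
    """Observed behavior: try the next server only if this one is unreachable."""
    for query in servers:
        try:
            # Any answer, including "no record found", is treated as final.
            return query(name)
        except Unreachable:
            continue
    return None


def resolve_try_all(name: str, servers: List[Server]) -> Optional[str]:
    """Expected behavior: also try the next server on "no record found"."""
    for query in servers:
        try:
            answer = query(name)
        except Unreachable:
            continue
        if answer is not None:
            return answer
    return None


# Placeholder records: a public resolver that knows an external name,
# and a private resolver that knows an internal FQDN.
public_dns = make_server({"cloud-api.example": "203.0.113.10"})
private_dns = make_server({"app01.corp.example": "10.0.1.20"})

if __name__ == "__main__":
    order = [public_dns, private_dns]  # public primary, private secondary
    print(resolve_failover_only("app01.corp.example", order))
    print(resolve_try_all("app01.corp.example", order))
```

With the public server as primary, `resolve_failover_only` never asks the private server about the internal name, because the public server is reachable and its "no record found" answer ends the lookup. That matches what I saw after the change.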

Do you find this DNS behavior normal, and how do you guys/girls have it configured in your clusters?

 
