Azure PA HA DNS

L1 Bithead

Hello community 🙂

I'm running a v9.1.x Active/Passive cluster on Azure and we have had several problems with the "quick" failover.

Because I need the firewall(s) to perform DNS resolution on internal FQDN objects, I had them configured with private DNS servers running on Azure VMs. While digging into the failover issue, I observed that the new active firewall is not able to contact the Azure cloud to request the floating IPs.

"2021-12-20 17:53:01.916 +0200 vm_ha_state_trans INFO: : Getting Azure token failed with exception <urlopen error [Errno -3] Temporary failure in name resolution>
2021-12-20 17:53:01.917 +0200 vm_ha_state_trans INFO: : Failed to get Azure Access Token
2021-12-20 17:53:04.764 +0200 vm_ha_state_trans INFO: : vm_mode: 6
2021-12-20 17:53:04.926 +0200 vm_ha_state_trans INFO: : Platform Identified as AZR
2021-12-20 17:53:05.083 +0200 vm_ha_state_trans INFO: : AZR cloud_setting called
2021-12-20 17:53:05.318 +0200 vm_ha_state_trans INFO: : AZR vm_ha_trans called"

The new active firewall was sending DNS requests to the internal DNS servers, and those servers couldn't reach the internet because the floating IP hadn't been moved to the new firewall yet. A chicken-and-egg problem 🙂
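The name-resolution failure in the log can be reproduced outside the firewall with a small preflight check. This is only a sketch: the two hostnames are assumptions (the thread does not say which endpoints the HA plugin contacts), and `check_resolution` is a hypothetical helper meant to run on a host that uses the same DNS servers and routing as the firewall's management interface.

```python
import socket

# Assumed endpoints for illustration only; the thread does not name
# the hosts the HA plugin actually resolves.
AZURE_ENDPOINTS = ["login.microsoftonline.com", "management.azure.com"]


def check_resolution(hostnames, resolver=socket.getaddrinfo):
    """Return {hostname: bool} telling whether each name resolves.

    `resolver` is injectable so the check can be exercised without a network.
    """
    results = {}
    for host in hostnames:
        try:
            resolver(host, 443)
            results[host] = True
        except socket.gaierror:
            # Same error family as "[Errno -3] Temporary failure in
            # name resolution" seen in the vm_ha_state_trans log.
            results[host] = False
    return results
```

Calling `check_resolution(AZURE_ENDPOINTS)` during a failover test would show whether the DNS path survives while the floating IP is still attached to the old active firewall.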

To solve it, I configured a public DNS server as primary on the firewall and kept the private one as secondary. Failover started working, but I discovered that my internal FQDN object is no longer resolved.

It seems that the PA firewall sends DNS queries to the second defined server only if the primary is "unreachable". I was expecting different behavior: that the query would also be sent to the secondary server when the first returned "no record found". I have swapped the priority of the DNS servers and will test the failover again this Friday.
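The difference between the two behaviors can be modeled in a few lines. This is a toy simulation, not PAN-OS code: each server is a plain dict (or `None` when unreachable), and the `fall_through_on_nxdomain` flag is a hypothetical switch representing the behavior I was expecting. All names and addresses are made up.

```python
NXDOMAIN = "NXDOMAIN"


def resolve(name, servers, fall_through_on_nxdomain=False):
    """Query servers in order.

    servers: list where each entry is a dict {name: ip}, or None if the
    server is unreachable. Returns an IP, NXDOMAIN, or None when no
    server answered at all.
    """
    last = None
    for zone in servers:
        if zone is None:
            continue  # unreachable: move on to the next server
        last = zone.get(name, NXDOMAIN)
        # A reachable server's answer is final unless fall-through is enabled.
        if last != NXDOMAIN or not fall_through_on_nxdomain:
            return last
    return last


# Hypothetical records for illustration.
public_dns = {"management.azure.com": "203.0.113.10"}
private_dns = {"app.corp.internal": "10.0.1.20"}

# Observed behavior: the public primary answers "no record", so the
# private secondary is never asked.
print(resolve("app.corp.internal", [public_dns, private_dns]))
# Expected behavior (hypothetical): fall through on "no record found".
print(resolve("app.corp.internal", [public_dns, private_dns], True))
```

With the private server listed first, the internal name resolves in both modes, which is why swapping the priority helps as long as that server stays reachable during failover.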

Do you find this DNS behavior normal, and how do you have it configured in your clusters?

 

3 REPLIES

L0 Member

We recently experienced the same problem. To resolve it, we had to create a UDR on the subnets where our DNS servers reside, so that traffic to the management IPs goes directly to the management IPs. Previously this traffic was hitting our default UDR, which sends traffic via the trust interface.
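Why a more specific UDR takes that traffic off the trust interface can be sketched with a longest-prefix-match lookup, which is how Azure chooses among routes in a route table. The prefixes and next-hop labels below are hypothetical and are not our actual route table; `pick_route` is an illustrative helper, not an Azure API.

```python
import ipaddress


def pick_route(dest_ip, routes):
    """Return the next hop of the most specific route matching dest_ip.

    routes: list of (prefix, next_hop) tuples. Returns None if nothing matches.
    """
    dest = ipaddress.ip_address(dest_ip)
    matches = []
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, next_hop))
    if not matches:
        return None
    # Longest prefix wins.
    return max(matches, key=lambda m: m[0])[1]


# Hypothetical route table on the DNS servers' subnet.
routes = [
    ("0.0.0.0/0", "VirtualAppliance via trust interface"),   # default UDR
    ("10.0.0.0/24", "VnetLocal (direct to management subnet)"),  # added UDR
]

print(pick_route("10.0.0.4", routes))   # management IP: direct
print(pick_route("8.8.8.8", routes))    # everything else: via the firewall
```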

 

During an HA failover, the interfaces on the active device are shut down, and the passive's plugin runs an API call to detach the IPs from the active and attach them to the passive before it transitions to active.
This means that if your management traffic (including DNS requests) is routed via dataplane interfaces, it will be interrupted. Since the firewalls need DNS to call the Azure API, the call fails and the failover doesn't happen.

Thank you for sharing. A long time has passed since then. I'm not sure how the UDR would solve the problem. How is your DNS server reaching the internet (for the API call): directly or via the firewall?

Hi, in our case the DNS server reaches the internet via a different firewall, which is why it worked for me.

I must admit that's lucky, because I completely overlooked that part!
