12-22-2021 06:12 AM
Hello community 🙂
I'm running a v9.1.x Active/Passive cluster on Azure, and we have had several problems with the "quick" failover.
Because I need the firewalls to perform DNS resolution on internal FQDN objects, I had them configured with private DNS servers running on Azure VMs. While digging into the failover issue, I observed that the new active firewall was not able to contact the Azure cloud to request the floating IPs.
"2021-12-20 17:53:01.916 +0200 vm_ha_state_trans INFO: : Getting Azure token failed with exception <urlopen error [Errno -3] Temporary failure in name resolution>
2021-12-20 17:53:01.917 +0200 vm_ha_state_trans INFO: : Failed to get Azure Access Token
2021-12-20 17:53:04.764 +0200 vm_ha_state_trans INFO: : vm_mode: 6
2021-12-20 17:53:04.926 +0200 vm_ha_state_trans INFO: : Platform Identified as AZR
2021-12-20 17:53:05.083 +0200 vm_ha_state_trans INFO: : AZR cloud_setting called
2021-12-20 17:53:05.318 +0200 vm_ha_state_trans INFO: : AZR vm_ha_trans called"
The new active firewall was sending DNS requests to the internal DNS servers, and those servers couldn't reach the internet because the floating IP hadn't been moved to the new firewall yet. A chicken-and-egg problem 🙂
To solve it, I configured a public DNS server as the primary on the firewall and kept the private one as the secondary. Failover started working, but I discovered that my internal FQDN object is no longer resolved.
It seems that the PA firewall sends DNS queries to the second defined server only if the primary is "unreachable". I was expecting different behavior: that the query would also be sent to the secondary server when the first returned "no record found". I have changed the priority of the DNS servers and will test the failover again this Friday.
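The fallback behavior described above can be illustrated with a small simulation. This is only a sketch of the behavior observed in this post, not PAN-OS code: the `Outcome` enum, `make_server`, `resolve`, the hostnames, and the IP addresses are all hypothetical, and no real DNS traffic is sent.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class Outcome(Enum):
    ANSWER = "answer"      # server replied with a record
    NXDOMAIN = "nxdomain"  # server replied: no such record
    TIMEOUT = "timeout"    # server did not reply at all


# A simulated DNS server: name -> (outcome, IP or None). Hypothetical model.
Server = Callable[[str], Tuple[Outcome, Optional[str]]]


def make_server(records: Dict[str, str], reachable: bool = True) -> Server:
    """Build a fake server that only knows the given records (no forwarding)."""
    def query(name: str) -> Tuple[Outcome, Optional[str]]:
        if not reachable:
            return Outcome.TIMEOUT, None
        if name in records:
            return Outcome.ANSWER, records[name]
        return Outcome.NXDOMAIN, None
    return query


def resolve(name: str, servers: List[Server]) -> Optional[str]:
    """Try servers in order; move to the next one only on a timeout."""
    for query in servers:
        outcome, ip = query(name)
        if outcome is Outcome.TIMEOUT:
            continue  # unreachable: fall through to the next server
        return ip     # any reply, including "no record found", is final
    return None


# Hypothetical names and addresses, for illustration only.
public = make_server({"login.example.net": "203.0.113.10"})
private = make_server({"app.corp.internal": "10.0.0.5"})
private_down = make_server({"app.corp.internal": "10.0.0.5"}, reachable=False)

print(resolve("app.corp.internal", [public, private]))        # None
print(resolve("app.corp.internal", [private, public]))        # 10.0.0.5
print(resolve("login.example.net", [private_down, public]))   # 203.0.113.10
```

With the public server first, its "no record found" reply ends the lookup and the internal name never reaches the private server. The model has no forwarding, so it does not show what happens when the private server is listed first and cannot reach its upstream during failover; whether the firewall then moves on to the secondary presumably depends on whether that server stays silent or returns an error reply.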
Do you find this DNS behavior normal, and how do you guys/girls have it configured in your clusters?
02-01-2024 07:28 AM
We recently experienced the same problem. To resolve it, we had to create a UDR on the subnets where our DNS servers live, to ensure that traffic to the management IPs goes directly to the management IPs. Previously this traffic was hitting our default UDR, which sends traffic via the trust interface.
When doing an HA failover, the interfaces on the active device are shut down, and the passive's plugin runs an API call to detach the IPs from the active and attach them to the passive before it transitions to active.
This means that if your management traffic (including DNS requests) is routed via dataplane interfaces, it will get interrupted. Since the firewalls need DNS to call the Azure API, it won't work and failover won't happen.
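The effect of the extra UDR can be sketched with longest-prefix matching, which is how Azure picks between routes that cover the same destination. All prefixes, the management IP, and the next-hop labels below are made up for illustration and are not taken from the setup described in this reply.

```python
import ipaddress
from typing import List, Optional, Tuple

# (address prefix, next hop label). All values are hypothetical.
Route = Tuple[str, str]


def next_hop(dest: str, routes: List[Route]) -> Optional[str]:
    """Return the next hop of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


MGMT_IP = "10.0.0.4"  # assumed firewall management address

# Assumed route table on the DNS server subnet before the fix:
# everything, including VNet-internal traffic, goes via the trust interface.
before = [
    ("0.0.0.0/0", "VirtualAppliance (trust interface)"),
    ("10.0.0.0/16", "VirtualAppliance (trust interface)"),
]

# After the fix: a more specific route for the assumed management subnet.
after = before + [("10.0.0.0/28", "VnetLocal")]

print(next_hop(MGMT_IP, before))  # VirtualAppliance (trust interface)
print(next_hop(MGMT_IP, after))   # VnetLocal
```

Because the /28 route is more specific than the /16, DNS replies to the management interface no longer depend on the dataplane interfaces that go down during failover.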
02-01-2024 08:13 AM
Thank you for sharing. A long time has passed since then. I'm not sure how the UDR solves the problem. How is your DNS server reaching the internet (for the API call): directly or via the firewall?
02-01-2024 08:19 AM
Hi, in our case the DNS server reaches the internet via a different firewall, which is why it worked for me.
I must admit that's lucky, because I completely overlooked that part!