Most of the deployments I've overseen for customers went with a dedicated deployment model (one NGFW for egress, one for ingress). This is because Azure is VM-based, so spinning up a passive instance and actively steering traffic to it takes time, as noted, and no public cloud provider puts an SLA on its API calls. I've seen it take up to 40 minutes. Please see our technical documentation on this here.

The load balancer sandwich allows for horizontal scaling if you need additional bandwidth or compute resources. For example, you could run an ingress pair active/active to meet your requirement of "a firewall ready to move traffic during upgrades." See our detailed template for this deployment here.

Basically, you just need to add an interface profile to each untrust/trust interface that allows ping access for the health polling in Azure. Then you would write routes for 0.0.0.0/0 with the next hop out the untrust interface, and the same for trust. In security policy, you specify which applications, users, IP addresses, etc. are allowed to send what traffic where. NAT should be handled by your external application load balancer.
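To make the active/active idea concrete, here is a rough Python sketch of how a load balancer pool keeps sending traffic to whichever firewall still answers its health check while the other is being upgraded. Everything in it is hypothetical and for illustration only: `Firewall`, `Interface`, `eligible_backends`, and `config_gaps` are made-up names, not PAN-OS or Azure APIs, and the health check is reduced to a simple boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Hypothetical model of one firewall dataplane interface."""
    name: str
    allows_health_probe: bool  # interface profile lets the health poll through
    has_default_route: bool    # a 0.0.0.0/0 route with next hop out this interface


@dataclass
class Firewall:
    """Hypothetical model of one firewall in the load balancer pool."""
    name: str
    interfaces: list[Interface] = field(default_factory=list)
    upgrading: bool = False  # assumed: the firewall stops answering probes mid-upgrade

    def healthy(self) -> bool:
        # The pool only uses a firewall that answers on every interface.
        if self.upgrading:
            return False
        return all(i.allows_health_probe for i in self.interfaces)


def eligible_backends(pool: list[Firewall]) -> list[str]:
    """Names of the firewalls the load balancer would still send traffic to."""
    return [fw.name for fw in pool if fw.healthy()]


def config_gaps(fw: Firewall) -> list[str]:
    """List the per-interface settings from the post that are still missing."""
    gaps = []
    for i in fw.interfaces:
        if not i.allows_health_probe:
            gaps.append(f"{i.name}: no profile permitting the health probe")
        if not i.has_default_route:
            gaps.append(f"{i.name}: no 0.0.0.0/0 route")
    return gaps


if __name__ == "__main__":
    pool = [
        Firewall("ingress-a", [Interface("untrust", True, True),
                               Interface("trust", True, True)]),
        Firewall("ingress-b", [Interface("untrust", True, True),
                               Interface("trust", True, True)]),
    ]
    print(eligible_backends(pool))  # both firewalls take traffic
    pool[0].upgrading = True        # take ingress-a out for an upgrade
    print(eligible_backends(pool))  # ingress-b carries the load alone
```

The point of the sketch is that neither firewall is "passive": taking one out for an upgrade only shrinks the pool, so no failover API call is needed to move traffic.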