11-16-2021 06:08 AM - edited 11-16-2021 12:46 PM
Hi folks, I'm facing throughput issues with a site-to-site VPN between my on-prem site (VM-300) and Azure (VpnGw1).
Scenario:
- Windows cluster + SQL Server Always On Availability Groups (async commit)
- 2 nodes on premises (sql01 and sql02)
- 1 node on Azure (sql03)
- Link speed 150Mbps
- Latency between on prem and azure: 15ms
The IPsec tunnel is working. In generic tests (iperf and SMB copies) the throughput reaches:
on-prem to Azure: 80 Mbps
Azure to on-prem: 150 Mbps
The issue appears when SQL Server tries to replicate.
sql01 is my primary, so it is the one that replicates data to the secondaries (sql02 and sql03).
Replication throughput from sql01 to sql02 is around 2.5 Mbps (LAN connection).
Replication throughput from sql01 to sql03 is around 1 Mbps (this goes through the VPN).
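As a rough sanity check on the numbers above, here is a minimal Python sketch of the bandwidth-delay product for this link and of the throughput ceiling for a single TCP stream. It assumes the 15 ms figure is round-trip time and uses a 64 KiB window purely as an illustrative worst case (no window scaling); the function names are made up for this example.

```python
LINK_BPS = 150_000_000  # 150 Mbps link speed from the post
RTT_S = 0.015           # 15 ms, assumed here to be round-trip time


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_s / 8


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on one TCP stream's throughput for a given window size."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    print(f"BDP: {bdp_bytes(LINK_BPS, RTT_S) / 1024:.0f} KiB")
    cap = window_limited_bps(65535, RTT_S)
    print(f"64 KiB window cap: {cap / 1e6:.1f} Mbps")
```

Even the unscaled 64 KiB case allows roughly 35 Mbps per stream, far above the 1 Mbps seen for replication, and the LAN replica is also slow at 2.5 Mbps. Under these assumptions, that points away from the VPN as the only bottleneck.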
Changes made:
- Set tunnel MTU to 1400
- Disabled anti-replay protection
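For reference, the MTU change above translates into a TCP payload size as in the short Python sketch below. It assumes IPv4 and TCP headers without options (20 bytes each); TCP options such as timestamps would reduce the usable payload further.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print(tcp_mss(1400))  # tunnel MTU from the post
    print(tcp_mss(1500))  # standard Ethernet MTU, for comparison
```

With a tunnel MTU of 1400 this gives an MSS of 1360 bytes, compared with 1460 bytes for a standard 1500-byte Ethernet MTU.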
I took some packet captures and observed a high number of TCP out-of-order segments and "TCP Previous segment not captured" events.
I hope someone can help.
11-29-2021 05:11 AM
Hello everyone, after some weeks of analysis and debugging we finally solved the problem.
Because of the different disk sector sizes (512 bytes on premises and 4K on the Azure VMs), we had to enable SQL Server trace flag 1800 on the on-premises VMs.
After that, SQL replication is working like a charm.
The KB article about this issue is below.
https://support.microsoft.com/en-us/topic/kb3009974-fix-slow-synchronization-when-disks-have-differe...
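For anyone hitting the same issue, the check described above can be sketched in Python as follows. This is only an illustration, not an official procedure: the helper names and the replica table are made up for this example, and the sector sizes are the ones reported in this thread. The T-SQL strings (`DBCC TRACEON`, `DBCC TRACESTATUS`) and the `-T1800` startup parameter are standard SQL Server syntax, but verify them against the KB for your version before applying anything.

```python
from typing import Dict, List

# Log-disk sector sizes in bytes, as reported in this thread.
REPLICA_SECTOR_BYTES: Dict[str, int] = {
    "sql01": 512,   # on premises (primary)
    "sql02": 512,   # on premises
    "sql03": 4096,  # Azure
}

ENABLE_NOW = "DBCC TRACEON (1800, -1);"  # global, lasts until the next restart
STARTUP_PARAM = "-T1800"                 # persistent, set as a startup parameter
CHECK_STATUS = "DBCC TRACESTATUS (1800);"


def needs_trace_flag_1800(name: str, sizes: Dict[str, int]) -> bool:
    """True if this replica uses 512-byte sectors and any partner uses 4K."""
    partners = [size for other, size in sizes.items() if other != name]
    return sizes[name] == 512 and 4096 in partners


def replicas_to_change(sizes: Dict[str, int]) -> List[str]:
    """Names of the replicas where the trace flag would be enabled."""
    return [name for name in sizes if needs_trace_flag_1800(name, sizes)]


if __name__ == "__main__":
    for name in replicas_to_change(REPLICA_SECTOR_BYTES):
        print(f"{name}: run {ENABLE_NOW} and add {STARTUP_PARAM}; verify with {CHECK_STATUS}")
```

With the sizes from this thread, the sketch selects sql01 and sql02, which matches enabling the flag on the on-premises VMs only.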
11-16-2021 05:20 PM
SQL replication kind of hates latency, but inspecting it can also cause serious delays. Do you have a need to inspect the replication traffic? If you do, are you inspecting it on just one or both firewalls?
11-17-2021 05:31 AM
Yes.
I handled this on the security rule by checking Disable Server Response Inspection, and I also created an application override for the SQL Server Always On port (5022).
11-17-2021 11:28 AM
Hello,
Instead of app overrides, I would just configure a security policy to allow the traffic, source ip/destination ip, with no inspection enabled. This way you get the same results. The other idea I was kicking around was to reduce the MTU on both sides.
Just some thoughts.
11-18-2021 04:50 AM
Hi, yes, I did configure this in the security policy.
Unfortunately I can't edit the MTU on the Azure side, because I'm using the native Azure virtual gateway.
11-18-2021 09:12 AM
Hello,
Then MTU resizing won't help. I would set the PAN tunnel MTU to whatever Azure has theirs set to. Sorry I couldn't be of more help.
Regards,