Site to site VPN between Azure and VM300 - SQL replication slow


L1 Bithead

Hi folks, I'm facing throughput issues with a site-to-site VPN between my on-prem site (VM-300) and Azure (VpnGw1).

Scenario:

- Windows cluster + SQL Always On Availability Groups (async commit)

- 2 nodes on premises (sql01 and sql02)

- 1 node on Azure (sql03)

- Link speed: 150 Mbps

- Latency between on-prem and Azure: 15 ms

 

The IPsec tunnel is up. In generic tests (iperf and SMB copies) the throughput reaches:

on-prem to Azure: 80 Mbps

Azure to on-prem: 150 Mbps

The issue appears when SQL Server replicates.

sql01 is my primary, so it is the one that replicates data to the secondaries (sql02 and sql03).

Replication throughput from sql01 to sql02 is around 2.5 Mbps (LAN connection).

Replication throughput from sql01 to sql03, which goes through the VPN, is around 1 Mbps.
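As a rough sanity check on these figures, the bandwidth-delay product shows how much data has to be in flight to fill the link. This is only a sketch: it assumes the 15 ms figure is round-trip time, and the function name `in_flight_bytes` is made up for illustration.

```python
def in_flight_bytes(mbps: float, rtt_ms: float) -> float:
    """Bytes in flight needed to sustain `mbps` at a round-trip time of `rtt_ms`."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * rtt_ms / 1000


RTT_MS = 15  # assumption: the 15 ms latency from the post is treated as RTT

# Data in flight needed to fill the 150 Mbps link
print(in_flight_bytes(150, RTT_MS))  # 281250.0

# Average data in flight implied by the observed ~1 Mbps replication rate
print(in_flight_bytes(1, RTT_MS))    # 1875.0
```

Under that assumption, roughly 1.9 KB in flight on average is far below the ~281 KB needed to fill the link. Together with the iperf and SMB results, that hints that the sender is not pushing data fast enough, rather than the tunnel being out of capacity.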

 

Selection_112.png

 

Changes made:

- Set the tunnel MTU to 1400

- Disabled anti-replay protection
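For reference, the TCP payload size that fits a given MTU can be worked out as below. This is a minimal sketch assuming IPv4 and TCP headers without options (20 bytes each); the helper name `tcp_mss` is hypothetical, and whether either side actually clamps the MSS is not stated in the thread.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload per packet that fits in `mtu` without fragmentation."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(tcp_mss(1400))  # 1360, for the tunnel MTU set above
print(tcp_mss(1500))  # 1460, for a standard Ethernet MTU
```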

 

I took some packet captures and observed a high number of TCP out-of-order segments and "TCP Previous segment not captured" messages.

 

Hope someone can help me.

 

Accepted Solutions

L1 Bithead

Hello everyone, after some weeks of analysis and debugging we finally solved the problem.
Because the disk sector sizes differ (512 bytes on premises and 4K on the Azure VMs), we had to enable SQL Server trace flag 1800 on the on-premises VMs.
After that, SQL replication is working like a charm.

The KB about this issue is below.
https://support.microsoft.com/en-us/topic/kb3009974-fix-slow-synchronization-when-disks-have-differe...
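To illustrate why the sector size mismatch matters, here is a small sketch of the alignment arithmetic only. It does not model SQL Server's real log format: the write sizes are invented, and the helper names are hypothetical. It pads each write to the sector size and checks which start offsets would not line up on a 4K boundary.

```python
ONPREM_SECTOR = 512   # bytes, from the post
AZURE_SECTOR = 4096   # bytes ("4k"), from the post


def round_up(size: int, sector: int) -> int:
    """Round `size` up to the next multiple of `sector`."""
    return -(-size // sector) * sector


def write_offsets(sizes, sector):
    """Start offset of each write when every write is padded to `sector`."""
    offsets, pos = [], 0
    for size in sizes:
        offsets.append(pos)
        pos += round_up(size, sector)
    return offsets


# Hypothetical write sizes in bytes (not real SQL Server data)
sizes = [1000, 3000, 5000, 700]

offsets_512 = write_offsets(sizes, ONPREM_SECTOR)
offsets_4k = write_offsets(sizes, AZURE_SECTOR)

# Writes laid out on 512-byte boundaries that do not start on a 4K boundary
misaligned = [o for o in offsets_512 if o % AZURE_SECTOR != 0]

print(offsets_512)  # [0, 1024, 4096, 9216]
print(misaligned)   # [1024, 9216]
print(offsets_4k)   # [0, 4096, 8192, 16384]
```

With 512-byte padding, some writes start in the middle of a 4K sector; with 4K padding, none do. As far as I understand the KB, trace flag 1800 makes the 512-byte side write its log in the 4K layout. Trace flags are usually enabled with `DBCC TRACEON (1800, -1)` or the `-T1800` startup parameter, but follow the KB for the exact steps on your version.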


Replies

Cyber Elite

@infrags,

SQL replication kind of hates latency, but inspecting it can also cause serious delays. Do you have a need to inspect the replication traffic? If you do, are you inspecting it on just one or both firewalls? 

Yes.

I did that on the security rule by checking Disable Server Response Inspection. I have also created an application override for the SQL Server Always On port (5022).

Cyber Elite

Hello,

Instead of app overrides, I would just configure a security policy to allow the traffic, source ip/destination ip, with no inspection enabled. This way you get the same results. The other idea I was kicking around was to reduce the MTU on both sides.

 

Just some thoughts.

Hi, yes, I did configure this in the security policy.

Unfortunately I can't edit the MTU on the Azure side, because I'm using the Azure native virtual gateway.

 

 

Cyber Elite

Hello,

Then MTU resizing won't help. I would set the PAN MTU on the tunnel to whatever Azure has theirs set to. Sorry I couldn't be of more help.

 

Regards,

