I've run into an interesting issue and I'm hoping someone here has seen it before or can point me to a best practice I'm missing.
Basically, we have a site-to-site loopback interface set up and we have several tunnels that utilize this and each connects to its own security zone. So far, this has all been working as far as I know. We now have yet another tunnel that utilizes this same type of setup on the same loopback as the others, however, the team that uses it has noticed issues with the applications that go across it... namely that some transactions complete and some do not. Looking at the session logs, I can see a number of tcp-fin but also some aged-out and some tcp server resets.
We got on a call with the team that manages the network/servers on the remote side and found that lowering the MTU on the servers to 1400 seems to resolve it; all transactions work correctly at that point. I'm fairly certain that, on our side of the bridge, the DC networking hardware has jumbo frames configured and so does the datacenter interface on the firewall itself. Packets wouldn't hit a smaller MTU until the traffic starts to traverse the site-to-site. I don't know what hardware the remote side uses to terminate the tunnel or to carry the traffic to the servers.
The site-to-site loopback on our side looks like it is configured with default MTU and Adjust TCP MSS is not configured. The tunnel interface for this particular site-to-site is also using default MTU. I'm guessing I need to either adjust the MTU on the loopback/tunnel (if I have to adjust on the loopback, I wonder how this will impact all of the other tunnel interfaces also utilizing it) or turn on the TCP MSS adjustment?
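To put rough numbers on the two options (lower the MTU vs. clamp the MSS), here is a minimal sketch of the arithmetic. The 20-byte IPv4 and TCP header sizes assume no options. The 80-byte tunnel overhead is only an illustrative assumption: the real IPsec overhead depends on the cipher, the integrity algorithm, and whether NAT-T is in use, and I have not measured it for this tunnel.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits_in_tunnel(inner_len: int, outer_mtu: int, overhead: int) -> bool:
    """True if an inner packet plus tunnel overhead fits the outer MTU."""
    return inner_len + overhead <= outer_mtu


ASSUMED_OVERHEAD = 80  # hypothetical IPsec overhead, not a measured value

print(tcp_mss(1500))                                 # 1460
print(tcp_mss(1400))                                 # 1360
print(fits_in_tunnel(1500, 1500, ASSUMED_OVERHEAD))  # False
print(fits_in_tunnel(1400, 1500, ASSUMED_OVERHEAD))  # True
```

Under those assumptions, a full 1500-byte packet from a host does not fit once it is encapsulated, while a 1400-byte one does. Either a 1400 tunnel MTU or an MSS clamp around 1360 would keep TCP segments under that limit.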
@BPry thanks for the reply! I'm assuming then that the defaults of 1500 should be fine on the Palo?
This whole thing really seems odd to me because I figured that the Palo must be taking any overhead into account, since it is the one encapsulating for the IPsec VPN. Furthermore, shouldn't the packets just get fragmented and then reassembled by the remote client if they do exceed the MTU?
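On the fragmentation question, the sketch below shows the generic IPv4 forwarding decision as I understand it from standard Path MTU Discovery behavior. It is not a description of what the Palo actually does internally, and `handle_packet` is a made-up name for illustration.

```python
def handle_packet(length: int, dont_fragment: bool, next_hop_mtu: int) -> str:
    """Generic IPv4 forwarding decision for a packet of `length` bytes."""
    if length <= next_hop_mtu:
        return "forward"
    if dont_fragment:
        # The packet cannot be fragmented: it is dropped and the sender
        # should receive ICMP type 3 code 4 (fragmentation needed).
        return "drop, send ICMP fragmentation-needed"
    return "fragment"


print(handle_packet(1500, dont_fragment=False, next_hop_mtu=1420))
print(handle_packet(1500, dont_fragment=True, next_hop_mtu=1420))
print(handle_packet(1400, dont_fragment=True, next_hop_mtu=1420))
```

So fragmentation and reassembly only happen when the Don't Fragment bit is clear. With it set, oversized packets are dropped, and if the ICMP message never makes it back to the sender, large transfers can stall while small ones succeed, which would look a lot like the mix of completed and aged-out sessions described above.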
I ended up opening a support case on this. The Palo engineer didn't see anything wrong with my configuration and didn't think the TCP MSS adjustment should be necessary.
I got on a support call with the vendor that we're connecting to and asked what their MTU was set to. I'm not sure they ever found out for certain while we were on the phone, but they suggested it was probably set to 1420. We have jumbo frames set up on our firewall, and the loopback and tunnel on our side were just using defaults with no specific MTU set. The computer on our side would have been using 1500 per defaults, but it also looks like it had Do Not Fragment set.
They seemed to think our firewall was still fragmenting and/or dropping despite the Do Not Fragment flag, but Palo saw no evidence of either in the stats.
I ended up just lowering the tunnel MTU on our side to 1400, which seems to have resolved the issue. To be fair, we've used this setting on several other site-to-sites, and I'm not sure why I didn't set it here, except possibly that on those other tunnels we got instructions on what to set everything to as part of the vendor setup. Several of us remember them saying 1500 should be fine, so who knows.