PANCast Episode 12: Troubleshooting IPSec Tunnels

jarena · ‎03-01-2023

Episode Transcript:

Welcome back PANCasters. IPSec is used so widely in a range of scenarios yet it is still something that causes a lot of grief both in getting new tunnels up and running and troubleshooting when there are issues with existing tunnels. Today we’ll discuss IPSec and how to troubleshoot.

John Arena is a Professional Services Consultant with a background in Technical Support for Palo Alto Networks and a passion for educating and sharing knowledge with customers.

About IPSec

Let’s start with a look at the protocol. IPSec has been around since the mid 1990s and is used to secure traffic over normal IP networks. The classic example is you can route your traffic over public networks, like the Internet, through an IPSec tunnel to ensure the traffic is secure. Building the tunnel involves two peers that need to agree on the cryptography to use and that obviously need some way to authenticate each other. The settings on each peer need to match, and the biggest single issue with new tunnels not coming up is still a mismatch in the config between the peers.

There are two phases in IPSec. The first is called phase 1, or IKE, and this is used for the two peers to authenticate and decide on cryptographic parameters along with exchanging keys. This is then used to build phase 2 which is also just called IPSec. Think of it as a control channel and then a channel for the actual data. Now this is when we are talking about IKEv1. IKEv2 is more and more common these days and although the underlying protocol is different, it still involves exchanging the cryptographic details, key generation and exchange and then building the actual tunnel, which in the case of IKEv2 is called a child SA. We won’t go into the details of the differences between IKEv1 and IKEv2 as this information is readily available but you should be looking at changing to and only using IKEv2 as IKEv1 will likely be deprecated in the near future.

So that’s the basis of IPSec. There are obviously other configuration options and we will talk about some of them now as we get into troubleshooting.

Troubleshooting issues with IPSec

There are two main issues we see with IPSec. Number one is you are building a new tunnel and it is not coming up. As I mentioned earlier, the most common cause of this is actually just a config mismatch between peers. Given it is common to be building IPSec tunnels between different organisations, it can sometimes just be a config issue. But let’s talk about what to look for if you have double checked the config on both sides and it looks ok. Obviously the logs on both devices are the best place to start. Best case scenario, the logging will show if there is, say, a mismatch in the pre-shared key.
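As a quick illustration of the config check, here is a minimal sketch that compares the settings two peers have been given and lists every difference. The field names and values are hypothetical, not taken from any particular device, and in practice you would fill them in by hand from each side's configuration.

```python
from typing import Dict, List


def find_mismatches(local: Dict[str, str], remote: Dict[str, str]) -> List[str]:
    """Return one line per setting that differs between the two peers."""
    problems = []
    for key in sorted(set(local) | set(remote)):
        a = local.get(key, "<not set>")
        b = remote.get(key, "<not set>")
        if a != b:
            problems.append(f"{key}: local={a} remote={b}")
    return problems


# Hypothetical phase 1 settings as entered on each peer.
peer_a = {
    "ike_version": "ikev2",
    "encryption": "aes-256-cbc",
    "authentication": "sha256",
    "dh_group": "group14",
}
peer_b = dict(peer_a, dh_group="group19")  # same settings, one differs

for line in find_mismatches(peer_a, peer_b):
    print(line)
```

Settings that are missing on one side show up as `<not set>`, which is often the kind of thing that gets missed when two organisations exchange config by email.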

One thing to be aware of is that with IPSec there is an initiator and a responder. It could just be a timing thing to determine which peer is the initiator but you can also configure a peer to be passive. This means it will not initiate the tunnel, and needs to wait for the other end. The reason I mention this is in a lot of IPSec implementations from different vendors, the logs on the initiator may only show that the peer rejected the connection but no details as to the reason. That is why when you are troubleshooting it is better to do it from the responder side as it should have the details as to why the connection is rejected.

The second scenario is where you do have an established tunnel but there is an outage. That could be either the tunnel going down and causing impact, or traffic through the tunnel being affected even though your logs show the tunnel did not go down. Troubleshooting these actually starts in the same place as scenario one: the logs. This is where you need to start, but what I do want to discuss is a couple of aspects of IPSec that can be relevant to these sorts of issues.

IPSec - DPD and tunnel monitoring

Number one is understanding that IPSec is ultimately a negotiation to secure traffic between two devices but after the session has been established, there is no acknowledgement process. So here's what happens. Device A and Device B set up a tunnel between each other and it comes up and it works ok. The tunnel will have a current Security Association which identifies the current parameters for the IPSec tunnel. Every time it re-keys, a new Security Association is created. For Device A what this means is that for the tunnel lifetime, let's say 1 hour, it will encrypt any packets based on the current Security Association and just send those packets out. Device B does exactly the same in the opposite direction. While the sessions going through the IPSec tunnel may have an acknowledgement mechanism, the actual tunnel does not, so if Device B suddenly shuts down, Device A will happily continue sending the IPSec packets. The only time the issue would be known is when it is time to re-key, then obviously the re-key would fail and the tunnel would go down. This is the normal behaviour and is exactly why we have tunnel monitoring and DPD, short for Dead Peer Detection. Both of these, if configured, can test connectivity. DPD tests connectivity to the remote peer, and tunnel monitoring tests whether traffic through the tunnel is actually working. I imagine there are a lot of tunnels configured that have just worked for months, even years, without these configured but if you have important traffic running across the tunnels it is worth looking at these two features.
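To put rough numbers on that, here is a small sketch of how long Device A takes to notice that Device B has gone away, with and without DPD. It is a simplified model under stated assumptions: the function name is made up, the DPD interval and retry count are example values, and it assumes the failure is only noticed at the next re-key when DPD is off.

```python
from typing import Optional


def detection_delay_s(
    failed_at_s: int,
    rekey_at_s: int,
    dpd_interval_s: Optional[int] = None,
    dpd_retries: Optional[int] = None,
) -> int:
    """Seconds between the peer failing and the local device noticing."""
    # Without DPD the tunnel itself has no acknowledgements, so the
    # failure only shows up when the next re-key attempt fails.
    wait_for_rekey = rekey_at_s - failed_at_s
    if dpd_interval_s is None or dpd_retries is None:
        return wait_for_rekey
    # With DPD the peer is declared dead after `dpd_retries` probes,
    # sent `dpd_interval_s` apart, all go unanswered.
    wait_for_dpd = dpd_interval_s * dpd_retries
    return min(wait_for_dpd, wait_for_rekey)


# Device B dies 10 minutes into a 1 hour lifetime.
print(detection_delay_s(failed_at_s=600, rekey_at_s=3600))
print(detection_delay_s(failed_at_s=600, rekey_at_s=3600,
                        dpd_interval_s=5, dpd_retries=5))
```

With these example values the difference is 50 minutes of silently lost traffic versus about 25 seconds.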

There is also another scenario where tunnel monitoring and DPD can help and that is on the off chance you get a state where the Security Associations are mismatched. It is not common, but it can happen that Device A is sending packets with a different Security Association than what Device B is expecting and the packets are therefore dropped. Again tunnel monitoring can help with this.

There are some things to note with tunnel monitoring and DPD however. You do need to make sure the configuration is correct, and on both sides. Tunnel monitoring is simply a ping through the tunnel so the address being monitored is also important. If the address you choose may not always be reachable then you’ll have the opposite problem, where the tunnel could be dropped when it is actually ok. And the final thing on monitoring is what actually happens in case of a tunnel monitoring failure. It is all well and good knowing there is an issue but what should happen? Should the device try and re-key straight away and try and restore the tunnel? Should it drop the tunnel and potentially change routes so as to fail over to a backup path? All things that must be considered.
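The monitoring decision can be sketched as a counter of consecutive lost pings plus a configured action. This is a minimal model, not a device implementation: the class name, the returned state strings and the two action labels ("wait-recover" and "fail-over") are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TunnelMonitor:
    threshold: int   # consecutive lost pings tolerated before acting
    action: str      # "wait-recover" or "fail-over" (labels assumed)
    lost: int = 0

    def record(self, ping_ok: bool) -> str:
        """Record one ping through the tunnel and return the resulting state."""
        if ping_ok:
            self.lost = 0          # any reply resets the counter
            return "up"
        self.lost += 1
        if self.lost < self.threshold:
            return "up"            # still within tolerance
        # Threshold reached: either move traffic to the backup path,
        # or keep the routes in place and wait for the tunnel to recover.
        return "rerouted" if self.action == "fail-over" else "waiting"


monitor = TunnelMonitor(threshold=3, action="fail-over")
pings = [True, False, False, True, False, False, False]
print([monitor.record(ok) for ok in pings])
```

Note how the single reply in the middle resets the counter. That is also why a monitored address that is only sometimes reachable is a problem: a few unlucky losses in a row are enough to trigger the action on a healthy tunnel.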

NAT Traversal and peer identification

Just a couple of final quick notes on some features that you should also consider when things don’t work. Firstly NAT traversal. IPSec does not cope well with Network Address Translation which is why NAT traversal is an available configuration option. You will need this enabled if NAT is involved between the peers. For example your device that is terminating the IPSec tunnel uses private addressing and an upstream device is doing the NAT. NAT Traversal should be enabled on both sides.
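NAT traversal depends on the peers first noticing that NAT is in the path. As a rough sketch of the idea, IKEv2 (RFC 7296, section 2.23) has each peer send a SHA-1 digest of the two SPIs plus the IP address and port it believes it is sending from, and the receiver recomputes the digest from what it actually sees on the wire. The function names, SPI values and addresses below are made up for illustration, and this is not any vendor's implementation.

```python
import hashlib
import ipaddress
import struct


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over initiator SPI, responder SPI, IP address and port."""
    packed_ip = ipaddress.ip_address(ip).packed
    packed_port = struct.pack("!H", port)  # 16-bit, network byte order
    return hashlib.sha1(spi_i + spi_r + packed_ip + packed_port).digest()


def nat_in_path(spi_i: bytes, spi_r: bytes,
                sender_ip: str, sender_port: int,
                observed_ip: str, observed_port: int) -> bool:
    """True if the address the sender claims differs from the one observed."""
    claimed = nat_detection_hash(spi_i, spi_r, sender_ip, sender_port)
    observed = nat_detection_hash(spi_i, spi_r, observed_ip, observed_port)
    return claimed != observed


spi_i = bytes.fromhex("0102030405060708")  # example initiator SPI
spi_r = bytes(8)                           # responder SPI is zero in the first message

# The sender uses a private address, but the packet arrives from a public one.
print(nat_in_path(spi_i, spi_r, "10.0.0.5", 500, "203.0.113.7", 500))
```

If the digests differ, something between the peers rewrote the address or port, which is exactly the private addressing case described above.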

Somewhat related is local and remote peer identification. There are scenarios where you need to configure the local and remote peer IDs. Using private IPs as mentioned is one scenario, and a device with a dynamic IP address that could change is another. What is important is that the local and remote peer IDs are configured and match on both sides; if not, the tunnel will not come up.
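The matching rule is a cross-check: what one side sends as its local ID must be what the other side expects as the peer ID, in both directions. A minimal sketch, with hypothetical field names and example ID values:

```python
from typing import Dict, List


def check_peer_ids(side_a: Dict[str, str], side_b: Dict[str, str]) -> List[str]:
    """Cross-check local and peer IDs; return a list of problems found."""
    problems = []
    if side_a.get("local_id") != side_b.get("peer_id"):
        problems.append(
            f"A sends {side_a.get('local_id')!r} but B expects {side_b.get('peer_id')!r}"
        )
    if side_b.get("local_id") != side_a.get("peer_id"):
        problems.append(
            f"B sends {side_b.get('local_id')!r} but A expects {side_a.get('peer_id')!r}"
        )
    return problems


# Example values only: site B is behind NAT and identifies itself by FQDN.
site_a = {"local_id": "198.51.100.1", "peer_id": "branch.example.com"}
site_b = {"local_id": "branch.example.com", "peer_id": "198.51.100.10"}

for problem in check_peer_ids(site_a, site_b):
    print(problem)
```

Here one direction matches and the other does not, which is enough to stop the tunnel coming up.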

Right, that’s it for IPSec and troubleshooting. A bit to digest but please remember that in a lot of cases where a new tunnel is not coming up, it could just be a config issue. If it is not though, hopefully now you have some additional knowledge to be able to try and find the root cause.

Remember to head to live.paloaltonetworks.com for the transcript and related articles and please also remember our new ideas forum is now available. If you have a topic you think would be good to do an episode on, let us know and we’ll try to get it done. Bye for now.

Check out the full PANCast YouTube playlist: PANCast: Insights for Your Cybersecurity Journey.
