Dead Peer Detection and Tunnel Monitoring

Overview

Dead Peer Detection (DPD) refers to functionality documented in RFC 3706, a method of detecting dead Internet Key Exchange (IKE, Phase 1) peers. Tunnel Monitoring is a Palo Alto Networks proprietary feature that verifies traffic is successfully passing across the IPSec tunnel in question by sending a ping down the tunnel to the configured destination. Tunnel monitoring can be used in conjunction with “Monitor Profiles” to bring down the tunnel interface, which lets routing update so that traffic can take secondary routes. Tunnel monitoring does not require DPD. Dead Peer Detection must be either enabled or disabled on both sides of the tunnel; having one side with DPD enabled and the other with it disabled can cause VPN reliability issues.

 

Details

Dead Peer Detection

DPD is a monitoring function used to determine the liveness of the IKE Security Association (IKE-SA, Phase 1).

 

DPD is used to detect whether the peer device still has a valid IKE-SA. Periodically, it sends an “ISAKMP R-U-THERE” packet to the peer, which responds with an “ISAKMP R-U-THERE-ACK” acknowledgement.

 

The Palo Alto Networks firewall does not currently have a log associated with DPD packets, but they can be detected in a debug packet capture. The following output is from a peer device:

 

Mar  4 14:32:36 ike_st_i_n: Start, doi = 1, protocol = 1, code = unknown (36137), spi[0..16] = cd11b885 588eeb56 ..., data[0..4] = 003d65fc 00000000 ...
Mar  4 14:32:36 DPD; updating EoL (P2 Notify
Mar  4 14:32:36 Received IKE DPD R_U_THERE_ACK from IKE peer: 169.132.58.9
Mar  4 14:32:36 DPD: Peer 169.132.58.9 is UP status_val: 0.
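
The value `code = unknown (36137)` in the first log line is the ISAKMP notify message type. RFC 3706 assigns 36136 to R-U-THERE and 36137 to R-U-THERE-ACK, so the line records an acknowledgement even though the peer's logger prints the code as "unknown". The following is a minimal Python sketch for decoding such lines. The regular expression assumes the field layout of the sample above, and the function name is hypothetical.

```python
import re

# Notify message types defined in RFC 3706, section 5.2.
DPD_NOTIFY_TYPES = {
    36136: "R-U-THERE",
    36137: "R-U-THERE-ACK",
}

# Matches a field such as "code = unknown (36137)" (layout assumed from the sample log).
CODE_FIELD = re.compile(r"code = \w+ \((\d+)\)")


def dpd_message_name(log_line):
    """Return the DPD message name for a log line, or None if no DPD notify code is found."""
    match = CODE_FIELD.search(log_line)
    if match is None:
        return None
    return DPD_NOTIFY_TYPES.get(int(match.group(1)))


line = ("Mar  4 14:32:36 ike_st_i_n: Start, doi = 1, protocol = 1, "
        "code = unknown (36137), spi[0..16] = cd11b885 588eeb56 ...")
print(dpd_message_name(line))  # R-U-THERE-ACK
```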

 

The DPD query and delay interval can be configured when DPD is enabled on the Palo Alto Networks device. DPD will tear down the SA once it realizes the peer is no longer responding.

Screen Shot 2013-05-06 at 1.06.46 PM.png

Note: DPD is "not persistent" and is only triggered by a Phase 2 rekey. This means that while Phase 2 is up, the Palo Alto Networks firewall will not check whether the IKE-SA is still active. To have Phase 2 trigger a rekey, and so trigger DPD to validate the Phase 1 IKE-SA, enable tunnel monitoring.

 

Tunnel Monitoring

Tunnel Monitoring is used to verify connectivity across an IPSec tunnel. A tunnel monitor profile specifies one of two actions to take if the tunnel is not available: Wait Recover or Fail Over.

  • Wait Recover tells the firewall to wait for the tunnel to recover and not take additional action
  • Fail Over will force traffic to a back-up path if one is available

In both cases, the firewall will try to negotiate new IPSec keys to accelerate the recovery.

Screen Shot 2013-05-06 at 2.23.48 PM.png

A threshold option can be set to specify the number of heartbeats to wait before taking the specified action. The range is 2 to 100 and the default is 5. The interval between heartbeats can also be configured. The range is 2 to 10 seconds and the default is 3 seconds.
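
As a rough worked example, the time before the action fires can be estimated as threshold × interval. The sketch below assumes the interval is in seconds and that the firewall acts once `threshold` consecutive heartbeats, sent one per interval, have gone unanswered. Exact timing on a real device may differ, and the function name is hypothetical.

```python
def time_to_action(threshold=5, interval_s=3):
    """Approximate seconds of missed heartbeats before the profile action is taken.

    Assumes one heartbeat per interval and action after `threshold` misses.
    """
    if not 2 <= threshold <= 100:
        raise ValueError("threshold must be between 2 and 100")
    if not 2 <= interval_s <= 10:
        raise ValueError("interval must be between 2 and 10 seconds")
    return threshold * interval_s


print(time_to_action())         # 15   (defaults: 5 heartbeats x 3 s)
print(time_to_action(2, 2))     # 4    (most aggressive setting)
print(time_to_action(100, 10))  # 1000 (most tolerant setting)
```

With the defaults, an outage would be acted on after roughly 15 seconds under these assumptions.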

 

Once the tunnel monitoring profile is created, as shown below, select it and enter the IP address of the remote end to be monitored.

Screen Shot 2013-05-06 at 2.28.11 PM.png

 

owner: panagent

Comments

So what will happen if the monitor detects that the tunnel is down? Are there any alerts or notifications?

If the tunnel goes down, a tunnel-status-down system log is generated and the firewall takes the action that is given in the tunnel monitor network profile.

If the action is fail-over, it will fail over to the other tunnel, if one is configured. If it is wait-recover, it will just continue to send pings to the monitored IP to determine when the IP comes back.

Any idea what interface/routing table(vr) this gets sourced from?

Is there also any requirement on the tunnel interface having an IP?

onguard,

1. The DPD traffic would generally source from the same interface and virtual router that is configured for the IPSEC gateway.

2. Tunnel interfaces do not need an IP address to function. However, if you want to use any of the advanced features, like tunnel monitoring, then you need an IP address assigned to the interface so the system has an IP to source the tunnel monitor traffic from.

I just have a clarification question here...is the Destination IP for the monitor the remote Peer, or an IP address of a system on the other side within the SA of the tunnel?

tpratt - The Tunnel Monitor Destination IP should be an IP that will route down the tunnel to the remote end. It can be the egress interface of the remote end's VPN termination point or any IP downstream of that. So it is your second option: an IP address of a system on the other side within the SA of the tunnel.

I have an established and working IPSec tunnel (in passive mode if it matters for this case). If I add a tunnel monitor (fail-over type) to it and commit policy, the tunnel interface will go down and won't come up. I have to clear IKE phase (and re-establish IPSec tunnel) for tunnel monitor to show as up. Is this expected? Imo it shouldn't work like that.

I have the same problem with firmware 6.1.4: after the commit, the VPN tunnel interface goes down and won't come up.

Yeah, I noticed this on 6.1.4 as well.

Is there a way to set up an email or SNMP trap alert if a selected (monitored) tunnel goes down? In other words, we have select tunnels that remain operational all the time but will cause issues if they go down. It would be great if we had the ability to set up an email alert or have it trigger an SNMP trap for a separate resource to generate a page/email.

What exactly happens when Tunnel Monitor is down and the profile is set to fail-over? What exactly is the fail-over mechanism?

The fail-over profile will set the tunnel interface as down. That makes all routes which use this interface inactive in the routing table. So it's useful when, for example, you have multiple routes for the same network with different priorities.
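
The behaviour described in the reply above can be illustrated with a small Python sketch of metric-based route selection. This is a simplified model, not the firewall's routing implementation. The prefix, interface names, and metrics are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    destination: str  # destination prefix, e.g. "10.50.0.0/16"
    interface: str    # egress tunnel interface
    metric: int       # lower metric is preferred


def active_route(routes, destination, interfaces_up):
    """Pick the lowest-metric route to `destination` whose interface is up."""
    candidates = [
        r for r in routes
        if r.destination == destination and interfaces_up.get(r.interface, False)
    ]
    return min(candidates, key=lambda r: r.metric, default=None)


routes = [
    Route("10.50.0.0/16", "tunnel.1", 10),  # primary path
    Route("10.50.0.0/16", "tunnel.2", 20),  # backup path with a higher metric
]

# Both tunnels up: the primary route wins.
print(active_route(routes, "10.50.0.0/16", {"tunnel.1": True, "tunnel.2": True}).interface)   # tunnel.1
# Monitor with fail-over has marked tunnel.1 down: its route is inactive, so the backup is used.
print(active_route(routes, "10.50.0.0/16", {"tunnel.1": False, "tunnel.2": True}).interface)  # tunnel.2
```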

We are on 6.1.10 and I'm having the same issue. I put an IP address on a tunnel interface, e.g. 10.77.128.1, and I can ping the remote end THROUGH the tunnel (10.50.252.252) using 10.77.128.1 as my source address. I then enable tunnel monitoring to 10.50.252.252 and the tunnel goes down (tunnel interface status turns RED). Surely if I tested this with a ping before, this should work? Why does the tunnel go down? Phase 1 and Phase 2 are UP, by the way, and GREEN. So why is the tunnel status RED?

 

Also, to get this working again I have to delete the IPSec tunnel and IKE Gateway configuration and rebuild it. It does not recover automatically when I remove the tunnel monitoring.

 

Major bugs I think.

I have the same issue with PAN-OS 7.1.4-h and don't think this is version specific.

However, by increasing the IKE timeout value to 8 hours, I could reduce the frequency of the issue.

 

Another interesting thing I noticed is that the tunnel goes down only when Phase 1 is negotiated as responder. Every time it is negotiated as initiator, it works with no issues.

 

Is there any way I can make my side always be the initiator?

 

Hi Nalin,

 

Yes, you can make the intended side always the initiator. Configure its exchange mode as "Aggressive" and configure the peer with "Enable Passive Mode" (so that the peer never initiates the IPSec VPN and is always the responder). The side configured as aggressive will then always be the initiator.

 

Best Regards,

 

Fozail

I have recently noticed on a couple of occasions that if there is no tunnel monitor IP, the VPN won't come up. Even if I clear it from the CLI and it shows green for Phase 1 and Phase 2, it does not pass traffic. As soon as I add a tunnel monitor IP, it starts to pass traffic.

 

The weird thing is that it was working before without the tunnel monitor IP. Is this a bug or a new feature?

Hi, I am planning to configure tunnel monitoring for a DR setup, along with a tunnel monitor IP address.

Let us say I am taking a /30 IP space for the monitor: 10.10.10.1 as the local tunnel monitor IP and 10.10.10.2 as the remote tunnel monitor IP.

Please confirm where exactly we need to advertise the destination IP 10.10.10.2 at the destination end.

I have set up Tunnel Monitor and an email system alert. I got an email alert when the tunnel went down, with the filter "eventid eq tunnel-status-down". The tunnel came back up, but I didn't see any event in the log like "eventid eq tunnel-status-up" that I could use to set up an email alert showing the tunnel has recovered. There were logs of tunnel phase negotiation, with multiple P2 phase successes.

I can set up the recovery alert on P2 phase success, but I would receive a ton of emails for it. I was hoping to see "eventid eq tunnel-status-up". Please confirm whether such an event exists, or whether there is a better way to alert on tunnel recovery.

I set up tunnel monitoring on each end: one side is a 5060 and the other a 3020. There are dual ISPs on the 5060 and a single ISP on the 3020. Tunnel monitoring on each Palo is set to fail over. The 5060 has 2 routes for each tunnel and the 3020 has 2 routes for each tunnel. The connectivity was working and then we had an ISP event on ISP1. The monitoring broke and has never recovered, but the tunnel is up. What makes the monitoring re-establish? Our primary ISP is up.

If I have 3 sites and 6 tunnels (2 per site, connecting to the other 2 sites), and use static routing, can I use failover to move traffic from the downed interface to the other interface, and would the next hop know enough to send the traffic to the destination PA?

 

-L

hi @Ambidexter

 

You'll need to set the monitor profile to fail-over (versus wait-recover) to bring the tunnel down, and then have a secondary route with a higher metric to redirect sessions over the second tunnel. This will enable routing over the other peer in case the direct tunnel goes down. You will also need to create a security policy to allow connections between the 2 VPN zones.

I have 3 IPSec tunnels set up between our Palo Altos, and tunnel monitoring is enabled on all of them. We configured Splunk to send email alerts every time a Tunnel Down event occurs on the firewall. We have been receiving a bunch of emails saying the tunnel is down every time there is a periodic rekey on Phase 2. There is no traffic loss or anything, so most of these are false alarms, and they are very frequent.

Not sure if this is expected to happen, but it seems to appear on only a few versions of PAN-OS. We are currently running 8.0.12 and we saw the same issue when we were running PAN-OS 7.