PAN-OS SD-WAN: KeyID Mismatch


Symptoms

After a commit-push to only some of the SD-WAN devices in a cluster, some or all of the tunnels on an SD-WAN device fail to establish, and the following error appears in the IKE or system logs:

Hub/responder IKE logs:

2025-01-21 19:54:41.680 -0800  [PERR]: {    1:     }: 172.16.20.2[500] - 192.168.20.2[500]:0x556c1e76bde0 received ID_I (type keyid [00725100036213801010072510003621330101]) does not match peers id
2025-01-21 19:54:41.680 -0800  [INFO]: {    1:     }: 192.168.20.2[500] - 172.16.20.2[500]:(nil) closing IKEv2 SA gw_0101_007251000362134_0101:74, code 13
2025-01-21 19:54:41.680 -0800  [PNTF]: {    1:     }: ====> IKEv2 IKE SA NEGOTIATION FAILED AS RESPONDER, non-rekey; gateway gw_0101_007251000362134_0101 <====

Cause

To diagnose this issue, it is important to understand how the keyID is programmed. The example below uses a hub-spoke topology with a single hub and a branch in HA. The same concept applies to full mesh deployments with one or more devices in HA.

WORKING STATE (outputs taken prior to the issue):

 

Hub:

admin@SDWAN-hub> show system state | match info.serial
sys.s1.info.serial: 007251000362133

admin@SDWAN-hub> show vpn flow

total tunnels configured:                                     3
filter - type IPSec, state any

total IPSec tunnel configured:                                3
total IPSec tunnel shown:                                     3

id    name                                                            state   monitor local-ip                                        peer-ip                                         tunnel-i/f  mode
--    --------------                                                  -----   ------- --------                                        -------                                         ----------  ----
1     tl_0101_007251000362138_0101                                    active  up      192.168.20.2                                    172.16.20.2                                     tunnel.912  tunnel
4     tl_0101_007251000362138_0102                                    active  up      192.168.20.2                                    172.16.40.2                                     tunnel.913  tunnel

admin@SDWAN-hub# show template network ike | match "local-id\|peer-id"
set network ike gateway gw_0101_007251000362138_0101 local-id id 00725100036213801010072510003621330101
set network ike gateway gw_0101_007251000362138_0101 peer-id id 00725100036213801010072510003621330101
set network ike gateway gw_0101_007251000362138_0102 local-id id 00725100036213801020072510003621330101
set network ike 

 

Branch:

admin@SDWAN-branch(active)> show system state | match info.serial
sys.s1.info.serial: 007251000362138
peer.sys.s1.info.serial: 007251000362134

admin@SDWAN-branch(active)> show vpn flow

total tunnels configured:                                     2
filter - type IPSec, state any

total IPSec tunnel configured:                                2
total IPSec tunnel shown:                                     2

id    name                                                            state   monitor local-ip                                        peer-ip                                         tunnel-i/f  mode
--    --------------                                                  -----   ------- --------                                        -------                                         ----------  ----
1     tl_0101_007251000362133_0101                                    active  up      172.16.20.2                                     192.168.20.2                                    tunnel.900  tunnel
4     tl_0102_007251000362133_0101                                    active  up      172.16.40.2                                     192.168.20.2                                    tunnel.901  tunnel

The output above shows that the hub references the serial of the active branch device in its tunnel names. Likewise, the branch references the serial of the hub device in its tunnel names. The keyID configuration on each peer is also derived from the respective serial numbers. For example, the CLI output from the hub above shows that the local and peer keyID for the IKE gateway gw_0101_007251000362138_0101 are both 00725100036213801010072510003621330101.
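The relationship between the serials and the keyID can be made visible by splitting a keyID into fields. The following is a minimal Python sketch, not a documented format: the field widths (a 15-digit serial followed by a 4-digit tag such as 0101, repeated twice) are an assumption inferred only from the sample values in this article, and `split_key_id` is a hypothetical helper name.

```python
from typing import NamedTuple

# Assumptions inferred from the sample outputs in this article only.
SERIAL_LEN = 15  # length of the serials shown by "show system state"
TAG_LEN = 4      # length of the 4-digit field (for example 0101) seen in gateway names


class KeyIdParts(NamedTuple):
    first_serial: str
    first_tag: str
    second_serial: str
    second_tag: str


def split_key_id(key_id: str) -> KeyIdParts:
    """Split a keyID into serial/tag fields using the assumed widths."""
    expected = 2 * (SERIAL_LEN + TAG_LEN)
    if len(key_id) != expected or not key_id.isdigit():
        raise ValueError(f"expected {expected} digits, got {key_id!r}")
    a = SERIAL_LEN
    b = a + TAG_LEN
    c = b + SERIAL_LEN
    return KeyIdParts(key_id[:a], key_id[a:b], key_id[b:c], key_id[c:])


# keyID taken from the hub output in the working state
parts = split_key_id("00725100036213801010072510003621330101")
print(parts.first_serial)   # branch serial: 007251000362138
print(parts.second_serial)  # hub serial:    007251000362133
```

Splitting a keyID this way makes it easy to see which serial a given device was provisioned with.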

During auto-provisioning, the serial number of the active device in an HA deployment is extracted and programmed into the IKE gateway name, the IPsec tunnel name, and the keyID values. This applies to plugin releases earlier than 2.2.5, 3.0.8, 3.1.3, 3.2.1, and 3.3.0. In these plugin versions or later, the plugin instead compares the serial numbers of the HA pair and selects the lower of the two, regardless of which device is active or passive.
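The selection rule described above can be sketched in Python. This is an illustration of the described behavior, not the plugin's actual code: the function names are hypothetical, version strings are assumed to start with X.Y.Z, and the handling of release trains not listed in this article is an assumption.

```python
import re

# First release in each train that selects the lower serial (from this article).
LOWER_SERIAL_FROM = {(2, 2): 5, (3, 0): 8, (3, 1): 3, (3, 2): 1, (3, 3): 0}


def uses_lower_serial(plugin_version: str) -> bool:
    """Return True if the plugin version selects the lower serial of an HA pair."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", plugin_version)
    if not m:
        raise ValueError(f"unrecognized version: {plugin_version!r}")
    major, minor, patch = (int(g) for g in m.groups())
    train = (major, minor)
    if train in LOWER_SERIAL_FROM:
        return patch >= LOWER_SERIAL_FROM[train]
    # Assumption: trains newer than the listed ones include the change;
    # all other unlisted trains are treated as using the old behavior.
    return train > max(LOWER_SERIAL_FROM)


def provisioning_serial(active: str, passive: str, plugin_version: str) -> str:
    """Serial used for the gateway name, tunnel name, and keyID."""
    if uses_lower_serial(plugin_version):
        # Equal-length digit strings compare the same way as their numeric values.
        return min(active, passive)
    return active


# Branch HA pair from this article: active ...138, passive ...134
print(provisioning_serial("007251000362138", "007251000362134", "3.0.7"))
print(provisioning_serial("007251000362138", "007251000362134", "3.0.8"))
```

With the branch pair from this article, the selected serial changes from the active device's serial to the passive device's serial once the plugin reaches 3.0.8.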

Because the keyID and other attributes of the SD-WAN tunnel are derived from the serial, any change to the serial in use (for example, after an RMA) can impact tunnel establishment.

To illustrate the root cause of the issue described in this article, the plugin used to manage the hub and spoke in this example has just been updated to version 3.0.8. This triggers a recalculation that selects the lower of the two serials in the HA pair, meaning that the serial of the passive branch device (007251000362134) is now used for the IKE gateway name, the IPsec tunnel name, and the keyID.

After the upgrade, a commit-push was executed to the hub device only. The hub was programmed with the new tunnel parameters, including a new keyID, but because there was no simultaneous commit-push to the branch, the branch was still using the old keyID, which led to a keyID mismatch. The post-upgrade tunnel configuration on the hub is shown below.

 

BROKEN STATE (outputs taken at the time of the issue):

 

admin@SDWAN-hub> show vpn flow

total tunnels configured:                                     3
filter - type IPSec, state any

total IPSec tunnel configured:                                3
total IPSec tunnel shown:                                     3

id    name                                                            state   monitor local-ip                                        peer-ip                                         tunnel-i/f  mode
--    --------------                                                  -----   ------- --------                                        -------                                         ----------  ----
1     tl_0101_007251000362134_0101                                    inactiv down    192.168.20.2                                    172.16.20.2                                     tunnel.912  tunnel
4     tl_0101_007251000362134_0102                                    inactiv down    192.168.20.2                                    172.16.40.2                                     tunnel.913  tunnel

admin@SDWAN-hub# show template network ike | match "local-id\|peer-id"
set network ike gateway gw_0101_007251000362134_0101 local-id id 00725100036213401010072510003621330101
set network ike gateway gw_0101_007251000362134_0101 peer-id id 00725100036213401010072510003621330101
set network ike gateway gw_0101_007251000362134_0102 local-id id 00725100036213401020072510003621330101
set network ike gateway gw_0101_007251000362134_0102 peer-id id 00725100036213401020072510003621330101

However, the IKE logs in the Symptoms section at the beginning of this article show that the branch is still sending the old keyID, as highlighted by the following message:

received ID_I (type keyid [00725100036213801010072510003621330101]) does not match peers id
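The comparison between the received keyID and the configured peer IDs can also be done with a short script when many gateways are involved. The Python sketch below is an illustration only: it assumes the log message and the `set network ike gateway ... peer-id id ...` lines have the shape shown in this article, and the helper names are hypothetical.

```python
import re

# Patterns modeled on the log line and config lines shown in this article.
KEYID_RE = re.compile(r"received ID_I \(type keyid \[(\d+)\]\)")
PEER_ID_RE = re.compile(r"set network ike gateway (\S+) peer-id id (\S+)")


def received_key_id(log_line: str):
    """Return the keyID the initiator sent, or None if the line does not match."""
    m = KEYID_RE.search(log_line)
    return m.group(1) if m else None


def configured_peer_ids(config_text: str) -> dict:
    """Map each IKE gateway name to its configured peer-id."""
    return {m.group(1): m.group(2) for m in PEER_ID_RE.finditer(config_text)}


log_line = (
    "received ID_I (type keyid [00725100036213801010072510003621330101]) "
    "does not match peers id"
)
hub_config = """\
set network ike gateway gw_0101_007251000362134_0101 peer-id id 00725100036213401010072510003621330101
set network ike gateway gw_0101_007251000362134_0102 peer-id id 00725100036213401020072510003621330101
"""

received = received_key_id(log_line)
matching = [gw for gw, pid in configured_peer_ids(hub_config).items() if pid == received]
print("received keyID:", received)
print("gateways expecting it:", matching)  # empty list means no gateway matches
```

An empty list of matching gateways confirms that the hub no longer has any gateway expecting the keyID the branch is sending.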

Solution

  1. If the devices in the SD-WAN cluster can reach Panorama directly, outside of the SD-WAN tunnels, commit-push to all devices in the cluster. This ensures that every device has matching keyID parameters.
  2. If there is no Panorama reachability outside of the SD-WAN tunnels, override one of the IKE gateways on either the hub or the target branch and commit the change. This re-establishes the tunnel and restores connectivity to Panorama. In a full mesh with only branches, override the gateway configuration on one of the branch devices, set the keyID values to match those of the corresponding peer, and commit. After connectivity to Panorama is re-established, revert the IKE gateway changes by checking the box next to the IKE gateway name and clicking the revert button, but do NOT commit after reverting. Instead, perform a commit-push from Panorama.

In this example, the issue was resolved by overriding one of the gateways on the branch with the applicable keyID values relative to the corresponding peer IKE gateway on the hub. To override, navigate to one of the applicable gateway names and click on 'override':

Network > Network Profiles > IKE Gateways

This particular solution relies on having access to the branch through an onsite contact, remote access, or out-of-band management. The other option is to override the keyID on the hub to match the branch, but this requires knowing the local and peer identification values configured on the branch.

Note that the local identification and peer identification are not always the same value, so copy the values exactly when setting them. Copy the local identification value from one gateway and paste it into the peer identification field on the corresponding overridden gateway. Likewise, copy the peer identification value from that gateway and paste it into the local identification field on the corresponding overridden gateway.
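The swap described above can be expressed as a small Python sketch. `GatewayIds` and `override_values_for_peer` are hypothetical names used only for illustration, and the second pair of values in the example is a placeholder, not output from a real device.

```python
from typing import NamedTuple


class GatewayIds(NamedTuple):
    local_id: str
    peer_id: str


def override_values_for_peer(reference: GatewayIds) -> GatewayIds:
    """Values to enter on the overridden gateway, given the gateway it must match.

    The reference gateway's local ID becomes the peer ID on the overridden
    gateway, and its peer ID becomes the local ID.
    """
    return GatewayIds(local_id=reference.peer_id, peer_id=reference.local_id)


# Hub gateway gw_0101_007251000362134_0101 from the broken state
hub = GatewayIds(
    local_id="00725100036213401010072510003621330101",
    peer_id="00725100036213401010072510003621330101",
)
print(override_values_for_peer(hub))

# Placeholder values showing the swap when the two IDs differ
print(override_values_for_peer(GatewayIds(local_id="ID-A", peer_id="ID-B")))
```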



Additional Information

 

  • This issue can happen after an RMA, particularly for deployments on plugin versions earlier than 2.2.5, 3.0.8, 3.1.3, 3.2.1, and 3.3.0. On these plugin releases or newer, the issue can be triggered by replacing one device of an HA pair with a device whose serial is lower than both serials of the pair prior to replacement.
  • On the same plugin versions, the issue can also be triggered by clearing the SD-WAN plugin cache. Clearing the SD-WAN cache is generally not recommended; however, if the cache is cleared, a commit-push to all members of the cluster is mandatory.
  • When replacing a device in an HA pair, refer to the SD-WAN replacement KB as well as the admin guide.
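Before an RMA on the newer plugin releases, it can help to check whether the replacement would change the serial selected for the pair. The Python sketch below applies only the "lower of the two serials" rule stated in this article. The function name and the replacement serials are hypothetical, and the sketch also flags the case where the lower-serial device itself is replaced, which follows from the same rule but is an assumption rather than a documented behavior.

```python
def replacement_changes_serial(current_pair, replaced, new_serial):
    """Return True if swapping `replaced` for `new_serial` changes the lower serial.

    Applies only to plugin releases that select the lower serial of the HA pair.
    Serials are compared as strings; this matches numeric order when all
    serials have the same number of digits.
    """
    if replaced not in current_pair:
        raise ValueError(f"{replaced} is not in the current HA pair")
    new_pair = tuple(new_serial if s == replaced else s for s in current_pair)
    return min(current_pair) != min(new_pair)


pair = ("007251000362138", "007251000362134")

# Hypothetical replacement serials for illustration only
print(replacement_changes_serial(pair, "007251000362138", "007251000362130"))
print(replacement_changes_serial(pair, "007251000362138", "007251000362139"))
```

If the check reports a change, plan a commit-push to all members of the cluster as part of the replacement.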

Last Updated:
‎01-29-2025 01:18 PM