Layer 3 Stops Passing - All PanOS versions incl. 6.1.3


L1 Bithead

I opened a case with TAC a while ago, but I continue to have issues with Layer 3 traffic not passing through the untrust/internet interface at random times. This has happened 5 to 10 times on different PA-200s, and some units have repeated. I was hoping a firmware upgrade to 6.1.3 would finally fix this, but yesterday one of my first 6.1.3 units locked up. Layer 2 is fine: the ARP entry for the PAN is in my router, and when I clear the ARP table it repopulates with the MAC/IP because the PAN responds correctly.


Rebooting the router does nothing to get the PAN passing Layer 3 again. The only way to get the PAN to pass Layer 3 again is to reboot the PAN itself. We are running LSVPN on all spoke sites for VPN, and the only curveball is that my hubs are on older 6.0.5h3 code. Just throwing this out there for discussion in case others have seen it.

13 REPLIES

L7 Applicator

I don't have a real solution, but you could try just restarting the routing service instead of the entire PAN device.

>debug routing restart

>debug software restart routed

Steve Puluka BSEET - IP Architect - DQE Communications (Metro Ethernet/ISP)
ACE PanOS 6; ACE PanOS 7; ASE 3.0; PSE 7.0 Foundations & Associate in Platform; Cyber Security; Data Center

L6 Presenter

Try to upgrade hubs.

panos - We have upgrades to 6.1.3 scheduled for this week.  Are there any particular fixes in there that would address what I am seeing?  I am also working on applying 6.1.3 to all spokes as well.  6.1.2 is not very stable for us.  I guess I want to make sure we're not just throwing spaghetti at the wall just because we have a pot of it.  I will try anything but if there are some known issues addressed, I'd like to know. 

Now that you mention LSVPN, I know that there is an issue with satellites being upgraded to 6.1.2. After the upgrade, satellites generate an error that the config retrieval from the portal fails, and the tunnel never comes up. I suspect that this issue is also in 6.1.3.

L4 Transporter

Hi

I had the same issue as you, but more often I have an issue with running out of system resources on my PA-200. In my case it started in 11.2014 when I moved to 6.1.0, and it continued on 6.1.1 and 6.1.2.

I have now been on 6.1.3 for two days. It is too early to tell whether anything has changed.

According to support, this issue will be fixed in 6.1.4 (I have a case open for this bug).

Regards

SLawek

hi dusk2dusk

Do you mind posting more details regarding the instability on 6.1.2? We are planning to go from 6.0.3 to 6.1.2, maybe 6.1.3, but we are not sure how stable these two have been so far.

What model of firewall are you using?

Thanks

Specifically, we use GlobalProtect LSVPN to connect all 55 current remote sites, going to 70 shortly. With 6.1.2 on remote sites and 6.0.5h3 on the hubs in the datacenter, we would see intermittent issues where routes to remote sites were not installed in the RIB on the hub, so the tunnel was declared active on both sides but no traffic passed between remote and hub. Also, on satellite remote sites we would see a full dataplane lockup on Layer 3, so no traffic was routed until we rebooted the dataplane or the entire firewall on the remote site. Layer 2 and management were fine, but that's not worth much.

6.1.3 on hubs and remote sites seems a great combination so far after the past week. If you're considering 6.1.2, I would say either hold off or go to 6.1.3. The "gold standard" is currently 6.0.8, but with LSVPN and the versions we already had in production, 6.1.3 was the right choice. I'd say at this point there is at least a 90% improvement over prior versions of code with 6.1.3.

Thank you very much; we are not using VPNs, yet, on Palo Alto devices.

I would say that the typical manual IPsec VPNs tend to work fine without issue, so the problem is limited to GlobalProtect prior to 6.1.3, and at this point we're solid on this version.

Keep in mind that there are A LOT of bug fixes in 6.1.3 over 6.1.2. I would scour the list and check it against however you are using the PANs, in case a bug that affects you is addressed in 6.1.3.

One related thing I went through: make sure you do not have any management services open to the internet without a management ACL. I initially had a problem where, with management left wide open, the root partition filled up with failure logs and the like. I had to have PAN TAC log in and clear it. Once I closed it down so I could only access it directly from my datacenter's public IPs, we have not had another issue with resources.
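To illustrate the management ACL point, here is a minimal sketch in Python that flags overly broad entries in a list of permitted management source networks. It is not a PAN-OS tool and does not talk to the firewall; the function name `overly_broad`, the `max_hosts` threshold, and the sample networks are all assumptions made for illustration. You would paste in the permitted networks from your own management interface settings by hand.

```python
import ipaddress

def overly_broad(permitted, max_hosts=256):
    """Return the entries that cover more than max_hosts addresses.

    `permitted` is a hand-copied list of CIDR strings (hypothetical input);
    `max_hosts` is an arbitrary threshold chosen for this sketch.
    """
    flagged = []
    for entry in permitted:
        # strict=False accepts host addresses written with a prefix length
        net = ipaddress.ip_network(entry, strict=False)
        if net.num_addresses > max_hosts:
            flagged.append(entry)
    return flagged

# Example with documentation-range addresses, not real ones
print(overly_broad(["203.0.113.0/28", "0.0.0.0/0"]))
```

The check only looks at the entries it is given, so a list that is missing entirely still has to be caught by reviewing the device itself.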

Great. Thank you, I really appreciate it.

L1 Bithead

Also note that there is a known bug with LSVPN where you can get a 'dataplane tunnel install error', which requires a total reboot of the PA-200. The bug is 78613, and it will be fixed in the new OS version 7.0.0, to be released sometime in May. I have only had one site out of my 50+ have this issue since starting on 6.1.3 code about 2-3 weeks ago. Luckily it is not a complete dataplane lockup like I have had on previous versions, and I could easily pop a reboot on this one.

L1 Bithead

Continuing to have issues with Large-Scale VPN on 6.1.4 with 65 PA-200 satellite sites. Several times a week, a site will sit on "reconnecting" until I manually reconnect it.

Sometimes the satellite will also lose the seed route until I pop the VPN manually, or else the gateway will lose the route to the satellite. Resetting the tunnel on either the gateway or the satellite seems to bring it back up. The result is that the tunnel monitor route still connects correctly, so the site may be pingable and your monitoring will say it's up, but your users will report no connectivity.

LSVPN is supposed to be less hassle than manually setting up all the VPN tunnels, but it just is not stable or reliable at all. It seems to be getting a little better with each release, but I still get woken up almost every night with a site offline. If somebody would guarantee that 7.0.0 would fix things I would consider moving to it, but I would bet it introduces more unreliability than improvements. About half my tickets go unresolved, and TAC has no idea why these things are happening. We paid 50 grand for the GlobalProtect licensing, which we do not use for remote access because JunOS Pulse is solid, only for LSVPN. I feel like we are the only company using LSVPN, or the only one voicing the instability.
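The "monitoring says up, users say down" situation above can be caught by probing two addresses per site instead of one. Below is a minimal sketch in Python, assuming you have some reachability check of your own (ICMP, TCP connect, whatever your monitoring uses) that you pass in as `probe`. The names `classify_site`, `monitor_ip`, and `lan_host_ip`, and the sample addresses, are hypothetical and not part of PAN-OS or LSVPN.

```python
from typing import Callable

def classify_site(probe: Callable[[str], bool],
                  monitor_ip: str,
                  lan_host_ip: str) -> str:
    """Classify one satellite site as 'up', 'tunnel-only', or 'down'.

    probe       -- caller-supplied function returning True if an IP answers
    monitor_ip  -- the address the tunnel monitor already checks
    lan_host_ip -- a host behind the satellite that real users depend on
    """
    monitor_ok = probe(monitor_ip)
    lan_ok = probe(lan_host_ip)
    if lan_ok:
        return "up"           # user traffic is actually getting through
    if monitor_ok:
        return "tunnel-only"  # tunnel answers, but routes to the LAN are gone
    return "down"

# Usage with a fake probe: only the tunnel monitor address answers
reachable = {"10.255.0.1"}
print(classify_site(lambda ip: ip in reachable, "10.255.0.1", "10.20.30.10"))
```

A "tunnel-only" result is the case described here, where the tunnel looks healthy but the seed route or the gateway's route to the satellite has been lost.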
