Upgrading heavily used PaloAlto Firewalls


Hi,

 

I am about to upgrade some Palo Alto firewalls with tens of vsys, and I am wondering what would be a good report to generate to identify traffic flows for pre- and post-upgrade checks, as well as to identify any impact to services.

 

any help will be appreciated.

 

Regards,

 



@qasim02,

I really wouldn't rely on a report for this. Talk to your application owners and have them verify their applications once the new firewall is installed. The firewall reports are just going to tell you what traffic is passing, but that isn't going to tell you if things are actually working properly. 

Thanks @BPry 

I am not replacing the firewall or putting a new firewall in, but rather upgrading PAN-OS on the existing firewalls.

I just need a report as a head start, so I can forward it to the different teams and say that this is the current picture of the flows from the firewall side.

Is this something you can help with, please?

 

Kind regards,

@qasim02 

 

As @BPry mentioned, that is the best way to verify that applications are working.

We have many public-facing applications behind our PA, and whenever I do a PAN-OS upgrade I email the application team that the upgrade is done, and they then verify that the applications are working fine.

On the PA itself you can only see the session counts in the GUI and the traffic logs showing that traffic is passing for those applications, and assume all is good.

 

Regards

MP


@qasim02,

Again, a report really isn't going to tell you or the other teams much of anything. It'll tell you what the top source or destination addresses are, or you can break it down by server and get the same for every single device in your environment; the thing is, that doesn't really help establish anything.

If you have a proper change management process in place, all of the other major stakeholders should know that an upgrade is going to take place at a specific time on a specific date. Post upgrade, you simply ask application owners to verify that things are working as expected. If it was working before, it should stay working; if it was broken before, it should still be broken.

 

You can really break upgrades down into two different categories.

 

Major Release Upgrades

There's a chance that things go wrong here, or that newly activated dynamic content like threat signatures or App-IDs could cause issues. When I perform these upgrades I'm looking for the following things.

  1. Normal Traffic Logs
    1. Your session logs shouldn't be showing an abnormal amount of aged-out traffic.
    2. Your session logs shouldn't be showing new internal traffic getting denied due to new App-IDs. For example, if you've allowed SSL on port 443 and the traffic is now being identified as Jira, that could cause issues. You're looking for new classifications that could be breaking traffic.
  2. Normal Threat Logs
    1. Are you seeing an increase in identified threats?
    2. Are you seeing any new threats being identified on internal zones?
  3. User-ID
    1. Is the user-id database being propagated properly?
    2. Are users showing in the format you expect for your firewall (e.g. DOMAIN\USER or USER@DOMAIN)?
  4. Decryption
    1. Is outbound decryption working properly?
    2. Is inbound decryption working properly?
  5. NAT
    1. Are my publicly accessible resources still publicly accessible? 
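For the traffic-log items in that checklist, a rough sketch of how two log exports could be compared might look like the following. It assumes you have exported the traffic log to CSV once before and once after the upgrade, and that the export has columns named "Application", "Action" and "Session End Reason"; those header names, and the helper names themselves, are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
import csv
from collections import Counter

# Actions treated as "blocked"; trim or extend to match your policy.
BLOCKED_ACTIONS = {"deny", "drop", "reset-client", "reset-server", "reset-both"}


def load_rows(path):
    """Read an exported traffic-log CSV into a list of dicts (one per session)."""
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return list(csv.DictReader(fh))


def deny_counts(rows, app_col="Application", action_col="Action"):
    """Count blocked sessions per application name."""
    return Counter(
        row[app_col]
        for row in rows
        if row[action_col].strip().lower() in BLOCKED_ACTIONS
    )


def newly_denied(pre_rows, post_rows):
    """Applications blocked after the upgrade that were not blocked before."""
    pre = deny_counts(pre_rows)
    post = deny_counts(post_rows)
    return {app: count for app, count in post.items() if app not in pre}


def aged_out_share(rows, reason_col="Session End Reason"):
    """Fraction of sessions that ended with reason 'aged-out'."""
    if not rows:
        return 0.0
    aged = sum(1 for row in rows if row[reason_col].strip().lower() == "aged-out")
    return aged / len(rows)
```

Running `newly_denied(load_rows("pre.csv"), load_rows("post.csv"))` (file names are placeholders) would list applications that only started getting blocked after the upgrade, which is the "SSL is now identified as Jira" case. Comparing `aged_out_share` for the two exports gives a rough signal for an abnormal amount of aged-out traffic. It only tells you something looks different; it does not prove an application works.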

Maintenance Releases:

I'm looking at the same things as above, but I'm going to spend less time looking at logs on my end. You aren't getting any new signatures that weren't already active (unless you're also manually activating new content updates), and you shouldn't expect anything to really change. I'm still looking at logs and verifying that publicly accessible things are available, but generally speaking these just work.

 

Helpful Hints:

  • Automation is your friend when it comes to updates. I have a script that checks the status of all of our internal and public websites and services prior to performing the upgrade. Then I validate that the same responses are recorded once the upgrade is complete. Toss the results into something like Redis and simply verify that they are the same pre and post upgrade. In essence, what I'm checking in an automated fashion is that a site which gave me a 200 prior to the upgrade is still giving me a 200. It's also a really good thing to have when someone reports that the upgrade broke a service; if you can prove it was giving a 403 response prior to the upgrade, then the upgrade didn't cause the 403. That issue already existed, and you have the logs to prove it.
  • Automate as many of your health checks as possible. If you have a monitoring solution that does this for you then great, but if you don't, build your own scripts to do it. This doesn't just go for websites, but also critical databases, applications, etc. If you have a way to automate a service/health check (and you should), then do it. Don't rely on the application owner to do this for you. Verify critical service reachability following the upgrade with the same script so you know right away if something is having issues post upgrade.
  • If you don't already have one, create a catch-all security rule for denied internal traffic that you can monitor easily. In other words, have an entry that captures all of the denied traffic from your internal zones so you can quickly filter on that rulebase entry. This will allow you to simply check the session logs for denied traffic following the upgrade to verify that a new App-ID isn't causing you any issues.
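As a minimal sketch of the pre/post status check described in the first hint (this is not the actual script mentioned above; the function names and file layout are assumptions for illustration, and it stores results in a JSON file rather than Redis to stay self-contained):

```python
import json
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the HTTP status code for url, or an error label if unreachable."""
    try:
        # Note: urlopen follows redirects, so this records the final status.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # a 4xx/5xx is still a recorded response
    except (urllib.error.URLError, OSError) as exc:
        return f"error: {type(exc).__name__}"


def snapshot(urls, probe_fn=probe):
    """Record one result per URL."""
    return {url: probe_fn(url) for url in urls}


def save_snapshot(path, results):
    """Persist a snapshot so it can be compared after the upgrade."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2, sort_keys=True)


def load_snapshot(path):
    """Load a previously saved snapshot."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def diff_snapshots(pre, post):
    """Return {url: (before, after)} for every URL whose result changed."""
    return {
        url: (pre.get(url), post.get(url))
        for url in sorted(set(pre) | set(post))
        if pre.get(url) != post.get(url)
    }
```

Before the upgrade you would call `save_snapshot("pre.json", snapshot(urls))` with your own list of URLs, and afterwards `diff_snapshots(load_snapshot("pre.json"), snapshot(urls))`. An empty result means every site answered the same way as before; a site that returned 403 both times does not show up, which is exactly the evidence that the upgrade did not cause it.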

 

The problem with what you're looking to provide is that you simply can't tell applications are actually working based off of traffic flows. You might notice something abnormal or identify a new app-id that wasn't accounted for, or a new threat signature causing false-positive matches on your internal traffic, but you won't be able to say for sure that the application is actually working as intended. You're really only able to say when it's broken from a network aspect. 

 

I absolutely wouldn't recommend giving an all-clear following a major version upgrade on the firewall, or even really a minor upgrade. I perform the upgrade and do my basic service checks to verify that, from a networking aspect, the upgrade is done and appears to be working properly. It's then up to the application/service owners to verify that their things are actually working.

@BPry @MP18 

Many thanks, guys, for the invaluable and swift insight. It is highly appreciated!


By the way, what do you use to script your pre/post checks? I will try to give it a shot as well.

Any chance you can share your script so I can adapt it to my environment?
