on 11-23-2022 06:20 AM - edited on 11-23-2022 06:22 AM by jforsythe
Episode Transcript:
Hello, PANCasters. Welcome back and, I have to say, I’m pretty excited to be hitting episode five already! Today, we are talking about logs and why they are your best friend. If you are just joining us on the PANCast journey then make sure you have a listen to the previous episodes as well. They won’t take up too much of your time and hopefully you’ll come out of it with a few key takeaways to make sure tomorrow is more secure than today!
Before we get into some specifics, why are logs so important? Some people have the opinion that logs are an add-on, good to have but not a critical part of the overall infrastructure. I disagree. Whether we are talking about firewall logs such as traffic logs, URL logs and threat logs, or the lower-level debug logs on the firewall, without them we have no idea what’s going on. This can affect troubleshooting, compliance with local regulations and also post-incident reviews. What happens without logs? That virus gets through undetected. Dave in accounts is visiting some pretty dodgy websites due to a configuration issue but no one knows. And that network blip on Friday night may be hard to troubleshoot if we don’t have the relevant debug logs.
The second and equally important part of this is that it is no use generating all these logs if no action is taken because no one knows about them. So, along with knowing your logging setup, you really need a system to ensure logs are reviewed and actions taken.
Logs are important. Let’s discuss some details about logging on Palo Alto Networks firewalls.
First of all, as I kind of alluded to at the start, we have different types of firewall logs. The ones available on the web UI include traffic logs, URL logs, system logs and User-ID logs. Then there are debug logs, which are only available via the CLI. These are used when troubleshooting a specific issue and are not normally kept long-term as they are low-level processing logs.
Starting with the logs for user traffic, let’s go through some points. These logs are stored locally on each firewall, but each firewall has finite storage, so they cannot be kept long-term. There is a process to aggregate them over longer periods, but if your requirements are for longer-term storage then you really need to get the logs off the firewall, and there are a number of ways you can do this. Panorama and log collectors work with your firewalls to enable longer-term storage, but please be aware that how long they can keep logs depends on your logging rate and Panorama models. You can also look at using log forwarding so the firewalls or Panorama can directly forward your logs to a syslog server. And finally, Palo Alto Networks also offers Cortex Data Lake (or CDL for short), which is cloud-based storage. Not only does it offer long-term log storage, but as it now contains a wealth of data, additional apps can be integrated with CDL to offer a host of other features.
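To make the point about logging rate a little more concrete, here is a rough back-of-the-envelope sketch in Python. It only shows the arithmetic behind "finite storage divided by how fast you log". The function name, the 500-byte average log size and the other figures are illustrative assumptions, not Palo Alto Networks sizing guidance, so substitute numbers measured in your own environment.

```python
SECONDS_PER_DAY = 86_400


def estimated_retention_days(storage_gb: float,
                             logs_per_second: float,
                             avg_log_bytes: float) -> float:
    """Rough number of days of logs that fit in the given storage.

    All three inputs are assumptions you must supply yourself; this ignores
    compression, indexing overhead and per-log-type quotas.
    """
    if logs_per_second <= 0 or avg_log_bytes <= 0:
        raise ValueError("log rate and log size must be positive")
    bytes_per_day = logs_per_second * avg_log_bytes * SECONDS_PER_DAY
    return storage_gb * 1_000_000_000 / bytes_per_day


if __name__ == "__main__":
    # Hypothetical example: 1 TB of log storage, 1,000 logs/s, 500 bytes per log.
    days = estimated_retention_days(1000, 1000, 500)
    print(f"Approximate retention: {days:.1f} days")
```

With these made-up inputs the storage fills in a little over three weeks, which is why longer retention usually means moving logs off the firewall.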
One final note on these logs, specifically for things like traffic logs and URL filtering logs: Your configuration plays a role in whether logs are actually generated or not, so it is important to make sure your configuration is as you want it. For example, you can have a security policy that does not have logging enabled. In this case, any traffic that matches this policy will not generate a log. Likewise with URL filtering there are different actions. The default action called “allow” will allow the traffic but will not generate a URL log. These are key to understand in your logging deployment.
Let’s move on and discuss troubleshooting with logs. By this, I mean something is or was not working properly on the firewall so you are investigating why. As an example, there was a period over the weekend during which users reported some Internet access issues. The first thing is to take a top-down approach. Start with general logs and then step down to more specific logs. Start with the system logs at the time. Do you see anything unusual? You may see obvious things such as routing flaps but you may also notice some not so obvious signs that can help with troubleshooting. Going back to the example of users not being able to access the Internet, maybe there are no routing issues or critical alarms on the firewall but interestingly you notice that at the same time the firewall could not connect to the update server.
We can now look at the traffic logs at that time. Do we see sessions that look normal (for example they are ending with tcp-fin) or does it look like all sessions are failing? Things to suggest the sessions are failing include a lot of aged-out sessions for tcp or a lot of applications showing as incomplete. If all sessions are failing then it looks widespread, and if there are no obvious signs such as interfaces down or routing changes then it would be worth checking the upstream devices. Now that is not to say the issue is definitely not on the firewall but with no clear indicators it is worth checking some other options. If the sessions generally look ok, then filtering the logs will help and we would need some specifics such as the username or source IP address. We can then focus on those logs and again see what the sessions look like.
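If you have exported the traffic logs for the time in question, the "do the sessions look normal?" check can be roughed out in a few lines of Python. This is only a sketch: the dictionary keys `session_end_reason` and `application`, the function names and the 0.8 threshold are assumptions made for illustration, so map them to whatever your export actually contains.

```python
from collections import Counter


def summarize_sessions(rows):
    """Count session end reasons and 'incomplete' applications.

    rows: iterable of dicts with (assumed) keys 'session_end_reason'
    and 'application'.
    """
    reasons = Counter()
    incomplete = 0
    total = 0
    for row in rows:
        total += 1
        reasons[row["session_end_reason"]] += 1
        if row["application"] == "incomplete":
            incomplete += 1
    return {
        "total": total,
        "by_reason": dict(reasons),
        "aged_out_ratio": reasons["aged-out"] / total if total else 0.0,
        "incomplete_ratio": incomplete / total if total else 0.0,
    }


def looks_widespread(summary, threshold=0.8):
    """True if most sessions aged out or never completed (threshold is arbitrary)."""
    worst = max(summary["aged_out_ratio"], summary["incomplete_ratio"])
    return summary["total"] > 0 and worst >= threshold


if __name__ == "__main__":
    # Hypothetical sample rows standing in for an exported traffic log.
    sample = [
        {"session_end_reason": "aged-out", "application": "incomplete"},
        {"session_end_reason": "aged-out", "application": "incomplete"},
        {"session_end_reason": "aged-out", "application": "incomplete"},
        {"session_end_reason": "tcp-fin", "application": "web-browsing"},
    ]
    summary = summarize_sessions(sample)
    print(summary)
    print("Widespread failure?", looks_widespread(summary, threshold=0.5))
```

A high ratio is only a hint to widen the search to upstream devices. A low ratio suggests filtering the same rows by username or source IP address before drawing conclusions.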
This is an example of just using the logs available on the web UI, but from the CLI there is a range of debug logs available that can also be useful. Let’s look at another example. In this case, you have a new IPSec site-to-site VPN that is not coming up. The system logs show the tunnel is not coming up but do not have specific details as to why. From the CLI you can view the specific logs for the process that handles IPSec VPN to see if there is any further information on why the connection is failing. This may actually provide the relevant details, for example that there is a mismatch in the phase 2 config between the peers.
Just a few additional things to note regarding the CLI debug logs. These can be helpful when troubleshooting specific features, for example User-ID. There are different debug levels which can be changed, but this should only be done either with a TAC engineer on a call or with guidance from TAC. The final note about the CLI debug logs is that, as I mentioned, they are meant for real-time troubleshooting so they are not kept indefinitely. What that means practically is that if you have an issue with a firewall and you are not sure what the cause is, it is worth capturing a tech support file as soon as possible after the issue is seen so that the relevant logs can be captured for TAC to review.
The final thing I want to talk about is something in general about log review and firewall troubleshooting. When you are doing a top-down approach and starting with the system logs, if you have a specific time an issue was seen then review the logs at that time, but also have an open mind about what you are looking for. Let me finish with one last example that might help explain.
You come in on Monday morning and all users are now complaining that they cannot access a certain internal application. Everything was fine on Friday. Luckily this app is used all the time and the help desk lets you know that the issue first started at 11:30 p.m. on Saturday night. So you can start your top-down review by looking at the system logs at that time. A quick check reveals no critical alerts, and you use OSPF and there are no routing issues seen either. It all looks good, and it is time to start looking at traffic logs to get more detail. But just as you are about to move on to the traffic logs, you notice there was an apps-and-threats update at 11:30 p.m., around the time the issues first started. Now you have some additional information to help. You can continue the normal troubleshooting process but you are also now armed with the knowledge that there was an apps-and-threats update at the time. This could potentially be the cause if the update changed App-ID so that the security policy no longer allows that traffic.
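The "keep an open mind" idea in this example amounts to pulling every system log entry from around the reported start time, not just the critical ones. A minimal Python sketch of that is below. The `events_near` helper, the 15-minute window, the placeholder date and the sample event text are all hypothetical, and real system log entries will need parsing into timestamps first.

```python
from datetime import datetime, timedelta


def events_near(events, incident_time, window_minutes=15):
    """Return (timestamp, description) events within the window, oldest first.

    No filtering by severity is done on purpose: informational entries such as
    a content update can be the useful clue.
    """
    window = timedelta(minutes=window_minutes)
    nearby = [e for e in events if abs(e[0] - incident_time) <= window]
    return sorted(nearby, key=lambda e: e[0])


if __name__ == "__main__":
    # Hypothetical system log entries; the wording is made up for illustration.
    system_log = [
        (datetime(2022, 11, 19, 18, 0), "admin logged in"),
        (datetime(2022, 11, 19, 23, 29), "apps-and-threats update installed"),
        (datetime(2022, 11, 20, 6, 15), "config committed"),
    ]
    incident = datetime(2022, 11, 19, 23, 30)  # when the help desk says it started
    for when, what in events_near(system_log, incident):
        print(when.isoformat(), what)
```

Anything this returns is a lead to check alongside the traffic logs, not proof of the cause.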
And that’s a wrap for today’s episode. I hope the main thing you come away with is how important logs are but also remember:
Check out the full PANCast YouTube playlist: PANCast: Insights for Your Cybersecurity Journey.