I am exploring the best method to export and create reports from log data on a PA-4020, and have a few questions. Whether we need real-time reporting hasn't been determined yet, but since I can essentially get real-time data from the PA GUI, I am thinking it will not be a requirement. That being said, I think a twice-daily or four-times-daily FTP export may be what we pursue. When I run this export I want to be sure that I am only exporting "new" data, and that no duplicate data will get exported. I am assuming this is the case, but want a definite yes or no from someone here. How much overhead will the box incur during these exports? Will running these exports every 6 hours create problems for GUI usability? I am assuming that this takes place on the management plane and will not affect our normal network traffic.
Many thanks in advance!
At the present time each log export is a 24-hour export. So if you schedule an export every four hours, each file will contain 24 hours of log data.
The PA firewall does not currently have the ability to do smaller log exports on a schedule. If you would like to see this feature added to the product please contact your sales rep and have them submit a feature request on your behalf.
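Because every scheduled export covers a full 24 hours, files taken a few hours apart will overlap. Below is a minimal sketch of dropping the overlap on the receiving side. It assumes that each log line is unique as a whole row (for example because it carries a timestamp or sequence number), which is an assumption about the export and not something confirmed here. The function name `new_rows` is made up for illustration.

```python
import csv
import hashlib
import io


def new_rows(csv_text, seen):
    """Yield rows from one export that did not appear in earlier exports.

    `seen` is a set of row digests that the caller keeps between exports.
    Assumption: a whole row identifies one log entry, so two identical
    rows are treated as the same entry and the second one is dropped.
    """
    for row in csv.reader(io.StringIO(csv_text)):
        # Hash the joined fields so the set stays small for large exports.
        digest = hashlib.sha1("\x1f".join(row).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield row
```

Calling it once per downloaded file with the same `seen` set returns only the lines that were not in any previous file. For files in the gigabyte range you would read from the file object directly instead of a string, and persist `seen` between runs.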
As far as I know, the data is still present on the PAN unit (until it runs out of disk space and therefore deletes log lines to free up space. I think there is even a setting for this, for example if you want it to block all traffic when there is no room to log and you won't allow it to free up space on its own).
What is dumb?
The log space on the PAN in default mode works like a ring buffer (or FIFO, for that matter): the oldest entries get deleted.
Approximately 2 years ago I filed a feature request that the FTP export of logs from the PAN should be compressed using "gzip -9" before being sent to the FTP server, but I don't know what happened to that request. Perhaps someone else knows the status of compressing logs when exporting?
Another method is to use a live syslog feed (even in CEF mode, if you export to an ArcSight setup) and create filters on the syslog server for what you wish to archive (or, for that matter, how you wish to archive/compress the logs).
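Until something like that exists on the box, the same effect can be had on the FTP server after the file arrives. A minimal sketch, assuming the export has already been written to local disk; the function name and paths are hypothetical. `compresslevel=9` corresponds to `gzip -9`.

```python
import gzip
import shutil


def gzip_max(src_path, dst_path):
    """Compress src_path into dst_path at the highest gzip level."""
    with open(src_path, "rb") as src, \
            gzip.open(dst_path, "wb", compresslevel=9) as dst:
        # Copy in chunks so a multi-gigabyte export never sits in memory.
        shutil.copyfileobj(src, dst)
```

For example, `gzip_max("export.csv", "export.csv.gz")` leaves the original file in place, so it can be removed once the compressed copy has been checked.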
"What is dumb?" --> The fact that the unit creates such a massive CSV file, with no way to tune that file in terms of either size or which fields are stored. We have had nothing but issues with our third-party log management product (which is in no way Palo's fault), but as a result, I've attempted a number of times to generate simple reports from the raw CSV files, and it just can't be done when the file reaches such a massive size. As an example, try opening a 2GB CSV file in Excel. Excel will choke badly on that. I wouldn't ever suggest doing that, but there may be cases where it's needed, like ours, where we're in transition from a third-party tool to Panorama for log management/reporting.
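For simple reports on a file of that size, a script that reads one row at a time avoids the problem of loading everything at once. A minimal sketch, assuming the CSV has a header row; the column name `app` in the usage note is a placeholder, not a confirmed field name.

```python
import csv
from collections import Counter


def count_by_column(lines, column):
    """Count rows per distinct value of `column`, one row at a time.

    `lines` can be an open file or any iterable of CSV lines, so memory
    use depends on the number of distinct values, not on the file size.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts
```

Typical use would be `with open("export.csv", newline="") as f: top = count_by_column(f, "app").most_common(20)`, which gives a small result that Excel can open without trouble.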
Palo is a great product, but at times I feel as though the marketing got in the way. Small features that I've come to love from my former Bluecoats are not even in the roadmap? It boggles my mind. Such an advanced firewall on some pieces, such limited options on others.
Ahh yes, but using Excel for such a task is just wrong in so many ways :-)
Regarding Bluecoat, their own Reporter tool couldn't deal with the amount of logs our network produced without crashing, so I had to fix the log extraction problem on my own (with my own tools written in Perl and using MySQL as the backend; roughly 1 GB of gzip-compressed log file per day, and this was back in 2008 with a limited number of users =)
Did you file this as a feature enhancement, so one can select a smaller size or smaller time ranges for what is about to get exported?
I agree that historical logging and reporting are rather disappointing for PA, at least the last time I got to play around with these. If I remember correctly, Panorama still suffered from some of the same reporting shortfalls as the firewall itself, hence my desire to export the logs to another location. Hopefully this has changed. I am hoping to do some more research on this in the upcoming months. I have learned that one of the products I am familiar with, ManageEngine's Firewall Analyzer, has just recently started supporting PA. I cannot speak to the verbosity of the reports, or how easy it is to dig for the juicy data the PA outputs, but in my experience with this product and other firewalls it does a pretty good job. The reports that I'd be interested in seeing, and am hoping exist, would be very detailed web usage, user activity, threat activity, etc. I plan on using their trial to experiment and see if it meets my needs. When that happens I'll be happy to post my findings.
And while I was typing this I received notification of your question about whether I filed a feature request. I have not, primarily because my project to implement these was put on hold.
I do have a feature request in queue to allow for the selection of fields that get logged, but not adjustment of the size itself. The ability to remove the unnecessary fields would easily fulfill my need for smaller files since about 1/3 - 1/2 of the data is not used in my case.
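In the meantime the unused fields can be stripped after the export. A minimal sketch, assuming the file has a header row; the column names passed in `wanted` are whatever your own export uses, and the ones in the usage note are placeholders.

```python
import csv


def keep_columns(src, dst, wanted):
    """Copy CSV rows from src to dst, keeping only the `wanted` columns.

    src and dst are open text file objects. Columns not listed in
    `wanted` are dropped; rows are streamed one at a time.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
```

Opening both files with `newline=""` and calling `keep_columns(src, dst, ["src", "dst", "app"])` writes a reduced copy. How much smaller it gets depends on how wide the dropped fields are.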
On Bluecoat Reporter, yes, it sucked! I was referring to the ProxySGs themselves; they had a feature to adjust the size and interval of sending a log file. When they sent the file, the previous one was removed, so there was never a case where it would re-send the same log data, which, from Palo Alto's response above, is how their device would act if you were to send logs multiple times per day.
On the Panorama note, I'm on a recent implementation of it (the 4.1.1 release, with 4.1.1 on my 2050s as well) and reporting does seem to be much better now. It is still lacking one major need, though. There does not appear to be a way to generate reports on ranges of IP addresses (so for example, all of a given server farm in a datacenter). I've found a manual workaround though -- I created a new rule on my 2050s that explicitly matches that server farm range with only logging in place, no URL/etc. profiles applied. I placed it just before the final "allow any any" rule (we use our 2050s just for web filtering; there are separate firewalls between them and our Internet edge). This seems to solve the issue, as I am able to generate reports based on a given rule name.
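If you are working from exported logs instead of Panorama, the same range filter can be applied outside the box. A minimal sketch using Python's standard `ipaddress` module; the field name `src` and the function name are placeholders, and this is not how Panorama itself does it.

```python
import ipaddress


def rows_in_range(rows, network, field="src"):
    """Return the rows whose `field` address falls inside `network`.

    `rows` is an iterable of dicts (for example from csv.DictReader) and
    `network` is CIDR notation such as "10.1.2.0/24".
    """
    net = ipaddress.ip_network(network)
    return [r for r in rows if ipaddress.ip_address(r[field]) in net]
```

Feeding it the rows of an export together with the server farm's subnet gives the subset to report on, without needing the extra logging rule.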
One note though, if you're looking to add Panorama "after the fact" -- out of the box there's no way to suck in the policy from a live PA. It was a tedious process, but I was able to do so by comparing the XML files that the device and Panorama save out when doing a "save/export named configuration snapshot". It was still a lot of work though, because you have to fully nuke/erase the PA-2050's policy (well, anything that you try to push from Panorama) to make it work. I've successfully done this with 4 x 2050s now, so it's doable if you keep at it. It was easier in the long run than managing 4 independent boxes (non-HA).