Great - one last question. What if I have a comma-delimited file that I want to parse various fields out of? For example:
#IP, date, category, ...
1.1.1., 2017-01-01, bot, ...
You could have success using the minemeld.ft.csv.CSVFT class to parse the CSV file, skipping the comment line with the ignore_regex parameter. See the docs about the parameters accepted by that Miner class here:
And you can use bambenekconsulting.c2_ipmasterlist as a starting prototype for your experiments.
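To illustrate the idea, here is a minimal standalone Python sketch of skipping comment lines with a regex and parsing the remaining rows as CSV. It uses only the standard library, not MineMeld itself. The function name parse_feed, the field names, and the sample rows are assumptions made up for this example, and the real CSVFT class may behave differently in detail.

```python
import csv
import re

def parse_feed(lines, ignore_regex=r"^#",
               fieldnames=("indicator", "date", "category")):
    """Yield one dict per CSV row, skipping lines that match ignore_regex."""
    ignore = re.compile(ignore_regex)
    # Drop blank lines and anything matching the ignore pattern (e.g. "#..." comments)
    kept = (line for line in lines if line.strip() and not ignore.match(line))
    # skipinitialspace strips the blank that follows each comma in the feed
    reader = csv.DictReader(kept, fieldnames=fieldnames, skipinitialspace=True)
    for row in reader:
        yield row

# Hypothetical sample feed (documentation-range IP addresses)
sample = [
    "#IP, date, category",
    "192.0.2.1, 2017-01-01, bot",
    "198.51.100.7, 2017-01-02, scanner",
]
rows = list(parse_feed(sample))
```

In a real prototype the same two ideas (a regex for lines to ignore and a list of field names) would be expressed as configuration parameters of the Miner rather than as Python code.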
Thanks for all your help so far...one other question - if the feed you're downloading is gzipped, what is the appropriate way to gunzip the file for processing within minemeld?
If the file is compressed in gzip by the HTTP server on the fly (https://en.wikipedia.org/wiki/HTTP_compression), the Python library used by the CSV and HTTP Miners (that is, python requests) should automatically take care of decompressing it.
If instead the feed itself is a gzip file, you need a new Miner subclassing the HTTP or CSV Miner to decompress gzip on the fly. This is possible and easy to do, but it requires some coding.
Thanks for your reply. Actually, the file is stored on the webserver gzipped, so I think I will need to code something myself to gunzip the file.
Is there an example somewhere I can look at for reference?
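As a starting point, the decompression step on its own could look like the sketch below. It is written for Python 3 and uses only the standard library; the function name iter_gunzipped_lines and the sample payload are assumptions for illustration, and wiring it into a Miner subclass is not shown because that depends on the MineMeld class you extend.

```python
import gzip
import io

def iter_gunzipped_lines(compressed_bytes, encoding="utf-8"):
    """Decompress a gzip payload and yield its text lines without line endings."""
    with gzip.GzipFile(fileobj=io.BytesIO(compressed_bytes)) as gz:
        for raw_line in gz:
            yield raw_line.decode(encoding).rstrip("\r\n")

# Stand-in for the raw body of an HTTP response for a file stored gzipped
# on the server (for example, the bytes in requests.get(url).content).
payload = gzip.compress(b"#IP, date, category\n192.0.2.1, 2017-01-01, bot\n")
lines = list(iter_gunzipped_lines(payload))
```

The resulting lines can then be filtered and parsed exactly like an uncompressed feed.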
Thanks for the additional tips, it'd be great to get those in the documentation if possible. I mean these two additional steps:
That guide should be updated; there are 2 additional steps:
Actually, do you think we could get a guide on writing external extensions? Maybe it could replace the existing "write a simple miner" guide in the wiki.
I had the same issues in writing my miner (this one for Imperva's "Incapsula" cloud WAF public IP ranges), though after rebooting the VM it seems to have successfully updated everything and the miner is functional. I'm attaching the following files:
/opt/minemeld/engine/core/minemeld/ft/incapsula.py /opt/minemeld/local/prototypes/incapsula.yml /opt/minemeld/engine/core/nodes.json
I've looked at the youtube-miner repo but as a non-developer would find it a little helpful to get a high-level outline of the required structure for an external extension. It would be nice to be able to rewrite this standard miner as an extension.
I've been trying to rewrite my incapsula miner as an external extension by parroting the youtube-miner example, but after installing it via the external extension menu under System > Extensions > Git and successfully activating it, I get the "COMMIT FAILED: Unknown node class minemeld.ft.incapsula.IPv4 in miner_incapsula_ipv4" in the web UI.
I am attaching my minemeld-engine.log, minemeld-web.log, and supervisor.log. Also, here is the link to the github repo containing the extension:
I'd be very appreciative of any pointers you could provide! I'm assuming there is some additional config required in my extension in order to force an update of the local nodes.json in my minemeld VM?
@nbilal : There are a couple of issues.
First, you're duplicating entry points in the minemeld.json file. The second entry should be "incapsulaminer.IPv6" instead of "incapsulaminer.node:IPv4".
Then, in the prototype file (incapsula.yml), you should reference these entry points (incapsulaminer.IPv4 and incapsulaminer.IPv6) instead of the nonexistent ones, minemeld.ft.incapsula.IPv4 and minemeld.ft.incapsula.IPv6.
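The mismatch behind the "Unknown node class" error can be pictured with a small Python sketch. The dictionary below is a hypothetical stand-in for the entry points an extension registers, not the actual minemeld.json schema; the "incapsulaminer.node:IPv6" target and the function name find_unknown_classes are assumptions for illustration only.

```python
def find_unknown_classes(entry_points, prototype_classes):
    """Return the prototype class names that no entry point registers."""
    registered = set(entry_points)
    return sorted(c for c in prototype_classes if c not in registered)

# Hypothetical entry points: registered name -> "module:Class" target
entry_points = {
    "incapsulaminer.IPv4": "incapsulaminer.node:IPv4",
    "incapsulaminer.IPv6": "incapsulaminer.node:IPv6",
}

# Class names as referenced by the prototype file before and after the fix
broken = ["minemeld.ft.incapsula.IPv4", "minemeld.ft.incapsula.IPv6"]
fixed = ["incapsulaminer.IPv4", "incapsulaminer.IPv6"]

unknown_before = find_unknown_classes(entry_points, broken)
unknown_after = find_unknown_classes(entry_points, fixed)
```

With the old class names both prototypes point at names that were never registered; once the prototype file uses the registered names, nothing is left unresolved.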
Thanks @xhoms. ...rookie mistakes! I also had to fix a bad import statement (minemeld.ft can be referenced as "." in a local miner, but the full path "minemeld.ft.x" must be given in the external extension).
We are good to go!
Thanks again for your support,