11-30-2016 04:14 AM
We've installed MM on Ubuntu 14.04 and everything starts and seems to work OK initially.
However, after a period of time it seems to crash. Not really sure how long, but as an example I booted it yesterday, used it fine for an hour or so, and this morning it had failed.
A typical error (top right in red box) would be ERROR RETRIEVING MINEMELD CONFIG: Internal Server Error. - see screenshot attachment.
If I restart the minemeld service everything starts and all is good again for a period of time. Nothing jumps out in the logs - is there any advice you can give on things to check?
Thanks
11-30-2016 04:30 AM
I have the same problem, but my MineMeld on Ubuntu 14.05 is running a syslog miner/analyzer with a significant number of logs per second received from the firewall. It crashed every day, after about 20-30h. Luigi advised adding CPU; I now have 4x4 cores (4 GB RAM). It has been up and running for 18h, will see...
11-30-2016 05:19 AM
Thanks - will look into it.
Our deployment is fairly light - 2GB, 2vCPU - but the only processing we're doing over the default config is 2 new IP sources of ~70k addresses, so no in-line syslog processing.
Cheers
11-30-2016 05:26 AM - edited 11-30-2016 06:16 AM
Hi @apackard,
70k addresses is a really low volume for MineMeld. Would you mind sending me the minemeld-engine.log file from /opt/minemeld/log? My email address is lmori@paloaltonetworks.com
Additional things:
- Could you check the memory and disk of the instance to see if they are exhausted?
- Are you using one or more TAXII data feed output nodes? Those are memory hungry; the next release will cut the memory usage of TAXII data feeds by more than 75%.
Thanks,
luigi
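For readers following along, the memory and disk check suggested above could be scripted roughly as follows. This is a minimal sketch, assuming a Linux host with /proc mounted and Python 3.3 or later; the helper names `parse_meminfo` and `percent_used` are made up for illustration and are not part of MineMeld.

```python
import shutil


def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of name -> number (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def percent_used(total, free):
    """Percentage of 'total' that is not free; 0.0 when total is zero."""
    return 100.0 * (total - free) / total if total else 0.0


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        mem = parse_meminfo(fh.read())
    # MemAvailable only exists on kernel 3.14+; fall back to MemFree otherwise.
    avail = mem.get("MemAvailable", mem["MemFree"])
    print("memory used: {:.0f}%".format(percent_used(mem["MemTotal"], avail)))

    # /opt/minemeld is the install path mentioned in the thread.
    disk = shutil.disk_usage("/opt/minemeld")
    print("disk used: {:.0f}%".format(percent_used(disk.total, disk.free)))
```

Running it from cron every few minutes and keeping the output would show whether either figure climbs steadily in the hours before a failure.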
11-30-2016 07:38 AM - edited 11-30-2016 07:46 AM
lmori,
I've uploaded a couple of screenshots to show the current setup:-
Resource_Use => Triggered a reload of the largest IP list, showing the OS-level stats (htop) and the stats reported by the MM UI. Probably a little misleading, as CPU on the OS hits 100% but only for a few seconds (I missed it with the screenshot), and I suspect the refresh period on the MM UI means it lags a little.
Nodes => Our nodes: we've created 2 new inputs, 1 aggregator and 1 output, plus the default ones. The inputs are based on the minemeld.ft.http.HttpFT prototype
Flows => Connections
Will attach the log to another message, as it looks like 3 attachments is the max...
11-30-2016 07:39 AM - last edited on 11-30-2016 11:18 AM by lmori
Log file (replaced any sensitive names/IPs with fake strings)
11-30-2016 11:46 AM
Hi @apackard,
the volume of indicators I see in your screenshot should be handled pretty well by MineMeld with those memory and CPU resources. Would you mind also uploading the /opt/minemeld/log/minemeld-web.log file?
If you prefer you can send it directly to me at lmori@palo...
Thanks!
luigi
12-01-2016 06:33 AM
Hi @niuk,
do you just reboot the instance, or do something more?
Could you run this command before rebooting to check which process is using the most memory?
$ top -b -n 1 -o %MEM
About the disk, are you erasing files before rebooting? I am asking because it's strange that a reboot alone could free up disk space.
apackard's problem should be different; his instance is handling a pretty low volume of indicators.
12-02-2016 04:44 AM
Hi - sorry for the delay. While arranging to get the file off, I noticed that the log was flooded with these errors:-
Traceback (most recent call last):
File "/opt/minemeld/engine/0.9.28/local/lib/python2.7/site-packages/gevent/baseserver.py", line 140, in _do_read
File "/opt/minemeld/engine/0.9.28/local/lib/python2.7/site-packages/gevent/server.py", line 93, in do_read
error: [Errno 24] Too many open files
<StreamServer at 0x7fbffaa0cd90 fileno=5 address=127.0.0.1:5000 handle=<functools.partial object at 0x7fc001c0e8e8>> failed with error
I restarted and they stopped, so this may be a good indicator?
Rgds
12-02-2016 04:53 AM
I can see that the number of open files is bigger than the max on my Ubuntu too. I think it can be easily increased.
minemeld@minemeld:/opt/minemeld/prototypes/current$ lsof | wc -l
8087
minemeld@minemeld:/opt/minemeld/prototypes/current$ ulimit -a | grep open
open files                      (-n) 1024
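One caveat on the comparison above: `lsof | wc -l` counts entries for every process the user can see (including memory-mapped libraries and working directories), while `ulimit -n` is a per-process limit, and the limit of an interactive shell may differ from that of the running service. A like-for-like check reads both numbers for a single PID from /proc. The following is a minimal sketch, assuming Linux and enough privileges to read the target process's /proc entries; the function names are illustrative only.

```python
import os
import sys


def parse_nofile_limit(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None when the limit is 'unlimited' or the line is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: 'Max', 'open', 'files', soft limit, hard limit, units
            soft = line.split()[3]
            return int(soft) if soft.isdigit() else None
    return None


def count_open_fds(pid):
    """Count descriptors currently open by one process (Linux /proc only)."""
    return len(os.listdir("/proc/{}/fd".format(pid)))


if __name__ == "__main__":
    pid = sys.argv[1]  # e.g. the PID of one MineMeld process
    with open("/proc/{}/limits".format(pid)) as fh:
        limit = parse_nofile_limit(fh.read())
    print("pid {}: {} descriptors open, soft limit {}".format(
        pid, count_open_fds(pid), limit))
```

If the open count for one process sits close to its soft limit, that process is the likely source of the "Too many open files" error.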
12-02-2016 06:08 AM
Hi @apackard,
thanks, that is really helpful. I checked the logs of the engine and everything was normal except for an issue with reaching ransomwaretracker.
Before increasing the open-file limit, I would like to understand whether there is a file descriptor leak. If you run
$ sudo ps -aef | grep gunicorn
you will find 2 processes. Could you dump the open files with "lsof -p <pid>" for each process and check whether most of them are sessions to redis (port 6379) or rabbitmq (port 5672)?
Do you have many firewalls/devices retrieving feeds from MM?
Thanks,
luigi
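Sorting through a long `lsof -p <pid>` dump by eye is tedious, so the check requested above can be tallied with a short script. This is a sketch only: it assumes the default lsof column layout, where an established TCP connection appears in the NAME column as `local->remote:port`, and it assumes redis and rabbitmq listen on the default ports named in the post. The script name and labels are made up for illustration.

```python
import re
import sys
from collections import Counter

# Default ports named in the thread; adjust if your install differs.
KNOWN_PORTS = {6379: "redis", 5672: "rabbitmq"}

# Matches the remote port in an lsof NAME column such as
# "127.0.0.1:40000->127.0.0.1:6379".
REMOTE_PORT = re.compile(r"->\S+:(\d+)")


def summarize_lsof(text):
    """Count lsof lines per category: redis, rabbitmq, other connection, other file."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip() or line.startswith("COMMAND"):
            continue  # skip blank lines and the header row
        match = REMOTE_PORT.search(line)
        if match is None:
            counts["other file"] += 1
        else:
            port = int(match.group(1))
            counts[KNOWN_PORTS.get(port, "other connection")] += 1
    return counts


if __name__ == "__main__":
    for label, n in summarize_lsof(sys.stdin.read()).most_common():
        print("{:>6}  {}".format(n, label))
```

A possible invocation is `sudo lsof -p <pid> | python summarize_lsof.py`, repeated a few hours apart. A category whose count keeps growing between runs would point towards a descriptor leak on that connection type.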
12-02-2016 06:22 AM
Will do.
In terms of the question:-
We currently have an IP block list provided by a 3rd party. I have some custom PS scripts that download it, produce DIFF reports, do some mangling and output a file that is served from an internal web server to our Internet-facing firewalls (about a dozen).
I'm looking to replace this with MineMeld, so in future it will be supporting at least 10 devices; but until we can work out why it keeps stopping we can't proceed, so right now there aren't actually any client devices etc.
I'm also hoping to use some dynamic behaviour to get round some limitations in your dynamic blocklist max sizes and block-ip duration. As we can only serve up ~1,200 IPs (out of the 50k plus in the 3rd party IP list), and as we can only block an IP for 1 hour with the THREAT block-ip action, I have a SIEM that triggers a script if it sees any of the "non-served" IPs attacking us, or if it sees repeated block-ip actions from a common source.
This will push an offending IP to a smaller 'active' attackers list that we can use for a dynamic blocklist with a lifetime of, for example, a month. Once that functionality is in place we may serve it to our full estate of PAs, which is over 30.