Minemeld Crashing, miner tab not loading, RPC timeout exception

L1 Bithead

Hi,

We have an issue on our production MineMeld instance. Similar to the issue reported in https://live.paloaltonetworks.com/t5/minemeld-discussions/minemeld-crashing/td-p/289998, MineMeld randomly crashes with the following symptoms:

- the green loading bar keeps running across the screen

- the nodes page won't load

- TAXII output prototype is giving a bad gateway 502 to TAXII clients.

- a 'Timeout in RPC' exception is logged in minemeld-web.log

 

A reboot of the server resolves the issue only temporarily; it recurs after approximately 48 hours.

According to the logs and the UI, the engine (minemeld-engine.log) does not report any problems:

2020-10-14T11:35:22 (1784)table._query_by_index INFO: Deleted in scan of _age_out: 0
2020-10-14T11:35:25 (1784)basepoller._actor_loop INFO: ML_URL_PIS_openphish_feed_txt_MCG_30dSD - command: 1602668125764 age_out
2020-10-14T11:35:25 (1784)table._query_by_index INFO: Deleted in scan of _age_out: 0
2020-10-14T11:35:30 (1785)basepoller._huppable_wait INFO: hup is clear: False
2020-10-14T11:35:30 (1785)basepoller._actor_loop INFO: MC_IPv4_PIS_dshield_blocklist_txt_HCG_nilSD_NOVRFY - command: 1602668130987 poll
2020-10-14T11:35:30 (1785)basepoller._polling_loop INFO: Polling MC_IPv4_PIS_dshield_blocklist_txt_HCG_nilSD_NOVRFY
2020-10-14T11:35:31 (1785)basepoller._actor_loop INFO: MC_IPv4_PIS_dshield_blocklist_txt_HCG_nilSD_NOVRFY - command: 1602668130987 sudden_death
2020-10-14T11:35:31 (1785)table._query_by_index INFO: Deleted in scan of _last_run: 20
2020-10-14T11:35:31 (1785)basepoller._actor_loop INFO: MC_IPv4_PIS_dshield_blocklist_txt_HCG_nilSD_NOVRFY - command: 1602668130987 age_out
2020-10-14T11:35:31 (1785)table._query_by_index INFO: Deleted in scan of _age_out: 0
2020-10-14T11:35:31 (1785)basepoller._actor_loop INFO: MC_IPv4_PIS_dshield_blocklist_txt_HCG_nilSD_NOVRFY - command: 1602668130987 gc
2020-10-14T11:35:31 (1785)table._query_by_index INFO: Deleted in scan of _withdrawn: 0
2020-10-14T11:35:44 (1784)basepoller._actor_loop INFO: wl_URL_generic - command: 1602668144137 age_out

while on the web side (minemeld-web.log) it reports the following error:

[2020-10-14 11:35:20 CEST] [1671] [DEBUG] redis session connection pool: in use: 0 available: 5
[2020-10-14 11:35:20 CEST] [1671] [DEBUG] RPC sent to @mbus:master:rpc for method status
[2020-10-14 11:35:29 CEST] [1671] [DEBUG] 0
[2020-10-14 11:35:29 CEST] [1671] [ERROR] Exception on /feeds/OL_domain_MAL [GET]
Traceback (most recent call last):
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
response = self.full_dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 125, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 135, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/feedredis.py", line 532, in get_feed_content
status = MMMaster.status()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 49, in status
return self._send_cmd('status')
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 45, in _send_cmd
timeout=500.0
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/comm/zmqredis.py", line 695, in send_rpc
raise RuntimeError('Timeout in RPC')
RuntimeError: Timeout in RPC
[2020-10-14 11:35:29 CEST] [1671] [DEBUG] 0
[2020-10-14 11:35:29 CEST] [1671] [ERROR] Exception on /feeds/OL_IPv4_MAL [GET]
Traceback (most recent call last):
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
response = self.full_dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 125, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 135, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/feedredis.py", line 532, in get_feed_content
status = MMMaster.status()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 49, in status
return self._send_cmd('status')
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 45, in _send_cmd
timeout=500.0
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/comm/zmqredis.py", line 695, in send_rpc
raise RuntimeError('Timeout in RPC')
RuntimeError: Timeout in RPC
[2020-10-14 11:35:29 CEST] [1671] [DEBUG] 0
[2020-10-14 11:35:29 CEST] [1671] [ERROR] Exception on /feeds/OL_URL_MAL [GET]
Traceback (most recent call last):
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
response = self.full_dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 125, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/aaa.py", line 135, in decorated_view
return f(*args, **kwargs)
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/feedredis.py", line 532, in get_feed_content
status = MMMaster.status()
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 49, in status
return self._send_cmd('status')
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py", line 45, in _send_cmd
timeout=500.0
File "/opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/comm/zmqredis.py", line 695, in send_rpc
raise RuntimeError('Timeout in RPC')
RuntimeError: Timeout in RPC
[2020-10-14 11:35:30 CEST] [1671] [DEBUG] redis session connection pool: in use: 0 available: 5
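
To see which feed endpoints are hit by the timeout, and how often, the web log can be summarised with a short script. This is only a sketch: it assumes the log keeps the format shown above (an `[ERROR] Exception on <path> [<method>]` line followed later by `RuntimeError: Timeout in RPC`), and `count_rpc_timeouts` is a hypothetical helper, not part of MineMeld.

```python
import re
from collections import Counter

# Matches lines such as:
# [2020-10-14 11:35:29 CEST] [1671] [ERROR] Exception on /feeds/OL_domain_MAL [GET]
ERROR_RE = re.compile(r"\[ERROR\] Exception on (\S+) \[[A-Z]+\]")


def count_rpc_timeouts(lines):
    """Count 'Timeout in RPC' errors per request path (hypothetical helper)."""
    counts = Counter()
    pending = None  # path of the most recent [ERROR] line, if any
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            pending = match.group(1)
        elif pending and line.strip() == "RuntimeError: Timeout in RPC":
            counts[pending] += 1
            pending = None
    return counts
```

It can be fed an open file object, for example `count_rpc_timeouts(open(path))`, where `path` is wherever minemeld-web.log lives on your system.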

While debugging, I found that socket.poll returns 0 (nothing became readable before the timeout) in the file /opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/comm/zmqredis.py, line 686:

if timeout is not None:
    # zmq green does not support RCVTIMEO
    retcode = socket.poll(flags=zmq.POLLIN, timeout=int(timeout*1000))
    LOG.debug("{}".format(retcode))  # I added this line; it logs 0
    if retcode != 0:
        result = socket.recv_json(flags=zmq.NOBLOCK)

    else:
        socket.close(linger=0)
        raise RuntimeError('Timeout in RPC')

else:
    result = socket.recv_json()
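
The same poll-then-receive pattern can be reproduced without MineMeld or ZeroMQ. The sketch below swaps in the standard library's `select.select` on a plain socket pair (an assumption made so the example is self-contained; it is not what zmqredis.py does). It shows that an empty poll result only means the peer did not answer within the timeout, which is what a return value of 0 indicates in the snippet above.

```python
import select
import socket


def recv_with_timeout(sock, timeout):
    """Wait up to `timeout` seconds for data, then receive it."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        # Equivalent of retcode == 0: nothing arrived before the deadline
        raise RuntimeError('Timeout in RPC')
    return sock.recv(4096)


def demo():
    """Return the outcome for a silent peer and for a peer that answers."""
    client, server = socket.socketpair()
    try:
        try:
            recv_with_timeout(client, 0.1)  # the peer has sent nothing yet
            silent = 'reply'
        except RuntimeError:
            silent = 'timeout'
        server.sendall(b'{"status": "ok"}')
        answered = recv_with_timeout(client, 1.0)
        return silent, answered
    finally:
        client.close()
        server.close()
```

Raising the timeout only lengthens the wait in the first case; it does not help if the process on the other end never replies.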

The 'Timeout in RPC' exception is raised, but the timeout value itself does not seem to be the cause: I raised it to 500 seconds (file /opt/minemeld/engine/0.9.70.post1/local/lib/python2.7/site-packages/minemeld/flask/mmrpc.py, line 45) and the exception still occurs.

The issue may be related to the MISP extension, or to the CIF and TAXII prototypes we use to pull in feeds.

 

So far I've tried the following to address the issue:

  • Allocate more resources to the server,
  • Set the kernel parameter vm.overcommit_memory to 1,
  • Raise the RPC timeout in the code to 500 seconds,
  • Add confidence scores to our MISP indicators imported by the extension

My next test is to reduce the number of indicators in the TAXII output feed, since it is maxed out (1,000,000 indicators).

Any suggestions or help would be greatly appreciated.

 

Thanks for your attention,

V.E.

4 REPLIES 4

L1 Bithead

Hi,

We've resolved the issue by reducing the number of IoCs exposed by the TAXII feed, since it was maxed out. No problems have been noted since. We set an aging filter on the MISP miner.
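
To illustrate the idea of an aging filter in general terms (this is not the MISP miner's configuration, which is not shown in this thread), the sketch below drops indicators whose last-seen time is older than a cutoff. The function name, the dictionary layout and the sample indicators are all hypothetical.

```python
SECONDS_PER_DAY = 86400


def drop_aged(indicators, now, max_age_days):
    """Keep only indicators seen within the last `max_age_days` days.

    `indicators` maps an indicator string to its last-seen Unix timestamp.
    """
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return {ind: seen for ind, seen in indicators.items() if seen >= cutoff}
```

For example, with `max_age_days=30` an indicator last seen 45 days ago is removed, while one seen yesterday is kept, so the feed size tracks recent activity instead of growing until it hits the cap.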

 

Thanks,

V.E.

L1 Bithead

Hi,

 

I believe your problem is much simpler than adjusting the underlying mechanics of the Redis server, although it's a good read.

 

Setting net.core.somaxconn=1024 can have some performance benefits. Look up Redis optimizations.
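
A quick way to confirm which values are actually in effect is to read them from `/proc/sys`, where Linux exposes sysctl settings as files. This is a minimal sketch; `read_sysctl` is a hypothetical helper, and it assumes simple dotted names such as the two mentioned in this thread.

```python
import os


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a sysctl as a string, or None if unreadable."""
    # net.core.somaxconn -> /proc/sys/net/core/somaxconn
    path = os.path.join(root, *name.split("."))
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in ("vm.overcommit_memory", "net.core.somaxconn"):
        print(key, "=", read_sysctl(key))
```

If the printed values differ from what you set, the change may not have been made persistent and was lost at the last reboot.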

 

The real issue, though, is that you probably installed MineMeld on an unsupported *nix distribution.

 

If you are getting segfaults in your dmesg from the MineMeld apps, that is probably what's causing all the crashes, as it breaks the IPC pipes (the communication between the different components). It is basically a library compatibility problem.
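
To check for this, the kernel log can be scanned for segfault entries. The sketch below assumes the usual Linux message shape, `name[pid]: segfault at ... in library[...]`; `find_segfaults` is a hypothetical helper, and real lines on your system may differ in detail.

```python
import re

# Process name and PID, e.g. "python[1784]: segfault at ..."
SEGV_RE = re.compile(r"([^\s\[\]]+)\[(\d+)\]: segfault at ")
# Faulting object, e.g. " in libfoo.so[7f1c2a300000+60000]"
LIB_RE = re.compile(r" in ([^\s\[]+)\[")


def find_segfaults(lines):
    """Return (process, pid, library or None) for each segfault line."""
    hits = []
    for line in lines:
        match = SEGV_RE.search(line)
        if not match:
            continue
        lib = LIB_RE.search(line, match.end())
        hits.append((match.group(1), int(match.group(2)),
                     lib.group(1) if lib else None))
    return hits
```

It can be run over saved `dmesg` output, for example `find_segfaults(open("dmesg.txt"))`, where `dmesg.txt` is a file you created yourself. Repeated hits in the same library for the MineMeld processes would support the library compatibility explanation.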

 

For more information or details, check this post:

 

https://live.paloaltonetworks.com/t5/minemeld-discussions/minemeld-crash-once-in-a-while/m-p/329765#...

 

Thanks for the pointers regardless,

 

Dimi

 

 

Hi and thanks for the reply and concern,

You're right, I completely overlooked the Redis optimization; I'll do some research on it. If you have any additional resources to share, that would help.

Regarding the unsupported *nix distribution, we're using Ubuntu 16.04.6 LTS. What I can't find is the list of operating systems officially supported by MineMeld and its recommended requirements. Also, is the MineMeld project still active and/or supported?

 

Best,

V.E.

We had a similar issue with RPC timeouts, but it turned out to be the application components crashing.

 

The distro was pushed from Ubuntu 16 to 18 which has a different set of libraries causing issues with the Minemeld apps.

 

I read a couple of other LIVE posts that advise using the Docker instance instead of the bare-metal install on Ubuntu.

 

No idea if the project is still active, but I have seen a few minor commits recently, although I'm not sure whether they were released.

 

There is no commercial support for the product as far as I am aware.

 

Thanks,

 

Dimi

 

 
