Nodes keep stopping - how to start and keep them started?

L4 Transporter


Just spun up a new Minemeld server and it's working, but the nodes keep stopping and I am not sure how to get them to start up and stay started. Rebooting brings everything back up, and the nodes stay started for about a minute before they all stop (see screenshot). At home I don't have this issue: all the nodes stay running/started and do their polling, etc. without issues. I should be seeing waaaaaaay more indicators, especially on the alienvault_reputation node. Thoughts?

 

2018-06-04 12_09_24-MineMeld.png

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

I am going to reply to my own post here with more info... I have been trying things to get this working and I have discovered the following:

 

* If I remove the alienvault_reputation input from the inboundaggregator and then commit, the alienvault_reputation miner gathers everything and all the nodes stay up and running (but of course the EDL will now be missing the ~60,000 objects from the alienvault node)

 

* If I add the alienvault node back into the aggregator and commit, it tries to poll, stops at around 10,000 objects (sometimes fewer), and then all the nodes grind to a halt again

 

Here you can see all nodes running and the alienvault poll completing:

2018-06-05 07_42_08-MineMeld2.png

 

But this only works when the node is *not* added to the inboundaggregator (seen here):

2018-06-05 07_42_28-MineMeld_configs.png

 

As soon as I add it to the aggregator and commit, all heck breaks loose and the nodes stop.

 

* At home I have a VM running Minemeld with the exact same configuration, and it does not have this issue at all. The only difference between the two instances is the OS: at home I am using Ubuntu 14.04, while here in the office we are trying to get this going on CentOS 7, so I installed it using the Ansible recipe.

 

P.S. Maybe I need to update Minemeld? There seems to be no Minemeld update script when installing using Ansible, though. How does one update Minemeld on an Ansible CentOS install?

L5 Sessionator

Re: Nodes keep stopping - how to start and keep them started?

Hi @hshawn,

 

This looks like a lack of computing resources to me. Any chance you could take a look at /opt/minemeld/log/minemeld-engine.log to see if there is any clue there?

 

How does your home computer compare to the one that is failing?

 

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

Thanks @xhoms,

 

This VM has about 2.5 times the CPU, RAM, and disk space of my home lab setup, which has 2 vCPUs and 4 GB of RAM. Here is the output from the log you requested. I ran a tail -f after a reboot and then added the alienvault reputation list to the aggregator. After about a minute or so, all nodes were stopped and dead.

 

2018-06-07T09:21:49 (2157)basepoller._huppable_wait INFO: hup is clear: False
2018-06-07T09:22:37 (2146)mgmtbus._merge_status ERROR: old clock: 86 > 82 - dropped
2018-06-07T09:22:37 (2146)mgmtbus._merge_status ERROR: old clock: 23 > 22 - dropped
2018-06-07T09:22:37 (2146)mgmtbus._merge_status ERROR: old clock: 21 > 19 - dropped
Traceback (most recent call last):
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
result = self._run(*self.args, **self.kwargs)
File "/opt/minemeld/engine/core/minemeld/comm/amqp.py", line 561, in _ioloop
conn.drain_events()
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/connection.py", line 323, in drain_events
return amqp_method(channel, args)
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/channel.py", line 241, in _close
reply_code, reply_text, (class_id, method_id), ChannelError,
NotFound: Basic.publish: (404) NOT_FOUND - no exchange 'inboundaggregator' in vhost '/'
<Greenlet at 0x7ff0b5cf5f50: <bound method AMQP._ioloop of <minemeld.comm.amqp.AMQP object at 0x7ff0b8218650>>(10)> failed with NotFound

2018-06-07T09:22:49 (2157)amqp._ioloop_failure ERROR: _ioloop_failure: exception in ioloop
Traceback (most recent call last):

Traceback (most recent call last):
File "/opt/minemeld/engine/core/minemeld/comm/amqp.py", line 567, in _ioloop_failure
g.get()
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/gevent/greenlet.py", line 251, in get
raise self._exception
NotFound: Basic.publish: (404) NOT_FOUND - no exchange 'inboundaggregator' in vhost '/'
2018-06-07T09:22:49 (2157)chassis.stop INFO: chassis stop called
2018-06-07T09:22:49 (2157)base.state INFO: spamhaus_EDROP - transitioning to state 8
2018-06-07T09:22:49 (2157)basepoller.stop INFO: spamhaus_EDROP - # indicators: 130
2018-06-07T09:22:49 (2157)base.state INFO: dshield_blocklist - transitioning to state 8
2018-06-07T09:22:49 (2157)basepoller.stop INFO: dshield_blocklist - # indicators: 20
2018-06-07T09:22:52 (2157)base.state INFO: inboundaggregator - transitioning to state 8
2018-06-07T09:22:52 (2157)ipop.stop INFO: inboundaggregator - # indicators: 16761
2018-06-07T09:22:52 (2157)base.state INFO: inboundfeedhc - transitioning to state 8
2018-06-07T09:22:52 (2157)base.state INFO: spamhaus_DROP - transitioning to state 8
2018-06-07T09:22:52 (2157)basepoller.stop INFO: spamhaus_DROP - # indicators: 855
2018-06-07T09:22:52 (2157)base.state INFO: inboundfeedmc - transitioning to state 8
2018-06-07T09:22:52 (2157)base.state INFO: malwaredomainlist_ip - transitioning to state 8
2018-06-07T09:22:52 (2157)basepoller.stop INFO: malwaredomainlist_ip - # indicators: 1011
2018-06-07T09:22:52 (2157)base.state INFO: inboundfeedlc - transitioning to state 8
2018-06-07T09:22:52 (2157)base.state INFO: alienvault_reputation - transitioning to state 8
2018-06-07T09:22:52 (2157)basepoller.stop INFO: alienvault_reputation - # indicators: 52338
2018-06-07T09:22:52 (2157)chassis.stop INFO: Stopping fabric
Traceback (most recent call last):
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/gevent/hub.py", line 544, in switch
switch(value)
File "/opt/minemeld/engine/core/minemeld/comm/amqp.py", line 575, in _ioloop_failure
l()
File "/opt/minemeld/engine/core/minemeld/fabric.py", line 100, in _comm_failure
self.chassis.fabric_failed()
File "/opt/minemeld/engine/core/minemeld/chassis.py", line 181, in fabric_failed
self.stop()
File "/opt/minemeld/engine/core/minemeld/chassis.py", line 217, in stop
self.fabric.stop()
File "/opt/minemeld/engine/core/minemeld/fabric.py", line 112, in stop
self.comm.stop()
File "/opt/minemeld/engine/core/minemeld/comm/amqp.py", line 675, in stop
sc.disconnect()
File "/opt/minemeld/engine/core/minemeld/comm/amqp.py", line 424, in disconnect
self.channel.close()
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/channel.py", line 176, in close
self._send_method((20, 40), args)
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/abstract_channel.py", line 56, in _send_method
self.channel_id, method_sig, args, content,
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/method_framing.py", line 221, in write_method
write_frame(1, channel, payload)
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/amqp/transport.py", line 182, in write_frame
frame_type, channel, size, payload, 0xce,
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/gevent/socket.py", line 460, in sendall
data_sent += self.send(_get_memory(data, data_sent), flags)
File "/opt/minemeld/engine/current/lib/python2.7/site-packages/gevent/socket.py", line 437, in send
return sock.send(data, flags)
error: [Errno 104] Connection reset by peer
<built-in method switch of greenlet.greenlet object at 0x7ff0b5cf5190> failed with error

 

Let me know what you think, thanks!
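
For anyone who wants to pull the same details out of their own engine log, here is a minimal Python sketch. It assumes the line layout seen in the excerpt above (timestamp, PID in parentheses, source, level, message); the function name `summarize_engine_log` and the regular expressions are my own illustration, not part of MineMeld. It collects the ERROR lines and the last "# indicators" count reported per node, and skips traceback lines that do not fit the layout.

```python
import re

# Assumed layout, taken from the excerpt above:
#   2018-06-07T09:22:49 (2157)basepoller.stop INFO: message text
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) \((?P<pid>\d+)\)(?P<source>\S+) (?P<level>[A-Z]+): (?P<msg>.*)$"
)
# Message emitted when a node stops, e.g. "spamhaus_DROP - # indicators: 855"
INDICATORS = re.compile(r"^(?P<node>\S+) - # indicators: (?P<count>\d+)$")


def summarize_engine_log(lines):
    """Return (errors, indicators) for an iterable of log lines.

    errors     -- list of (timestamp, source, message) for ERROR lines
    indicators -- dict mapping node name to the last indicator count logged
    """
    errors = []
    indicators = {}
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # traceback frames and other free-form lines
        if match["level"] == "ERROR":
            errors.append((match["ts"], match["source"], match["msg"]))
        counted = INDICATORS.match(match["msg"])
        if counted:
            indicators[counted["node"]] = int(counted["count"])
    return errors, indicators


# A few lines copied from the log above, used as sample input.
sample = [
    "2018-06-07T09:21:49 (2157)basepoller._huppable_wait INFO: hup is clear: False",
    "2018-06-07T09:22:37 (2146)mgmtbus._merge_status ERROR: old clock: 86 > 82 - dropped",
    "NotFound: Basic.publish: (404) NOT_FOUND - no exchange 'inboundaggregator' in vhost '/'",
    "2018-06-07T09:22:49 (2157)amqp._ioloop_failure ERROR: _ioloop_failure: exception in ioloop",
    "2018-06-07T09:22:49 (2157)chassis.stop INFO: chassis stop called",
    "2018-06-07T09:22:52 (2157)basepoller.stop INFO: alienvault_reputation - # indicators: 52338",
]

errors, indicators = summarize_engine_log(sample)
for ts, source, msg in errors:
    print(ts, source, msg)
print(indicators)
```

To run it against a real log, pass an open file instead of the sample list, for example `summarize_engine_log(open("/opt/minemeld/log/minemeld-engine.log"))`.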

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

More data to process...

 

The issue seems to lie with that alienvault feed. If I remove it, I can add all the miners I want and connect them to the aggregator. Right now I have had 10 miners running without issues. If anyone can identify the issue with the alienvault feed I will pop it back in there, but for now I am just happy to have some hc and mc lists working.

L7 Applicator

Re: Nodes keep stopping - how to start and keep them started?

Hi @hshawn,

Are you running MM on CentOS?

 

luigi

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

@lmori

 

Yes, CentOS 7

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

This weekend I spun up a new CentOS VM and did a fresh install of Minemeld using Ansible. Then I added the alienvault.reputation feed to it and saw the same behavior I am seeing in Minemeld on the corp network: the alienvault feed starts to poll, then all the nodes die and Minemeld becomes dead weight.

 

This leads me to believe that something in the Ansible install or in CentOS does not play nice with something in the alienvault.reputation feed.

 

What I can confirm:

* Everything works great with the Ubuntu OVA setup, no issues at all

* It breaks when adding the alienvault reputation feed on an Ansible-installed CentOS setup

* I have reached the end of my troubleshooting abilities on this one. If anyone wants to take a crack at it, all the logs and info should be in this thread, and I can provide more details if needed

L1 Bithead

Re: Nodes keep stopping - how to start and keep them started?

Hi Shawn,

 

I'm going to try to replicate your MM installation on CentOS 7.

 

Did you use this ansible playbook? 

 

If so, did you keep "minemeld_version: develop" or "Stable" ?

L4 Transporter

Re: Nodes keep stopping - how to start and keep them started?

@tyreed

 

Yes, that is the playbook I used, on a fresh install of the latest CentOS 7 with the stable Minemeld branch. It should install without issues, but you might run into an error with an obsolete package that you need to install manually. After that, start the Ansible install again and it will finish up pretty quickly.

 

Once everything is up and running try adding the alienvault.reputation feed and see how it goes.

 

Good luck! Let us know how it goes
