Minemeld Error After Period


L4 Transporter

We've installed MM on Ubuntu 14.04 and everything starts and seems to work OK initially.

 

However, after a period of time it seems to crash. I'm not really sure how long, but as an example I booted it yesterday, used it fine for an hour or so, and this morning it had failed.

 

A typical error (top right, in a red box) would be "ERROR RETRIEVING MINEMELD CONFIG: Internal Server Error" (see screenshot attachment).

 

If I restart the minemeld service, everything starts and all is good again for a period of time. Nothing jumps out in the logs. Is there any advice you can give on things to check?

 

Thanks

27 REPLIES

Here you go.

 

Hopefully this is even better: we have 2 instances of MM running (we're looking to "make" an HA pair). I have done this for both. The first, "MM1", is working; the second, "MM2", is currently down, so we can see the difference.

 

<Added as ZIP as text was too long for a post>.

 

Rgds

 

 

Attached is the output of '$ top -b -n 1 -o %MEM' after the crash. Also:

minemeld@minemeld:~$ df
Filesystem     1K-blocks     Used Available Use% Mounted on
udev             2010792        4   2010788   1% /dev
tmpfs             404472      716    403756   1% /run
/dev/dm-0       31613844 27997008   1987860  94% /
none                   4        0         4   0% /sys/fs/cgroup
none                5120        0      5120   0% /run/lock
none             2022344        0   2022344   0% /run/shm
none              102400        0    102400   0% /run/user
/dev/sda1         240972    40631    187900  18% /boot
minemeld@minemeld:~$ free
             total       used       free     shared    buffers     cached
Mem:       4044688    3761068     283620         52     109372    1190780
-/+ buffers/cache:    2460916    1583772
Swap:      1048572     514636     533936
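
The `df` output above shows the root filesystem at 94% use. As a minimal sketch (not part of MineMeld; the function name `filesystems_over` and the 90% threshold are assumptions made for illustration), pasted `df` text can be scanned for nearly full filesystems like this:

```python
def filesystems_over(df_output, threshold=90):
    """Return (mount_point, use_percent) pairs at or above threshold."""
    flagged = []
    for line in df_output.splitlines():
        fields = line.split()
        # Data rows have exactly 6 columns; the header splits into 7
        # because "Mounted on" contains a space, so it is skipped here.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        if not fields[4][:-1].isdigit():
            continue
        percent = int(fields[4][:-1])
        if percent >= threshold:
            flagged.append((fields[5], percent))
    return flagged


# Two rows copied from the df output in this thread.
SAMPLE = """Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/dm-0       31613844 27997008   1987860  94% /
/dev/sda1         240972    40631    187900  18% /boot
"""

print(filesystems_over(SAMPLE))
```

The sketch only reads text that has already been captured, so it can be run against saved output from either node.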

Hi @apackard,

Super useful indeed. 0.9.30 has just been released and it contains a fix for a socket leak in the API process involving sessions to Redis. That seems to be exactly the issue you are facing.

Could you try upgrading your instances?

 

Thanks,

lmori

Excellent.

 

Using APT we're showing:-

 

minemeld/stable 0.9.7-8 amd64 [upgradable from: 0.9.7-6]

as the latest version. Should I use APT or install manually?

I tried it just in case, and the version looks good. I'll run it over the weekend as a soak test. Many thanks.

 

2016-12-02 15:36:30,296 INFO:0.9.7 Package minemeld-engine current version set to 0.9.30
2016-12-02 15:36:30,299 INFO:0.9.7 Package minemeld-webui current version set to 0.9.30
2016-12-02 15:36:30,301 INFO:0.9.7 Package minemeld-prototypes current version set to 0.9.30

Hi @apackard,

One more question: are you using a monitoring tool to check the MineMeld WebUI, or a script to periodically refresh the webpage?

 

luigi

Not that I know of. There may be some discovery tools running on the network that I'm not aware of, but it's unlikely. I'll check the iptables logs on the box to double-check.

 

I wondered about that when you asked how many devices would be pulling list data down.

 

The number of active sessions in your log between the API process and Redis was really high, even allowing for a leak.

Please let me know if you still see the same error in the minemeld-web log file; we can enforce a hard limit on the sessions without compromising usability.
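
For readers wondering what a hard limit on sessions could look like in general, here is a minimal sketch using only the Python standard library. It is not MineMeld's implementation; the class name `SessionLimiter`, the limit of 2, and the timeout are all assumptions chosen for illustration.

```python
import threading
from contextlib import contextmanager


class SessionLimiter:
    """Caps how many sessions may be open at once (hypothetical helper)."""

    def __init__(self, max_sessions):
        self._slots = threading.BoundedSemaphore(max_sessions)

    @contextmanager
    def session(self, timeout=1.0):
        # Wait up to `timeout` seconds for a free slot, then give up
        # rather than opening yet another connection.
        if not self._slots.acquire(timeout=timeout):
            raise RuntimeError("session limit reached")
        try:
            yield
        finally:
            # Always hand the slot back, even if the caller raised.
            self._slots.release()


limiter = SessionLimiter(max_sessions=2)
with limiter.session():
    # Work that needs a backend session would go here.
    pass
```

Because the slot is released in a `finally` block, a request that fails partway through does not leak its slot, which is the kind of leak a cap like this is meant to contain.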

We've now been running for 4 days on both nodes at the latest release with no issues, so it looks like you've fixed it.

 

No repeats of the errors in the logs, and re-running the checks (port and process lists, etc.) shows no repeats of the trace evidence we saw before.

 

I'll leave this thread open, as someone else had an issue and may want to comment on whether this also fixed it, but I'm happy for you to close it as the fix for me.

 

Many Thanks

Thanks @apackard for the feedback!

L1 Bithead

FYI - just wanted the devs to be aware...

 

I too ran into an error last week, with similar UI issues and the system locking up.

 

It turned out to be self-inflicted: I was playing around with the API, adding indicators. There is little error checking on the JSON object being added, and some objects were added to MineMeld incorrectly formatted.

 

No errors were thrown, but when I viewed objects in the list (https://{minemeld-server}/#/nodes/{list_Name}), errors would occur, and a service restart was the only way I could restore service.

 

I also manually modified the files on the server (deleting the incorrectly formatted entries). This resolved all my problems ... or so I thought.

 

When I accessed the Logs (https://{minemeld-server}/#/logs), the errors came back. It turned out I had to delete the binary log object (I cannot remember the file extension) that contained the incorrectly formatted entries. This completely fixed my troubles.

 

I am updating my PowerShell script to do some validation prior to committing to MineMeld.
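
As a rough illustration of that kind of pre-commit check, here is a minimal Python sketch (the script mentioned above is PowerShell and is not shown in this thread). The field names `indicator` and `type`, the set of allowed types, and the function name `validate_indicator` are assumptions for illustration and should be checked against the node actually being written to.

```python
import json

# Assumed field names and types, for illustration only; verify them
# against the MineMeld node you are actually writing to.
ALLOWED_TYPES = {"IPv4", "IPv6", "domain", "URL"}


def validate_indicator(raw):
    """Return a list of problems found in one JSON-encoded indicator."""
    try:
        entry = json.loads(raw)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(entry, dict):
        return ["expected a JSON object"]
    problems = []
    indicator = entry.get("indicator")
    if not isinstance(indicator, str) or not indicator.strip():
        problems.append("'indicator' must be a non-empty string")
    type_ = entry.get("type")
    if not isinstance(type_, str) or type_ not in ALLOWED_TYPES:
        problems.append("'type' must be one of %s" % sorted(ALLOWED_TYPES))
    return problems


# An empty list means the entry passed every check.
print(validate_indicator('{"indicator": "198.51.100.7", "type": "IPv4"}'))
```

Submitting an entry only when the returned list is empty keeps malformed objects from reaching the server in the first place.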

 

Not a very fun adventure... but that's why this is beta.

 

--Sean Engelbrecht

L1 Bithead

I have a question about MineMeld.

I have configured 1024 GB of memory in ESXi, but only 10 GB of memory is active.

Is it okay if I reduce the memory size?

 

Please do let me know if anyone knows the answer to this.

Cyber Elite

@Rajendra-S,

You've configured a MineMeld server with over a terabyte of memory? That's vastly overprovisioned and I'd highly recommend you lower it. Even 10 GB seems excessive for a MineMeld instance; all of my nodes run with 2 GB of memory and I don't encounter any issues at all.
