I am having issues setting up a prototype in Minemeld to correctly pull values from an external XML URL feed. The issue is with the indicator regex.
The data is provided in XML like this:
<uri>https://example.com</uri><type>combo</type><pubDate>Wed, Nov 14 2019 03:30:03 UTC</pubDate><guid>NA</guid></item><item><action>ADD</action>
<uri>https://example2.com</uri><type>combo</type><pubDate>Wed, Nov 13 2019 03:35:02 UTC</pubDate><guid>NA</guid></item><item><action>ADD</action>
The default indicator feed is:
The aim is to read all the website values between the <uri> tags. However, this doesn't parse correctly: in the format above it returns a single result no matter how many entries are in the XML, containing all the text between the first <uri> tag and the last closing </uri> tag.
I was able to recreate this issue on an online regex testing website, and it seems the solution is to add the ungreedy and global modifiers to the regex, so it would look something like this:
regex: <uri>(.*?)</uri> /g
However, when I put this format into the indicators config in the Minemeld prototype I get 0 entries returned in the output. I think I am just formatting the regex wrong. Is anyone able to advise on the correct formatting to get Minemeld to accept these global modifiers?
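A minimal sketch of the behaviour described above, assuming the configured pattern is handed unchanged to Python's `re` module (Minemeld is written in Python, but how this particular feed class applies the regex is an assumption here). The sample feed string is made up to match the shape of the XML quoted earlier. In Python there is no `/g` suffix syntax, so the characters ` /g` would be treated as literal text that must appear after `</uri>`, which would explain getting zero matches.

```python
import re

# Two sample items on a single line, shaped like the feed quoted above
# (illustrative data, not real feed output).
feed = (
    "<item><action>ADD</action><uri>https://example.com</uri><type>combo</type></item>"
    "<item><action>ADD</action><uri>https://example2.com</uri><type>combo</type></item>"
)

# Greedy: (.*) runs from the first <uri> to the last </uri>, giving one long match.
greedy = re.findall(r"<uri>(.*)</uri>", feed)

# Non-greedy: (.*?) stops at the nearest </uri>, giving one match per item.
lazy = re.findall(r"<uri>(.*?)</uri>", feed)

# With " /g" appended, the pattern requires the literal text " /g" after </uri>.
with_suffix = re.findall(r"<uri>(.*?)</uri> /g", feed)

print(len(greedy))   # 1
print(lazy)          # ['https://example.com', 'https://example2.com']
print(with_suffix)   # []
```

`re.findall` already returns every non-overlapping match, so under this assumption the non-greedy pattern alone would be the Python equivalent of "ungreedy plus global". Whether Minemeld applies the pattern per line or to the whole document is not confirmed in this thread.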
If I hazard a guess, would I be right in saying you're having trouble with the AusCERT prototype?
If so, I had the same problem, came up with the same fix (i.e. mark the regex as ungreedy), became confused when it didn't work, searched Google trying to find out how Minemeld handles regexes, found nothing, then ended up finding this thread.
Assuming you're having the same problem, I came up with a workaround since there didn't seem to be any documentation on how this is supposed to be handled.
Modify the auscert.yml prototype, and replace 'xml' with 'txt' in all the URLs. Then, comment out the 'indicator:' section of each prototype.
For example, change:
7days_combo:
    author: Simon Coggins
    development_status: STABLE
    node_type: miner
    indicator_types: [ URL ]
    tags:
        - ConfidenceHigh
        - ShareLevelRed
    description: 7 days combo
    config:
        age_out:
            default: null
            sudden_death: true
        source_name: auscert.7days_combo
        url: https://www.auscert.org.au/api/v1/malurl/combo-7-xml
        indicator:
            regex: '<uri>(.*)</uri>'
            transform: '\1'
        attributes:
            type: URL
            share_level: red
            confidence: 80
    class: minemeld.ft.auscert.MaliciousURLFeed
to:
7days_combo:
    author: Simon Coggins
    development_status: STABLE
    node_type: miner
    indicator_types: [ URL ]
    tags:
        - ConfidenceHigh
        - ShareLevelRed
    description: 7 days combo
    config:
        age_out:
            default: null
            sudden_death: true
        source_name: auscert.7days_combo
        url: https://www.auscert.org.au/api/v1/malurl/combo-7-txt
        attributes:
            type: URL
            share_level: red
            confidence: 80
    class: minemeld.ft.auscert.MaliciousURLFeed
Then just restart the Minemeld process.
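Since the file contains several prototypes, the URL part of this edit could be done in bulk. Below is a rough sketch, not a tested tool: the helper name `switch_feed_urls_to_txt` is hypothetical, and it assumes every feed URL sits on its own `url:` line and ends in `-xml`, as in the example above. It does not touch the `indicator:` sections, which still need to be commented out by hand.

```python
import re

def switch_feed_urls_to_txt(prototype_text: str) -> str:
    """Rewrite a trailing '-xml' to '-txt' on 'url:' lines only.

    Hypothetical helper: assumes each feed URL is on its own 'url:' line.
    """
    return re.sub(
        r"^(\s*url:\s*\S+)-xml[ \t]*$",  # a url: line whose value ends in -xml
        r"\1-txt",
        prototype_text,
        flags=re.MULTILINE,
    )

# Example with a fragment in the shape of the prototype shown above.
sample = (
    "        source_name: auscert.7days_combo\n"
    "        url: https://www.auscert.org.au/api/v1/malurl/combo-7-xml\n"
)
print(switch_feed_urls_to_txt(sample))
```

Keeping a backup copy of auscert.yml before applying any automated change would be prudent, since a malformed prototype file may stop the node from loading.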
If I've deduced incorrectly and your problems have nothing to do with AusCERT, then please ignore this post...
Thanks Aisherwood, indeed this was for the AusCERT feed and your workaround has worked for us too.
Appreciate you coming back to share the solution!