Filter items from source feed


One of the feeds I would like to import is the alienvault feed.  However, I only want a subset of the IPs listed.  I have tried using a regex with a transform to limit the results, but the miner is still showing an indicator count of 54,000.

 

I cloned the alienvault prototype and changed it to this:

 

    my_alienvaultreputation:
        class: minemeld.ft.csv.CSVFT
        config:
            attributes:
                confidence: 80
                share_level: green
                type: IPv4
            delimiter: '#'
            fieldnames:
            - indicator
            - alienvault_reliability
            - alienvault_risk
            - alienvault_type
            indicator:
                regex: '([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})(.*Malicious Host\#(US|CN))'
                transform: '\1'
            interval: 3600
            source_name: alienvault.reputation
            url: http://reputation.alienvault.com/reputation.data
        description: Malicious US and Canada only alienvault reputation entries
        development_status: EXPERIMENTAL
        indicator_types:
        - IPv4
        node_type: miner
        tags:
        - OSINT
        - ShareLevelGreen

   

The regex itself works, at least in Sublime text when I do a regex search of the alienvault reputation list, which shows approximately 8,000 matches.

 

Is this not possible or is something wrong with the prototype?

 

Thanks,

 

Mike


Hi @deanm,

The CSV Miner does not support regex in indicators, but you can change the prototype to the following config, which extracts the country from the alienvault data and stops the Miner from propagating indicators whose country is not US or CN:

 

attributes:
    confidence: 80
    share_level: green
    type: IPv4
delimiter: '#'
fieldnames:
- indicator
- alienvault_reliability
- alienvault_risk
- alienvault_type
- alienvault_country
- alienvault_city
interval: 3600
outfilters:
-   actions:
    - accept
    conditions:
    - alienvault_country == 'CN'
    name: accept CN
-   actions:
    - accept
    conditions:
    - alienvault_country == 'US'
    name: accept US
-   actions:
    - drop
    name: drop all
source_name: alienvault.reputation
url: http://reputation.alienvault.com/reputation.data

Thank you for the info, that is great.  I am adding it now and will let you know if I have any issues.  It does bring up two questions.

 

Is there a way, other than cloning and modifying, to edit prototypes?

 

Also, is there a doc somewhere that lists all the fields/options for the prototypes?

 

Thanks again!

 

Mike

Hi @deanm,

you can also edit the file /opt/minemeld/local/prototypes/minemeldlocal.yml and restart the engine.

An alternative, if you want to share your prototype, is creating a simple external extension with the prototype and share the extension. Example: https://github.com/PaloAltoNetworks/minemeld-cef

 

The config options available depend on the class of the node, you can find some details here: https://github.com/PaloAltoNetworks/minemeld-core/blob/master/docs/nodeconfig.rst

 

I thought this solved the question, but it does not appear that it did.

 

The node shows the full 55,000+ indicators, which may be normal with filters, I am not sure, but I am not seeing any addresses that should match the filters in any of my feeds.  The full prototype configuration is

 

    alienvault_reputation-Malicious_US-CA:
        class: minemeld.ft.csv.CSVFT
        config:
            attributes:
                confidence: 80
                share_level: green
                type: IPv4
            delimiter: '#'
            fieldnames:
            - indicator
            - alienvault_reliability
            - alienvault_risk
            - alienvault_type
            - alienvault_country
            outfilters:
            -   actions:
                - accept
                conditions:
                - alienvault_type == 'Malicious Host'
                name: accept Malicious
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'US'
                name: accept US
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'CA'
                name: accept CA
            -   actions:
                - drop
                name: drop all
            interval: 3600
            source_name: alienvault.reputation
            url: http://reputation.alienvault.com/reputation.data
        description: Malicious US and CA alienvault hosts
        development_status: EXPERIMENTAL
        indicator_types:
        - IPv4
        node_type: miner
        tags:
        - OSINT
        - ShareLevelGreen

 

I have tried using both infilters and outfilters as well as inbound and outbound feeds.

 

So far, nothing I have tried has ended up with an appropriate amount of indicators being listed (I check by using a regex search/count on the source alienvault feed).

 

Any thoughts? Am I missing something?

 

Thanks,

 

Mike

Hi @deanm,

I guess the reason is that filters are applied in order and the first matching filter is used (it works like a traditional firewall rulebase). In your case the rulebase accepts indicators in US, indicators in CA, and indicators of type "Malicious Host" (even if they are not in US or CA). If you want to use the type as an additional selector, put both conditions in the same filter, like this:

alienvault_reputation-Malicious_US-CA:
        class: minemeld.ft.csv.CSVFT
        config:
            attributes:
                confidence: 80
                share_level: green
                type: IPv4
            delimiter: '#'
            fieldnames:
            - indicator
            - alienvault_reliability
            - alienvault_risk
            - alienvault_type
            - alienvault_country
            outfilters:
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'US'
                - alienvault_type == 'Malicious Host'
                name: accept US
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'CA'
                - alienvault_type == 'Malicious Host'
                name: accept CA
            -   actions:
                - drop
                name: drop all
            interval: 3600
            source_name: alienvault.reputation
            url: http://reputation.alienvault.com/reputation.data
        description: Malicious US and CA alienvault hosts
        development_status: EXPERIMENTAL
        indicator_types:
        - IPv4
        node_type: miner
        tags:
        - OSINT
        - ShareLevelGreen

 

Please note that you will still see 55K indicators in the Miner, but only a subset of them should be emitted to the attached processors - you can check the UPDATE.RX counter on the processor to double check this.
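
The first-match behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rulebase logic, not MineMeld's own code: the `Rule` class, the `first_match` helper, and the sample indicators are all hypothetical, and the sketch assumes that every condition inside one filter has to hold for that filter to match.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Indicator = Dict[str, str]
Condition = Callable[[Indicator], bool]


@dataclass
class Rule:
    """Hypothetical stand-in for one outfilters entry."""
    name: str
    action: str
    conditions: List[Condition] = field(default_factory=list)


def first_match(rules: List[Rule], indicator: Indicator) -> Optional[str]:
    """Return the action of the first rule whose conditions all hold."""
    for rule in rules:
        # A rule without conditions (like "drop all") matches everything.
        if all(cond(indicator) for cond in rule.conditions):
            return rule.action
    return None  # no rule matched


def is_type(value: str) -> Condition:
    return lambda ind: ind.get("alienvault_type") == value


def is_country(value: str) -> Condition:
    return lambda ind: ind.get("alienvault_country") == value


# Rulebase from the question: three independent accept rules.
loose = [
    Rule("accept Malicious", "accept", [is_type("Malicious Host")]),
    Rule("accept US", "accept", [is_country("US")]),
    Rule("accept CA", "accept", [is_country("CA")]),
    Rule("drop all", "drop"),
]

# Rulebase from the reply: country and type combined in each rule.
strict = [
    Rule("accept US", "accept", [is_country("US"), is_type("Malicious Host")]),
    Rule("accept CA", "accept", [is_country("CA"), is_type("Malicious Host")]),
    Rule("drop all", "drop"),
]

# Made-up sample indicators, for illustration only.
samples = [
    {"alienvault_type": "Malicious Host", "alienvault_country": "DE"},
    {"alienvault_type": "Scanning Host", "alienvault_country": "US"},
    {"alienvault_type": "Malicious Host", "alienvault_country": "CA"},
]

loose_results = [first_match(loose, s) for s in samples]
strict_results = [first_match(strict, s) for s in samples]
print(loose_results)
print(strict_results)
```

With the three independent accept rules every sample is accepted, while the combined rules let through only the Malicious Host located in CA.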

Of course, that makes sense.  It seems to be working now, although the TX number is lower than the number I find when I do a regex search of the source file.

 

The main feed definition does not define inbound or outbound.  I am assuming it will default to inbound.  Is that assumption correct?

 

Thanks again for all your help!

Hi @deanm,

I have double checked the conditions, and you should change the config of the prototype to apply a more flexible match on the alienvault_type:

attributes:
    confidence: 80
    share_level: green
    type: IPv4
delimiter: '#'
fieldnames:
- indicator
- alienvault_reliability
- alienvault_risk
- alienvault_type
- alienvault_country
interval: 3600
outfilters:
-   actions:
    - accept
    conditions:
    - alienvault_country == 'US'
    - contains(alienvault_type, 'Malicious Host') == true
    name: accept US
-   actions:
    - accept
    conditions:
    - alienvault_country == 'CA'
    - contains(alienvault_type, 'Malicious Host') == true
    name: accept CA
-   actions:
    - drop
    name: drop all
source_name: alienvault.reputation
url: http://reputation.alienvault.com/reputation.data

About inbound and outbound: the IPv4 indicators generated by this prototype have no direction set. This means you can use the Miner with both Inbound and Outbound IPv4 processors.

How does the contains change things?  It did add more, but this time too many (go figure).

 

Is there a way to see what it is matching?

 

As always, thanks for the help!

 

And how did you figure out all of the commands?

Hi @deanm,

before, the filters were selecting only indicators whose alienvault_type is exactly equal to "Malicious Host"; now they select indicators whose alienvault_type contains "Malicious Host". Some indicators have a string with multiple alienvault_types, like "Spamming Host;Malicious Host;FooBar". Your regex was selecting only the indicators where Malicious Host was at the end of the type field, while the filters also select indicators where Malicious Host is in the middle or at the beginning of the type string. That's why you have more than expected.
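
The difference between the three ways of matching can be reproduced outside MineMeld with a short Python sketch. The type strings below are made up for illustration (FooBar is the placeholder from the explanation above), and the three helper functions are hypothetical stand-ins for the `==` condition, the `contains(...)` condition, and the original regex, which required Malicious Host to sit right before the country field.

```python
import re

# Made-up alienvault_type values, for illustration only.
TYPES = [
    "Malicious Host",
    "Spamming Host;Malicious Host",
    "Malicious Host;Scanning Host",
    "Spamming Host;Malicious Host;FooBar",
    "Scanning Host",
]


def exact(type_field: str) -> bool:
    """Mimics: alienvault_type == 'Malicious Host'."""
    return type_field == "Malicious Host"


def contains(type_field: str) -> bool:
    """Mimics: contains(alienvault_type, 'Malicious Host') == true."""
    return "Malicious Host" in type_field


def ends_with(type_field: str) -> bool:
    """Mimics the original regex: 'Malicious Host' is last in the type field."""
    return re.search(r"Malicious Host$", type_field) is not None


exact_hits = [t for t in TYPES if exact(t)]
ends_with_hits = [t for t in TYPES if ends_with(t)]
contains_hits = [t for t in TYPES if contains(t)]

print(len(exact_hits), len(ends_with_hits), len(contains_hits))
```

On this sample the exact match selects the fewest entries, the end-anchored regex selects more, and the substring match selects the most, which is the same ordering seen in the thread.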

 

You can see what is matching by looking at the logs of the processor node connected to the Miner and checking for RECVD_UPDATE messages.

 

I know the commands because I wrote the code 🙂 Seriously, you can find more details here: https://github.com/PaloAltoNetworks/minemeld-core/blob/master/docs/nodeconfig.rst

 

 

Thanks,

luigi

Thanks again and that definitely makes sense on why you are so versed in the commands :-).  

 

The tool is great, it has made my job of managing my EDLs much, much easier (I am testing it at home and will deploy at work soon).

 

Is there a way to validate the number of addresses matched?  Meaning, if I look at the current reputation.data file and use a regex such as

 

"egrep -ce '(\d{1,3}\.){3}\d{1,3}\#\d{1,}#\d{1,}\#.*Malicious.*\#(CA|US)' reputation.data"

 

and that returns a count of 3356, should I see a similar or same number in Minemeld's Update TX counter for that miner?  At least, for the first run after an engine start

 

Hi @deanm,

the best way to validate is to connect an IPv4 processor node to the Miner and then check the UPDATE.RX counter on the processor node. In 0.9.38 the UPDATE.TX counter on the Miner is incremented even if the indicator is dropped by the output filters; this will be fixed in the next release.

 

Luigi 
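
For an independent count to compare against UPDATE.RX, the filter conditions can be applied to the source file directly. The following Python sketch is an assumption-laden illustration rather than anything MineMeld ships: `count_matches` is a hypothetical helper, the sample lines are invented (documentation-range IPs), and the field order is taken from the `fieldnames` list in the prototype above.

```python
from typing import Iterable, Sequence

# Field order assumed from the prototype's fieldnames list.
FIELDS = (
    "indicator",
    "alienvault_reliability",
    "alienvault_risk",
    "alienvault_type",
    "alienvault_country",
)


def count_matches(
    lines: Iterable[str],
    countries: Sequence[str] = ("US", "CA"),
    wanted_type: str = "Malicious Host",
) -> int:
    """Count lines whose country is listed and whose type contains wanted_type."""
    count = 0
    for line in lines:
        parts = line.rstrip("\n").split("#")
        if len(parts) < len(FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(FIELDS, parts))
        if (
            record["alienvault_country"] in countries
            and wanted_type in record["alienvault_type"]
        ):
            count += 1
    return count


# Invented sample lines; the trailing fields are ignored by the sketch.
SAMPLE = [
    "192.0.2.10#4#2#Malicious Host#US#Example City#0.0,0.0#3",
    "192.0.2.11#4#2#Spamming Host;Malicious Host#CA##0.0,0.0#3",
    "198.51.100.7#3#2#Scanning Host#US##0.0,0.0#11",
    "203.0.113.5#4#2#Malicious Host#DE##0.0,0.0#3",
    "203.0.113.6#2#2#Malicious Host;Scanning Host#CA##0.0,0.0#3",
]

sample_count = count_matches(SAMPLE)
print(sample_count)
```

Against a downloaded copy of the feed, the same helper could be called as `count_matches(open("reputation.data"))`. The result should be close to the processor's UPDATE.RX only if the file has the assumed layout and was fetched at roughly the same time as the Miner's last poll.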
