Filter items from source feed

deanm · ‎04-29-2017

One of the feeds I would like to import is the alienvault feed. However, I only want a subset of the IPs listed. I have tried using a regex with a transform to limit the results, but the miner is still showing an indicator count of 54,000.

I cloned the alienvault prototype and changed it to this:

my_alienvaultreputation:
    config:
        attributes:
            confidence: 80
            share_level: green
            type: IPv4
        delimiter: '#'
        fieldnames:
        - indicator
        - alienvault_reliability
        - alienvault_risk
        - alienvault_type
        indicator:
            regex: '([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})(.*Malicious Host\#(US|CN))'
            transform: '\1'
        interval: 3600
        source_name: alienvault.reputation
        url: http://reputation.alienvault.com/reputation.data
    description: Malicious US and Canada only alienvault reputation entries
    development_status: EXPERIMENTAL
    indicator_types:
    - IPv4
    node_type: miner
    tags:
    - OSINT
    - ShareLevelGreen

The regex itself works, at least in Sublime Text: a regex search of the alienvault reputation list shows approximately 8,000 matches.

Is this not possible or is something wrong with the prototype?

Thanks,

Mike

lmori · ‎05-02-2017

Hi @deanm,

The CSV Miner does not support regex on indicators, but you can change the prototype to the following config, which extracts the country from the alienvault data and stops the Miner from propagating indicators whose country is not US or CN:

attributes:
    confidence: 80
    share_level: green
    type: IPv4
delimiter: '#'
fieldnames:
- indicator
- alienvault_reliability
- alienvault_risk
- alienvault_type
- alienvault_country
- alienvault_city
interval: 3600
outfilters:
-   actions:
    - accept
    conditions:
    - alienvault_country == 'CN'
    name: accept CN
-   actions:
    - accept
    conditions:
    - alienvault_country == 'US'
    name: accept US
-   actions:
    - drop
    name: drop all
source_name: alienvault.reputation
url: http://reputation.alienvault.com/reputation.data
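
To illustrate how the delimiter and fieldnames settings above relate to a feed line, here is a minimal Python sketch. It is not MineMeld's code: the sample line is made up, the helper name `parse_line` is hypothetical, and the column order is assumed to follow the fieldnames list in the config.

```python
# Hypothetical sample line; the real feed's values and trailing columns may differ.
SAMPLE_LINE = "198.51.100.7#4#2#Malicious Host#US#Dallas#32.78,-96.80#3"

# Field names copied from the config in the reply above.
FIELDNAMES = [
    "indicator",
    "alienvault_reliability",
    "alienvault_risk",
    "alienvault_type",
    "alienvault_country",
    "alienvault_city",
]


def parse_line(line, fieldnames=FIELDNAMES, delimiter="#"):
    """Split one feed line and pair each column with its field name.

    Columns beyond the listed field names are ignored in this sketch.
    """
    columns = line.strip().split(delimiter)
    return dict(zip(fieldnames, columns))


record = parse_line(SAMPLE_LINE)
print(record["indicator"], record["alienvault_country"])
```

Once the country lands in its own named field, a filter condition such as alienvault_country == 'US' has something to compare against, which is why the extra fieldnames are added before the outfilters.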

deanm · ‎05-03-2017

Thank you for the info, that is great. I am adding it now and will let you know if I have any issues. It does bring up two questions.

Is there a way, other than cloning and modifying, to edit prototypes?

Also, is there a doc somewhere that lists all the fields/options for the prototypes?

Thanks again!

Mike

lmori · ‎05-03-2017

Hi @deanm,

You can also edit the file /opt/minemeld/local/prototypes/minemeldlocal.yml and restart the engine.

An alternative, if you want to share your prototype, is to create a simple external extension containing the prototype and share the extension. Example: https://github.com/PaloAltoNetworks/minemeld-cef

The config options available depend on the class of the node. You can find some details here: https://github.com/PaloAltoNetworks/minemeld-core/blob/master/docs/nodeconfig.rst

deanm · ‎05-05-2017

I thought this solved the question, but it does not appear that it did.

The node shows the full 55,000+ indicators, which may be normal with filters (I am not sure), but I am not seeing any addresses that should match the filters in any of my feeds. The full prototype configuration is:

alienvault_reputation-Malicious_US-CA:
    config:
        attributes:
            confidence: 80
            share_level: green
            type: IPv4
        delimiter: '#'
        fieldnames:
        - indicator
        - alienvault_reliability
        - alienvault_risk
        - alienvault_type
        - alienvault_country
        outfilters:
        -   actions:
            - accept
            conditions:
            - alienvault_type == 'Malicious Host'
        -   actions:
            - accept
            conditions:
            - alienvault_country == 'US'
        -   actions:
            - accept
            conditions:
            - alienvault_country == 'CA'
        -   actions:
            - drop
        interval: 3600
        source_name: alienvault.reputation
        url: http://reputation.alienvault.com/reputation.data
    description: Malicious US and CA alienvault hosts
    development_status: EXPERIMENTAL
    indicator_types:
    - IPv4
    node_type: miner
    tags:
    - OSINT
    - ShareLevelGreen

I have tried using both infilters and outfilters as well as inbound and outbound feeds.

So far, nothing I have tried has ended up with an appropriate number of indicators being listed (I check by using a regex search/count on the source alienvault feed).

Any thoughts? Am I missing something?

Thanks,

Mike

lmori · ‎05-05-2017

Hi @deanm,

I guess the reason is that filters are applied in order and the first matching filter is used (it works like a traditional firewall rulebase). In your case the rulebase accepts indicators in US, indicators in CA, and indicators of type "Malicious Host" (even if they are not in US or CA). If you want to use the type as an additional selector, you should use this:

alienvault_reputation-Malicious_US-CA:
        class: minemeld.ft.csv.CSVFT
        config:
            attributes:
                confidence: 80
                share_level: green
                type: IPv4
            delimiter: '#'
            fieldnames:
            - indicator
            - alienvault_reliability
            - alienvault_risk
            - alienvault_type
            - alienvault_country
            outfilters:
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'US'
                - alienvault_type == 'Malicious Host'
                name: accept US
            -   actions:
                - accept
                conditions:
                - alienvault_country == 'CA'
                - alienvault_type == 'Malicious Host'
                name: accept CA
            -   actions:
                - drop
                name: drop all
            interval: 3600
            source_name: alienvault.reputation
            url: http://reputation.alienvault.com/reputation.data
        description: Malicious US and CA alienvault hosts
        development_status: EXPERIMENTAL
        indicator_types:
        - IPv4
        node_type: miner
        tags:
        - OSINT
        - ShareLevelGreen

Please note that you will still see 55K indicators in the Miner, but only a subset of them should be emitted to the attached processors - you can check the UPDATE.RX counter on the processor to double check this.
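
The first-match behaviour described in this reply can be sketched in Python. This is only an illustration under stated assumptions, not MineMeld's implementation: the function `first_match`, the rule tuples, and the sample records are all hypothetical, and conditions within one rule are assumed to be ANDed together, as the combined config above implies.

```python
def first_match(rules, record):
    """Return the action of the first rule whose conditions all hold.

    Each rule is (name, conditions, action). A rule with an empty
    condition list matches every record. Returns None if nothing matches.
    """
    for _name, conditions, action in rules:
        if all(condition(record) for condition in conditions):
            return action
    return None


def is_malicious(record):
    return record.get("alienvault_type") == "Malicious Host"


def in_country(code):
    def check(record):
        return record.get("alienvault_country") == code
    return check


# Rule order from the question: type alone is enough to be accepted.
ORIGINAL_RULES = [
    ("accept type", [is_malicious], "accept"),
    ("accept US", [in_country("US")], "accept"),
    ("accept CA", [in_country("CA")], "accept"),
    ("drop all", [], "drop"),
]

# Rule order from the reply: country AND type must both match.
COMBINED_RULES = [
    ("accept US", [in_country("US"), is_malicious], "accept"),
    ("accept CA", [in_country("CA"), is_malicious], "accept"),
    ("drop all", [], "drop"),
]

# Hypothetical records, not taken from the real feed.
malicious_de = {"alienvault_type": "Malicious Host", "alienvault_country": "DE"}
scanner_us = {"alienvault_type": "Scanning Host", "alienvault_country": "US"}
malicious_us = {"alienvault_type": "Malicious Host", "alienvault_country": "US"}

for record in (malicious_de, scanner_us, malicious_us):
    print(first_match(ORIGINAL_RULES, record), first_match(COMBINED_RULES, record))
```

With the original order, a malicious host in DE and a scanning host in US are both accepted by an early rule and never reach the final drop. With the combined rules, only the malicious US host is accepted.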

deanm · ‎05-05-2017

Of course, that makes sense. It seems to be working now, although the TX number is lower than the number I find when I do a regex search of the source file.

The main feed definition does not define inbound or outbound. I am assuming it will default to inbound. Is that assumption correct?

Thanks again for all your help!

lmori · ‎05-07-2017

Hi @deanm,

I have double checked the conditions, and you should change the config of the prototype to apply a more flexible match on the alienvault_type:

attributes:
    confidence: 80
    share_level: green
    type: IPv4
delimiter: '#'
fieldnames:
- indicator
- alienvault_reliability
- alienvault_risk
- alienvault_type
- alienvault_country
interval: 3600
outfilters:
-   actions:
    - accept
    conditions:
    - alienvault_country == 'US'
    - contains(alienvault_type, 'Malicious Host') == true
    name: accept US
-   actions:
    - accept
    conditions:
    - alienvault_country == 'CA'
    - contains(alienvault_type, 'Malicious Host') == true
    name: accept CA
-   actions:
    - drop
    name: drop all
source_name: alienvault.reputation
url: http://reputation.alienvault.com/reputation.data

About inbound and outbound: with this prototype, the generated IPv4 indicators have no direction setting. This means you can use the Miner with both Inbound and Outbound IPv4 processors.

deanm · ‎05-07-2017

How does the contains change things? It did add more, but this time too many (go figure).

Is there a way to see what it is matching?

As always, thanks for the help!

And how did you figure out all of the commands?

lmori · ‎05-09-2017

Hi @deanm,

Before, the filters were selecting only indicators with alienvault_type equal to "Malicious Host"; now they select indicators where alienvault_type contains "Malicious Host". Some indicators have a string with multiple alienvault_types, like "Spamming Host;Malicious Host;FooBar". Your regex was selecting only the indicators where Malicious Host was at the end of the type field; the filters instead also select indicators where Malicious Host is in the middle or at the beginning of the type string. That's why you have more than expected.

You can see what is matching by looking at the logs of the processor node connected to the Miner and checking for RECVD_UPDATE messages.

I know the commands because I wrote the code 🙂 Seriously, you can find more details here: https://github.com/PaloAltoNetworks/minemeld-core/blob/master/docs/nodeconfig.rst

Thanks,

luigi
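
The three kinds of match discussed in this reply (exact equality, contains, and "Malicious Host" at the end of the type field, as the original regex required) can be compared with a small Python sketch. The helper names and all sample strings except the one quoted in the reply are hypothetical, and the ";" separator is assumed from that quoted example.

```python
import re

TARGET = "Malicious Host"


def equals(type_field):
    """Exact match, like alienvault_type == 'Malicious Host'."""
    return type_field == TARGET


def contains(type_field):
    """Substring match, like contains(alienvault_type, 'Malicious Host')."""
    return TARGET in type_field


def ends_with(type_field):
    """Target at the end of the field, which is what the regex in the
    question required (the type had to sit right before '#US' or '#CN')."""
    return re.search(r"Malicious Host$", type_field) is not None


SAMPLES = [
    "Malicious Host",
    "Spamming Host;Malicious Host",
    "Spamming Host;Malicious Host;FooBar",  # example string from the reply
    "Scanning Host",
]

for value in SAMPLES:
    print(f"{value!r}: equals={equals(value)} "
          f"contains={contains(value)} ends_with={ends_with(value)}")
```

Equality is the strictest of the three, ends-with sits in the middle, and contains is the most permissive, which is consistent with the counts going from too few to too many.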

deanm · ‎05-10-2017

Thanks again and that definitely makes sense on why you are so versed in the commands :-).

The tool is great, it has made my job of managing my EDLs much, much easier (I am testing it at home and will deploy at work soon).

Is there a way to validate the number of addresses matched? Meaning, if I look at the current reputation.data file and use a regex such as

"egrep -ce '(\d{1,3}\.){3}\d{1,3}\#\d{1,}#\d{1,}\#.*Malicious.*\#(CA|US)' reputation.data"

and that returns a count of 3356, should I see a similar or the same number in MineMeld's Update TX counter for that miner? At least, for the first run after an engine start.

lmori · ‎05-11-2017

Hi @deanm,

The best way to validate is to connect an IPv4 processor node to the Miner and then check the UPDATE.RX counter on the processor node. In 0.9.38 the UPDATE.TX counter on the Miner is incremented even if the indicator is dropped by the output filters; this will be fixed in the next release.

Luigi
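
For an independent number to compare against the processor's UPDATE.RX counter, a field-based count over a local copy of the feed is one option. The Python sketch below is a rough aid under stated assumptions: `count_matches` is a hypothetical helper, the sample lines are invented, and the column positions (type in the fourth column, country in the fifth) are taken from the fieldnames used in this thread.

```python
def count_matches(lines, countries=("US", "CA"), type_substring="Malicious Host"):
    """Count lines whose country is listed and whose type contains the substring.

    Mirrors the contains-based filters from this thread: country in the
    fifth '#'-separated column, type in the fourth.
    """
    total = 0
    for line in lines:
        columns = line.strip().split("#")
        if len(columns) < 5:
            continue  # skip blank or malformed lines
        type_field, country = columns[3], columns[4]
        if country in countries and type_substring in type_field:
            total += 1
    return total


# Hypothetical lines standing in for reputation.data.
SAMPLE = [
    "198.51.100.7#4#2#Malicious Host#US#Dallas#32.78,-96.80#3",
    "203.0.113.9#4#2#Spamming Host;Malicious Host#CA#Toronto#43.65,-79.38#3",
    "192.0.2.44#3#2#Scanning Host#US##38.0,-97.0#11",
    "192.0.2.45#3#2#Malicious Host#CN#Beijing#39.9,116.4#3",
]

print(count_matches(SAMPLE))  # 2
```

Against a downloaded file, the same function can be fed an open file handle, for example `count_matches(open("reputation.data"))`. The result should only be expected to be close to UPDATE.RX if the file was fetched around the same time as the Miner's poll, since the feed changes between downloads.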
