URLHaus complete list help

L2 Linker

I am trying to pull the complete list from URLHaus (https://urlhaus.abuse.ch/api/), specifically the CSV feed (https://urlhaus.abuse.ch/downloads/csv/).

The problem is that the feed is huge: over 200k entries right now, so the PAN will not take it because of its limits.

 

I noticed that most of these URLs are marked offline, so fewer than 5k of them are online. This is good.

 

My question is this: how do I get a miner to grab only the online URLs? Has anyone else done this?

 

1 accepted solution

Accepted Solutions

L7 Applicator

Hi @Mattk,

You can do this:

- create a miner based on the CSV class

- use ignore_regex to ignore lines with an "offline" status

 

Example:

(screenshot: 2019-08-14_15-07-49.png)

 

Prototype config:

attributes:
    confidence: 80
    share_level: green
    type: URL
fieldnames:
- urlhaus_id
- urlhaus_dateadded
- indicator
- urlhaus_url_status
- urlhaus_threat
- urlhaus_tags
- urlhaus_link
- urlhaus_reporter
ignore_regex: (?:^#)|(?:.*,"offline",)
interval: 300
source_name: urlhaus.csv
url: https://urlhaus.abuse.ch/downloads/csv/
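
To see what the ignore_regex above does to the feed, here is a minimal Python sketch. It assumes the miner tests each raw line from the start of the line (as Python's re.match does) and drops the lines that match. The sample rows, the keep_lines helper, and the example.test URLs are made up for illustration; they are not taken from the real feed.

```python
import re

# Same pattern as ignore_regex in the prototype config above.
IGNORE_REGEX = re.compile(r'(?:^#)|(?:.*,"offline",)')

# Hypothetical sample rows in the quoted CSV layout that the fieldnames imply:
# id, dateadded, url, url_status, threat, tags, link, reporter.
SAMPLE_FEED = '''\
# id,dateadded,url,url_status,threat,tags,urlhaus_link,reporter
"1001","2019-08-14 10:00:00","http://example.test/a.exe","online","malware_download","exe","https://urlhaus.abuse.ch/url/1001/","reporter_a"
"1002","2019-08-14 10:05:00","http://example.test/b.exe","offline","malware_download","exe","https://urlhaus.abuse.ch/url/1002/","reporter_b"
"1003","2019-08-14 10:10:00","http://example.test/offline/c.exe","online","malware_download","exe","https://urlhaus.abuse.ch/url/1003/","reporter_c"
'''


def keep_lines(text):
    """Return the non-empty lines that the ignore pattern does not match."""
    return [
        line
        for line in text.splitlines()
        if line and IGNORE_REGEX.match(line) is None
    ]


kept = keep_lines(SAMPLE_FEED)
for line in kept:
    # Print only the id column of each surviving row.
    print(line.split(",")[0])
```

The first alternative drops the comment header and the second drops any row whose quoted status field is "offline". Because the pattern includes the surrounding commas and quotes, a URL that merely contains the word offline in its path (row 1003 here) is still kept.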

You are awesome! Thank you so much!
