04-18-2016 07:58 AM
Today, the stdlib.aggregatorURL aggregator processes a list of URLs, removes duplicates, and manages withdrawals/whitelists. However, no optimization is performed on the output of this aggregator. I would like to recommend the following enhancements:
1. Removal of superfluous URLs
URLs that are made redundant by shorter, wildcard URLs should be removed from the output list.
Example:
*.domain.com
subdomain.domain.com <-- REMOVE
host.subdomain.domain.com <-- REMOVE
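A minimal sketch of the pruning requested above, not the aggregator's actual code. It assumes every entry is a bare hostname or a leading-wildcard hostname of the form `*.domain.com`, with no path component; the function name `remove_covered` is hypothetical.

```python
def remove_covered(entries):
    """Drop entries already matched by a shorter wildcard entry in the same list."""
    # Suffixes such as ".domain.com", taken from wildcard entries like "*.domain.com".
    suffixes = {e[1:] for e in entries if e.startswith("*.")}
    kept = []
    for entry in entries:
        # Compare on the hostname part only, so nested wildcards are pruned too.
        host = entry[2:] if entry.startswith("*.") else entry
        labels = host.split(".")
        # Test each parent domain of this entry against the wildcard suffixes.
        covered = any(
            "." + ".".join(labels[i:]) in suffixes
            for i in range(1, len(labels))
        )
        if not covered:
            kept.append(entry)
    return kept


urls = ["*.domain.com", "subdomain.domain.com", "host.subdomain.domain.com", "other.com"]
pruned = remove_covered(urls)
```

In this sketch the bare entry `domain.com` is kept even when `*.domain.com` is present, on the assumption that the wildcard matches subdomains only.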
2. Convert to lowercase before removing duplicate URLs
The aggregator output today could include duplicate URLs containing mixed case letters. This can be addressed by converting all URL strings to lowercase before the removal of duplicates.
Example:
login.microsoftonline.com
Login.microsoftonline.com
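A minimal sketch of the case-insensitive de-duplication described above, assuming entries arrive as plain strings; the function name `dedupe_case_insensitive` is hypothetical. Hostnames are case-insensitive, but URL paths may not be, so lowercasing the whole string is only safe if the entries carry no case-sensitive path.

```python
def dedupe_case_insensitive(entries):
    """Lowercase every entry and keep the first occurrence of each."""
    seen = set()
    result = []
    for entry in entries:
        key = entry.strip().lower()
        # Skip blank lines and anything already emitted in lowercase form.
        if key and key not in seen:
            seen.add(key)
            result.append(key)
    return result


deduped = dedupe_case_insensitive(
    ["login.microsoftonline.com", "Login.microsoftonline.com"]
)
```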
3. Sorted output
This one is more cosmetic in nature, but it will help users when troubleshooting. Sorting the aggregator output will save time when firewall administrators need to look up URLs received in an EDL. Today, the output is not sorted.
04-19-2016 01:36 AM
1) and 2) should be supported by a URL-specific aggregator. ER#9 has been opened to track this.
Output lists are in fact sorted, but by default they are ordered by time of update, with the most recent entries at the top of the list.
04-19-2016 04:55 AM
Not to derail the original post, but are all the output lists (IPv4 specifically) sorted with the newest ones at the top? I ask as I am trying to solve for how to handle the lower EBL counts on the PA-500 and PA-3020.
04-19-2016 06:24 AM
Hi greg.rohel,
Yes, the same applies to all the lists generated by Output nodes of class RedisSet (all the "feed*" prototypes are based on this class).
To cope with platform limits, you can split a list into multiple sublists using the s (start index) and n (number of entries) URL parameters:
Examples:
- topmost 1000 elements of the list
https://<minemeld>/feeds/feed1?n=1000
- elements 1000-2000 of the list
https://<minemeld>/feeds/feed1?s=1000&n=1000
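As a rough illustration of the splitting described above, the sketch below builds one feed URL per sublist from the s and n parameters. The host `minemeld.example`, the function name `feed_page_urls`, and the assumption that the caller already knows the total list size are all hypothetical.

```python
from urllib.parse import urlencode


def feed_page_urls(base_url, total, page_size):
    """Return one feed URL per sublist of at most page_size entries."""
    urls = []
    for start in range(0, total, page_size):
        # The first sublist needs only n; later ones also need the s offset.
        params = {"n": page_size} if start == 0 else {"s": start, "n": page_size}
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


pages = feed_page_urls("https://minemeld.example/feeds/feed1", 2500, 1000)
```

Each resulting URL could then be configured as a separate list on a platform with a low per-list entry limit.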
The indicator attribute used to sort the list is set in the prototype, and changing it will be possible in the next release of MM. The parameter is called scoring_attribute, and by default it is set to last_seen.