How Does URL Filtering Work?

L2 Linker

Hi guys,

I'm slightly confused about how URL filtering works. Once you attach a URL Filtering profile to a rule, how does a URL or website get categorized? In other words, how does the PA know that a given website belongs to a given category?

I know that the PA periodically downloads a URL filtering DB. So is a website simply compared against this DB and the action taken, or am I missing something?

Please help me understand this.

Thanks


All Replies
L0 Member

Does this help answer your question?

 

Management plane cache: The seed database is placed into the management plane (MP) cache to provide quick URL lookups. The MP cache will pull more URLs and categories from the PAN-DB core as users access sites that are not currently in the MP cache. If the URL requested by a user is “unknown” to Palo Alto Networks, the URL will be examined, categorized, and implemented as appropriate.

 

Source: https://researchcenter.paloaltonetworks.com/2014/10/web-security-tips-pan-db-works/

L7 Applicator

In short, when a web-browsing session starts, the client sends an HTTP GET request (or, in the case of SSL, the server returns a certificate with a CN such as www.website.com). This gives the firewall the URL or hostname to categorize.

The firewall checks whether it has a locally cached category for the URL. If it does, it simply applies policy (allow, alert, block, continue). If not, it does a cloud lookup and asks the URL filtering cloud for the URL's category.

Once the category is returned, the firewall uses it to apply policy and also stores the URL and its category in the cache, so the next time it can retrieve the category from the local cache.
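
The flow above can be sketched roughly as follows. This is only an illustration of the cache-then-cloud logic, not how PAN-OS is implemented: the `PROFILE` mapping, the `categorize` and `apply_policy` helpers, and the `fake_cloud_lookup` function are all hypothetical names, and the category-to-action assignments are made up for the example.

```python
from typing import Callable, Dict, List

# Hypothetical per-category actions, standing in for a URL Filtering profile.
PROFILE: Dict[str, str] = {
    "malware": "block",
    "social-networking": "alert",
    "gambling": "continue",
}
DEFAULT_ACTION = "allow"  # assumption: categories not listed are allowed


def categorize(url: str, cache: Dict[str, str],
               cloud_lookup: Callable[[str], str]) -> str:
    """Return the category for url, checking the local cache first."""
    category = cache.get(url)
    if category is None:
        # Cache miss: ask the (simulated) cloud, then remember the answer.
        category = cloud_lookup(url)
        cache[url] = category
    return category


def apply_policy(url: str, cache: Dict[str, str],
                 cloud_lookup: Callable[[str], str]) -> str:
    """Map the URL's category to the action defined in the profile."""
    return PROFILE.get(categorize(url, cache, cloud_lookup), DEFAULT_ACTION)


cloud_calls: List[str] = []  # records every simulated cloud query


def fake_cloud_lookup(url: str) -> str:
    """Stand-in for the URL filtering cloud (hypothetical data)."""
    cloud_calls.append(url)
    return {"www.website.com": "social-networking"}.get(url, "unknown")


cache: Dict[str, str] = {}
first = apply_policy("www.website.com", cache, fake_cloud_lookup)   # cloud lookup
second = apply_policy("www.website.com", cache, fake_cloud_lookup)  # served from cache
print(first, second, len(cloud_calls))
```

The second request for the same URL never reaches `fake_cloud_lookup`, which is the point of caching the category after the first lookup.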

Tom Piens - PANgurus.com
Like my answer? check out my book! amazon.com/dp/1789956374
L2 Linker

Hi reaper,

Thanks for your response.

If I understand correctly, this means that the PA has information about all the URLs on the internet in its cache?

And if it can't find the category there, it reaches out to the cloud to resolve the category. In that case the PA needs internet access to query the cloud.

Thanks

L7 Applicator

Hi @mahmoodm

Well, not 'all' :)

The periodic download is a 'seed' file of popular URLs (which may not apply to your specific organization) that prepopulates your cache. But the cache, like any cache, is fleeting: entries time out and get purged.

The seed file is only there to save you some lookups for very popular websites, and these may be only a fraction of the URLs your organization uses.

So for URL filtering to work optimally, you need internet access, as eventually most lookups will happen in the cloud. The cache is simply there to serve the most popular URLs in your organization, as only the entries you use frequently are kept.
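
To make the "seeded but fleeting" behaviour concrete, here is a toy sketch of a cache that is prepopulated from a seed and drops entries that are not used within a time-to-live. The `CategoryCache` class, the 60-second TTL, the example hostnames, and the refresh-on-use rule are all assumptions made for illustration; the real timeout and eviction behaviour of the firewall is not described in this thread.

```python
from typing import Dict, Optional, Tuple


class CategoryCache:
    """Toy URL-category cache whose entries expire ttl seconds after last use."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # url -> (category, time the entry was last used)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def load_seed(self, seed: Dict[str, str], now: float) -> None:
        """Prepopulate the cache from a seed of popular URLs."""
        for url, category in seed.items():
            self._entries[url] = (category, now)

    def put(self, url: str, category: str, now: float) -> None:
        """Store a category learned from a cloud lookup."""
        self._entries[url] = (category, now)

    def get(self, url: str, now: float) -> Optional[str]:
        """Return the cached category, or None if missing or timed out."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        category, last_used = entry
        if now - last_used > self.ttl:
            del self._entries[url]  # timed out: purge the entry
            return None
        # Assumption: using an entry refreshes it, so popular URLs stay cached.
        self._entries[url] = (category, now)
        return category


# Time is passed in explicitly so the example is deterministic.
cache = CategoryCache(ttl=60)
cache.load_seed({"popular.example": "search-engines",
                 "rarely-used.example": "news"}, now=0)

hit_early = cache.get("popular.example", now=50)     # used, so refreshed
hit_late = cache.get("popular.example", now=100)     # still cached
expired = cache.get("rarely-used.example", now=100)  # never used, purged
print(hit_early, hit_late, expired)
```

In this sketch the seeded entry that is never requested disappears after the TTL, so a later request for it would need a cloud lookup, while the frequently used entry stays in the cache.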

Tom Piens - PANgurus.com

(Accepted solution)

L2 Linker

Hi reaper,

Thanks for the response, that clears things up.

Actually, I didn't find this information documented anywhere.

Thanks
