How to Allow Googlebot and other web crawlers through the Palo Alto Networks Firewall

Created On 09/25/18 17:15 PM - Last Modified 06/12/23 18:13 PM


Resolution


Question

How do we allow Googlebot and other web crawlers through the Palo Alto Networks firewall?

 

What is Googlebot or a Web Crawler?

A web crawler is a program that visits websites and reads their pages and other information in order to create entries for a search engine index.

 

Details

When a website is protected by a Palo Alto Networks firewall, allowing port 80 is enough for Google's web crawlers (spiders), or any other web crawler, to access the website, index its content, and add it to search results. However, when the security policy uses applications, there are additional requirements.

 

Answer

To allow Googlebot or any other web crawler through the firewall, the 'web-crawler' application must be allowed in addition to the applications already allowed (web-browsing, ping, flash, etc.).

For 'web-crawler' to work properly, 'web-browsing' must also be allowed. See the 'Depends on Applications:' field in the application details screenshot.

[Image: 2016-04-19_web-crawl.png. Web-Crawler detail screen from Objects > Applications]
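
The dependency described above can be checked mechanically before a rule is committed. The following is a minimal Python sketch, not a PAN-OS API: the `DEPENDS_ON` table and the `missing_dependencies` helper are hypothetical names introduced for illustration, and only the 'web-crawler' to 'web-browsing' dependency is taken from this article.

```python
# Hypothetical dependency table; only this single entry comes from the article.
DEPENDS_ON = {"web-crawler": {"web-browsing"}}


def missing_dependencies(allowed_apps):
    """Return applications that must also be allowed, sorted by name."""
    allowed = set(allowed_apps)
    missing = set()
    for app in allowed:
        # Collect every dependency of this app that the rule does not allow yet.
        missing |= DEPENDS_ON.get(app, set()) - allowed
    return sorted(missing)


if __name__ == "__main__":
    # A rule that allows web-crawler without web-browsing is incomplete.
    print(missing_dependencies(["web-crawler", "ping"]))
    # A rule that allows both has nothing missing.
    print(missing_dependencies(["web-crawler", "web-browsing"]))
```

An empty result means that every dependency known to the table is already covered by the rule's application list.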

Note: If your security policy needs to restrict web crawling by a specific web crawler, the administrator must match on that crawler's source IP addresses in the security policy. At this time, Palo Alto Networks does not have a separate application for "Googlebot".
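
Because a specific crawler can only be matched by source IP, it may help to test candidate addresses against the ranges you plan to put in the rule. The sketch below uses Python's standard `ipaddress` module. The `CRAWLER_RANGES` list and the `is_crawler_source` helper are hypothetical, and the ranges are placeholders from the RFC 5737 documentation blocks, not real Googlebot addresses. Substitute the ranges published by the crawler's operator.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT real crawler addresses.
CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_crawler_source(ip):
    """Return True if the address falls inside any configured crawler range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in CRAWLER_RANGES)


if __name__ == "__main__":
    print(is_crawler_source("192.0.2.15"))   # inside the first placeholder range
    print(is_crawler_source("203.0.113.9"))  # outside both placeholder ranges
```

The same list of ranges would then be used as the source addresses of the security policy rule.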

 

owner: acamacho


