PBF for YouTube Redirect to external proxy?


L1 Bithead

Since URL rewrite is still not an option in PanOS (I think they only recently announced that it may be on the roadmap), I've been investigating ways to redirect YouTube traffic to an external proxy server that supports rewriting. I need to do this to implement the new YouTube for Schools service.

Wondering if this is a workable scenario:

- Set up an address object group using FQDNs of YouTube servers (IPs would be hopeless)

- Set up PBF to redirect traffic destined for this address group out an interface (or just to an external next hop) that passes the traffic through an inline proxy that does the required URL rewrite

Worth the trouble to set it up?


L6 Presenter

The tricky part is that there are a huge number of YouTube servers out there, so you would need an address object with a leading wildcard such as "*.youtube.com", or you would have to figure out which IP addresses YouTube uses and set up your PBF to trigger on the destination IP.

The second tricky part is that the FQDN (to my knowledge, though this might have changed) is looked up when you perform the commit and translated into IP addresses, which are then loaded into the ASIC/FPGA for the dataplane.
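To illustrate why commit-time resolution hurts here, below is a minimal Python sketch. It assumes the behavior described above (the FQDN is resolved once at commit and only the resulting IPs are matched afterwards). The class `FqdnAddressObject`, the simulated DNS tables, and the documentation-range IPs are all hypothetical and only model the idea; this is not how PAN-OS is implemented.

```python
from typing import Callable, Dict, List, Set

Resolver = Callable[[str], List[str]]


class FqdnAddressObject:
    """Hypothetical model of an FQDN address object resolved only at commit."""

    def __init__(self, fqdn: str) -> None:
        self.fqdn = fqdn
        self.installed: Set[str] = set()  # IPs pushed to the dataplane

    def commit(self, resolve: Resolver) -> None:
        # The name is looked up once; only the resulting IPs are kept.
        self.installed = set(resolve(self.fqdn))

    def matches(self, dst_ip: str) -> bool:
        # Matching is done on the stored IPs, never on the name.
        return dst_ip in self.installed


# Simulated DNS answers (placeholder addresses from documentation ranges).
dns_at_commit: Dict[str, List[str]] = {"www.youtube.com": ["192.0.2.10", "192.0.2.11"]}
dns_later: Dict[str, List[str]] = {"www.youtube.com": ["198.51.100.20"]}

obj = FqdnAddressObject("www.youtube.com")
obj.commit(lambda name: dns_at_commit.get(name, []))

# A client resolves the name later and gets a different address.
client_dst = dns_later["www.youtube.com"][0]
print(obj.matches("192.0.2.10"))  # True: address known at commit time
print(obj.matches(client_dst))    # False: newer address bypasses the rule
```

With a service that rotates through many addresses, the second case would be the common one, which is why an FQDN-based rule looks fragile for this purpose.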

So in theory your best option would be a PBF rule based on App-ID. However, the admin guide strongly advises you NOT to do this, simply because the initial packets of a flow will not yet be identified as appid:youtube, and by the time they are, the flow is already set up. If your proxy then suddenly receives a burst of packets identified as YouTube, it will complain that it never received a SYN packet (if it's TCP traffic), and the drama begins.
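The missing-SYN problem can be shown with a small simulation. This is only a sketch under stated assumptions: `StatefulProxy`, `forward`, and the parameter `identify_after` (the number of packets that pass before the application is identified) are made-up names, and the addresses are placeholders.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Flow = Tuple[str, int, str, int]  # src ip, src port, dst ip, dst port


@dataclass(frozen=True)
class Packet:
    flow: Flow
    syn: bool = False


class StatefulProxy:
    """Hypothetical proxy that only accepts flows it saw a SYN for."""

    def __init__(self) -> None:
        self.known_flows: Set[Flow] = set()

    def receive(self, pkt: Packet) -> str:
        if pkt.syn:
            self.known_flows.add(pkt.flow)
            return "accept"
        return "accept" if pkt.flow in self.known_flows else "reset"


def forward(packets: List[Packet], identify_after: int, proxy: StatefulProxy) -> List[str]:
    """Send packets directly until the app is identified, then to the proxy."""
    results = []
    for index, pkt in enumerate(packets):
        if index < identify_after:
            results.append("direct")
        else:
            results.append("proxy:" + proxy.receive(pkt))
    return results


flow: Flow = ("10.0.0.5", 40000, "203.0.113.7", 80)
packets = [Packet(flow, syn=True)] + [Packet(flow)] * 4

# App identified only after three packets: the proxy never saw the SYN.
late = forward(packets, identify_after=3, proxy=StatefulProxy())
print(late)   # ['direct', 'direct', 'direct', 'proxy:reset', 'proxy:reset']

# Redirect decided on the first packet (e.g. by destination IP): all is well.
early = forward(packets, identify_after=0, proxy=StatefulProxy())
print(early)
```

The second run is the reason a rule that can be decided on the very first packet (destination IP, for example) does not have this problem.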

I wonder if it wouldn't be a better option to set up the magic with a DNS resolver?

So when your clients ask for *.youtube.com, you return the IP of the proxy. However, your proxy would then need to know which of the YouTube servers it should forward the traffic to, so I guess this won't work either.
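The DNS idea boils down to a suffix match in the resolver. Here is a rough Python sketch of just that decision; `PROXY_IP`, `OVERRIDE_SUFFIXES`, and the `upstream` dictionary standing in for real recursion are assumptions for illustration, not a real DNS server.

```python
from typing import Dict, Optional

PROXY_IP = "10.0.0.50"               # hypothetical address of the rewrite proxy
OVERRIDE_SUFFIXES = ("youtube.com",)  # zones answered with the proxy's IP


def is_overridden(qname: str) -> bool:
    """True for the zone itself and any name below it (the *.youtube.com case)."""
    name = qname.rstrip(".").lower()
    return any(name == s or name.endswith("." + s) for s in OVERRIDE_SUFFIXES)


def answer(qname: str, upstream: Dict[str, str]) -> Optional[str]:
    """Return the proxy IP for overridden names, else the normal answer."""
    if is_overridden(qname):
        return PROXY_IP
    return upstream.get(qname.rstrip(".").lower())


upstream = {"www.example.com": "192.0.2.1", "notyoutube.com": "192.0.2.2"}
print(answer("www.youtube.com", upstream))  # 10.0.0.50
print(answer("notyoutube.com", upstream))   # 192.0.2.2
```

Note what this does to the connection: the client now connects to the proxy's IP, so the original server address is no longer visible at the IP layer. The proxy would have to work out the real destination from the request itself, which is the concern raised above.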

Can't your clients use your proxy as a forward proxy instead?

This way, clients who haven't configured forward-proxy settings in their browser won't be able to reach YouTube (appid:youtube in your PA = deny), while the clients who need to reach YouTube must configure a forward proxy in their browser.

Yeah - I was afraid the FQDN was not going to resolve on the fly.  I thought they had a refresh mechanism for this now, but it may not have been actual address objects I was thinking of.

I could require all clients to set up a forward proxy, but I'm trying to make this transparent to the end users (not all of the clients are under AD control). Plus I'd then have to deal with all traffic going to the proxy, which I don't want; I only want YouTube to go there.

I'm liking the idea of a dns-resolver pointing to the proxy.  Why would there be a caveat of worrying about setting YouTube IPs in the proxy?  Only traffic looking for *.youtube.com would hit the proxy, and the proxy should be able to handle the lookups normally to provide the content.  Or am I not understanding it correctly?  Would I be able to accomplish this on the PA itself, or just through setting my internal DNS servers to resolve the youtube.com domain?


Because a webproxy can usually only work in two modes:

* Transparent, meaning that requests passing through it look like:

GET / HTTP/1.1
Host: www.example.com

which also means that the webproxy does no DNS resolving of its own (since it's transparent, the dstip of the flow is already set to the real dstip, in your case the IP of www.youtube.com or whatever).

* Non-transparent, used for forward-proxy situations, which means that the requests sent to it by the clients look like:

GET http://www.example.com/ HTTP/1.1

(for HTTPS the client instead asks for a tunnel with CONNECT www.example.com:443 HTTP/1.1), which means that the webproxy does a DNS lookup to find out which IP www.example.com has and then connects to it. So on the inside the packet has dstip = proxy IP and on the outside dstip = IP of www.example.com, while the request line is rewritten from GET http://www.example.com/ into GET / and so on.

The transparent mode can be used for destination-NAT situations as well, meaning that the proxy accepts a connection on, let's say, TCP 8000 and then statically forwards it to the IP of www.example.com on TCP 80 (statically in the sense that the webproxy doesn't read the actual contents; it knows beforehand what the dstip should be changed to).
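The difference between the two modes shows up in the request line. As a small Python sketch (helper names are made up, plain HTTP only): a client talking directly to a server, or through a transparent proxy, sends the path plus a Host header, while a client configured with a forward proxy sends the absolute URL; CONNECT host:port is what a browser uses to ask a forward proxy for an HTTPS tunnel.

```python
from urllib.parse import urlsplit


def origin_form(url: str) -> str:
    """Request as seen by a transparent proxy: path only, host in the header."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def absolute_form(url: str) -> str:
    """Request as sent to a configured forward proxy: full URL in the request line."""
    parts = urlsplit(url)
    return f"GET {url} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def upstream_host(request: str) -> str:
    """Host name a proxy would have to look up if it cannot rely on the dstip."""
    lines = request.split("\r\n")
    _method, target, _version = lines[0].split(" ")
    if target.startswith("http://"):
        return urlsplit(target).hostname or ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    raise ValueError("request carries no host information")


url = "http://www.example.com/"
print(origin_form(url).splitlines()[0])    # GET / HTTP/1.1
print(absolute_form(url).splitlines()[0])  # GET http://www.example.com/ HTTP/1.1
print(upstream_host(absolute_form(url)))   # www.example.com
```

In forward-proxy mode the proxy reads the target from the request line and resolves it itself; in transparent mode it would normally just reuse the destination IP the packet already carries.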

Here is the problem (I believe) in your case: www.youtube.com isn't just a single IP, and the streaming servers use a bunch of IP ranges.

If YouTube always used, let's say, one fixed IP range, then you could of course just add a static route for that range in your L3 core, so that this traffic is sent to your webproxy (acting in transparent mode) while the other traffic is sent to your PAN device.

Since Google (which owns YouTube) uses its own AS (AS15169), you could route just its current ranges through your transparent webproxy. This way you would, with at least 99% probability (or so), send YouTube traffic through your webproxy (with the downside that all other Google-related traffic would pass through it too).

This could be done through some BGP router magic or by statically putting the Google (or YouTube) ranges in your L3 core to force them through your webproxy. Another downside here is that AS15169 announces plenty of ranges: http://www.robtex.com/as/as15169.html#bgp
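What the static routes in the L3 core would effectively do is a prefix match on the destination. A minimal Python sketch of that decision follows. The prefixes are placeholders from the documentation ranges, NOT real AS15169 prefixes (those would have to be taken from the BGP table), and `next_hop` is a made-up helper.

```python
import ipaddress

# Placeholder prefixes only; substitute the ranges actually announced by the AS.
PROXY_PREFIXES = [
    ipaddress.ip_network(prefix)
    for prefix in ("192.0.2.0/24", "198.51.100.0/24")
]


def next_hop(dst_ip: str) -> str:
    """Pick the next hop the way a static route per prefix would."""
    addr = ipaddress.ip_address(dst_ip)
    if any(addr in net for net in PROXY_PREFIXES):
        return "webproxy"
    return "pan"


print(next_hop("192.0.2.77"))   # webproxy
print(next_hop("203.0.113.9"))  # pan
```

Because the decision depends only on the destination IP, it is made on the first packet of the flow, so the proxy sees the SYN. The price is that the prefix list has to be kept in sync with what the AS really announces.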

So I think you have found yourself in a spot that isn't easy to get out of with one solution that covers all cases.

To sum it up (in case I misunderstood something):

0) Your requirement is that YouTube traffic is sent through your webproxy while all other traffic is sent through your PAN device.

1) You don't want to send all Internet traffic through your webproxy, my guess being due to performance issues (you would need a bigger webproxy first?).

2) You can't use PBF based on App-ID, since the application is detected only after the flow is initiated. When PBF then kicks in and sends your YouTube traffic to your webproxy, the webproxy becomes a drama queen because the incoming flow is missing its SYN packet.

3) You can't configure the clients' browsers with forward-proxy settings (so that the client itself would send its traffic through the webproxy when needed), since many of them are standalone boxes (not part of your AD structure).

4) You can't set up just a few static routes in your L3 core to force YouTube traffic through your webproxy (while the other Internet traffic goes through your PAN), since YouTube doesn't use just a few IP ranges (it uses plenty of different ones).

Given the options and your demands, option 4 above (static routes for the Google ranges) is the one that comes closest to your needs, with the drawback that other non-YouTube traffic (though only Google-related traffic) will also be sent through your webproxy. You would also need to maintain these L3 routes (take a peek at the BGP table every now and then) in case Google adds or removes routes. This can be partly handled by setting up a deny for YouTube traffic on your PAN device; that way, if something changes and you haven't noticed yet, you will get support cases.

Unless I completely misunderstood your case?

Hopefully someone else in here might have ideas on how to resolve your problem.

No - you've got it pretty clearly - thanks!

Now I just need to work out which is the lesser of all those evils.  I may split the difference - PBF for students/unknown users through a transparent proxy (all traffic), while staff/teachers have normal access.  There are some testing/assessment sites that I wouldn't want to go through the proxy, but I have those as actual IP addresses so making that exception would be easy.
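The split I have in mind could be sketched like this in Python. The group names, the placeholder assessment-site address, and the helper `route_for` are all assumptions for illustration; the real thing would be PBF rules matched on source user/zone and destination address.

```python
import ipaddress

# Placeholder address standing in for the testing/assessment sites.
ASSESSMENT_SITES = {ipaddress.ip_address("203.0.113.10")}
PROXIED_GROUPS = {"students", "unknown"}


def route_for(user_group: str, dst_ip: str) -> str:
    """Decide where a new flow goes, exceptions first."""
    if ipaddress.ip_address(dst_ip) in ASSESSMENT_SITES:
        return "direct"             # never proxy the assessment sites
    if user_group in PROXIED_GROUPS:
        return "transparent-proxy"  # all other student/unknown traffic
    return "direct"                 # staff and teachers keep normal access


print(route_for("students", "198.51.100.5"))  # transparent-proxy
print(route_for("students", "203.0.113.10"))  # direct
print(route_for("staff", "198.51.100.5"))     # direct
```

Everything in this decision is known on the first packet (source user or zone, destination IP), so it avoids the App-ID timing issue discussed earlier.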

Ideally PA would just get URL-rewrite in the PanOS, and that would solve this for me.  The word I have through my vendor is they are hoping to add forced safesearch to a release late this year.  Hopefully that opens up more generic rewrite capability.
