Streaming video server disconnecting every 30 seconds


L4 Transporter

Hi folks.

I'm tearing my hair out with this one, so I'm hoping that someone can point me in the right direction.

We have an installation of the Unreal Streaming Media server running in our DMZ off our Palo Alto firewall. This server is used as a central access point for both receiving and distributing streamed audio and video for business purposes over the internet and internally.

This server receives streams from another Unreal (http://www.umediaserver.net) product, and the streams are played out by another of their products.

I've had to put in an application override to get this working, because it's not an app that the PAN recognises - so I've stuck it in for the two ports concerned, and applied rules accordingly.

The problem comes when we actually try to USE it.

We can connect the source (encoder) to the streaming server no problems - for exactly 30 seconds.

Then the connection drops. And stays down for another 30 seconds. Then reconnects (it tries to reconnect automatically) for another 30 seconds. Then drops again.

I KNOW this is a firewall issue - I can stream perfectly well INSIDE my network (across different segments, so it's not a routing issue either). There's got to be SOMETHING in the firewall which is breaking this connection so consistently - but I can't figure out what the heck it is!

(As a test, I have completely removed ALL access restrictions on the device in the DMZ - dangerous, yes, I know - and the problem STILL exists).

Can someone point out to me something - anything - which might be causing this 30 second disconnect? It's far too regular to be a random issue - and the 30 seconds sounds like some timer somewhere or another, but I damn well can't figure out WHAT is causing it.

Anyone who points out a solution and is in Sydney I owe a beer to!

Thanks.

1 accepted solution

Accepted Solutions

Does this sound an awful lot like the issue you were seeing with app overrides?

Issue 48994 (High, 4.1.11): Session setup timeouts in 10 seconds when using app-override with offloading

Symptom: TCP sessions time out after 10 seconds.

Description: TCP sessions that matched an application override policy were being closed after a few seconds and the packets were being dropped because the application override was being invoked too early in the handshake process, causing the TCP timeout to be set too low.

Workaround: Disable offloading using CLI command "set session offload no"

5.0.3, 4.1.11-h2, 4.1.12


L3 Networker

Darren, so you are using all three components :  Unreal *client*, Unreal *Media Server*, and Unreal *Live server* ? And the firewall between each leg ?

I'm not sure if it would be of any help, but I am also using this setup, except that instead of the Unreal Client I connect to the Media server using the MMS: protocol, which works correctly through the firewall and is supported by Windows/VLC/Android (VPlayer).

My experience with the provided client is rather poor (unless you really need authentication which might be an issue with the MMS protocol)

Cheers.

Hi.

We have the Unreal Media Server on a central server inside our DMZ.

We have a number of Unreal Live server instances, depending on client/source requirements, which offload their streams to the server instance.

And we have the Unreal Stream Media Client for the end users who need to listen to the streams.

We don't use MMS - we use the proprietary UMS because we need extremely low latency for our audio sources, and that was the most efficient method available at the time this was all set up.

And now, of course, inertia holds us to using the older stuff instead of getting something using h.264 running - I can't do anything about that, but the PAN devices are breaking the Unreal Streaming, and I don't know why.

Cheers

L6 Presenter

What if you set up your security rule like this?

app: any

service: TCPxxx (or UDPxxx)

This way the firewall ignores which app is being identified and only looks at the port(s) being used.

You can also check Device -> Setup -> Session Timeouts: if you change the UDP timeout from 30 to, let's say, 60, see whether your dropped sessions end after 60 seconds instead of 30.

However, the session timeouts are basically for idle time, so if the drops do move to 60 seconds after changing the default value above, it would suggest that your application sends data and then stops (or gets stopped), at which point the session timeout kicks in and kills the session.
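To make the "idle time" point concrete, here is a very simplified model of an idle timer on a session table entry. It is only an illustration of the general idea (the timer restarts on every packet and the session is removed once the gap between packets exceeds the timeout). The class and method names are made up for this sketch and it is not how PAN-OS is actually implemented.

```python
class IdleSession:
    """Toy model of a firewall session entry with an idle timeout.

    Hypothetical illustration only; names and behaviour are assumptions,
    not PAN-OS internals.
    """

    def __init__(self, timeout_s, created_at=0.0):
        self.timeout_s = timeout_s
        self.last_seen = created_at

    def expired(self, now):
        # The session ages out once nothing has been seen for longer
        # than the idle timeout.
        return now - self.last_seen > self.timeout_s

    def packet(self, now):
        """Return True if the packet is forwarded, False if it is dropped."""
        if self.expired(now):
            return False
        # Any forwarded packet restarts the idle timer.
        self.last_seen = now
        return True
```

With a 30 second timeout, packets at t=10 and t=39 would both pass, because each gap is under 30 seconds, while a packet at t=70 would be dropped because the session had been idle for 31 seconds. A stream that sends continuously should never hit this timer, which is why a regular 30 second drop on an active stream points at something else.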

It actually uses TCP, not UDP.

I'll try the service thing - I removed *all* applications previously (just set it to any), and left the port at (any), so I don't see this doing any good.

I'll have a fiddle with the timeouts tomorrow and see what I can find - right now, I'm rushing to get an alternative going - thank god for AWS!

Thanks.

OK, I've done the service change setup, and it made no difference.

Currently modifying the default TCP timeouts to see if this changes the frequency of dropouts from 30 seconds to 60 seconds.
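For anyone wanting to check whether the dropouts actually follow a timer change, one rough approach is to log connect and disconnect times and compare the typical session lifetime against the configured value. The sketch below assumes you have collected such events yourself as (timestamp in seconds, "up" or "down") pairs; the function names and the event format are invented for this example.

```python
from statistics import median


def session_lifetimes(events):
    """Return how long each session stayed up.

    `events` is a time-ordered list of (timestamp_seconds, state) pairs,
    where state is "up" or "down" (a hypothetical log format).
    """
    lifetimes = []
    up_at = None
    for ts, state in events:
        if state == "up":
            up_at = ts
        elif state == "down" and up_at is not None:
            lifetimes.append(ts - up_at)
            up_at = None
    return lifetimes


def matches_timer(lifetimes, timer_s, tolerance_s=2.0):
    """True if the median lifetime is within tolerance of the given timer."""
    return bool(lifetimes) and abs(median(lifetimes) - timer_s) <= tolerance_s
```

If the median lifetime moves from about 30 to about 60 after the change, the configured timeout is involved. If it stays at 30, the timer is coming from somewhere else.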

I've done some PCAPs - I might log a problem with support and see what they can make of it.

By the way, are you using only a single box (or, for that matter, active/passive with functional HA along with session sync)?

I'm thinking that, since this is a TCP app, you might be using an active/active setup where session sync for some reason doesn't work; when this flow is then switched to go over the other device, it gets dropped because that device (PA box) didn't have it as a valid session.

A workaround for such situations is to disable tcp-reject-non-syn in the zone protection profile (often used in asymmetric routing situations such as active/active setups, or when you have two independent PA devices with routers before and after the PA cluster that might take different next hops for a single session):

+ tcp-reject-non-syn - Reject non-SYN TCP packet for session setup

    global - Use global setting

    no - Accept non-SYN TCP. Note that allowing non-SYN TCP traffic may prevent file blocking policies from working as expected in cases where the client and/or server connection is not set after the block occurs.

    yes - Reject non-SYN TCP

hi.

Yes, I'm running a HA pair - but active/passive, not active/active. And there's no asymmetric routing - single next-hop out (failover is managed by VRRP on the external router), single next-hop in (likewise).

I'm not running a zone protection profile on either my outside or DMZ interfaces - are you suggesting I put one in?

Thanks

At least try disabling tcp-reject-non-syn, but I dunno - feels more like a long shot :)

It would be interesting if you could set up netcat on this server to listen on this port instead of the streaming video server app, and then use telnet or PuTTY (telnet) to connect to this netcat and type some text... wait for 35 seconds and type some more...
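If netcat and telnet aren't handy, the same "send, wait, send again" test can be scripted. This is a minimal sketch under a few assumptions: the echo listener is just a stand-in for netcat, and the function names, the 5 second reply timeout, and the loopback defaults are all made up for this example. In a real test the listener would run on the DMZ server and the probe would run from the outside.

```python
import socket
import threading
import time


def start_echo_listener(host="127.0.0.1", port=0):
    """Stand-in for a netcat listener: accept one connection, echo bytes back.

    Returns the port actually bound (port=0 lets the OS pick a free one).
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    bound_port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:  # peer closed the connection
                    break
                conn.sendall(data)
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return bound_port


def probe_idle(host, port, idle_seconds, reply_timeout=5.0):
    """Send a line, stay idle, send another. True if the second is echoed."""
    with socket.create_connection((host, port), timeout=reply_timeout) as s:
        s.sendall(b"first\n")
        if s.recv(1024) != b"first\n":
            return False
        time.sleep(idle_seconds)
        try:
            s.sendall(b"second\n")
            return s.recv(1024) == b"second\n"
        except OSError:
            # Covers resets and reply timeouts after the idle period.
            return False
```

Calling `probe_idle(server_ip, port, 35)` from outside the firewall and again from an internal segment gives a simple yes/no comparison. Repeating it with shorter idle times narrows down how long the flow survives without traffic.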

Just did that (installed netcat on the server), and it exhibited *exactly* the same behaviour - connected, worked for 30 seconds, left it idle and it dropped out.

However, as long as I *kept* typing, it worked. As soon as I stopped and left it alone for a few seconds, the connection was dropped - and the time was as little as 3 or 4 seconds.

Interestingly, the server still showed the connection as "active" when I ran "netstat -an" after it had timed out - and I couldn't reconnect until I killed the netcat and restarted it.

The *slightest* idle time in the inbound connection results in it being dropped out - but the connection still shows as established in the server's session table.

Weird!

If you could get a full pcap of this thing (outside of the PA - I mean the "old fashioned" way, by setting up a SPAN port on your switch or using an inline tap) that might help with figuring out what the heck the firewall is doing.

That one might be easier said than done - the server is a VM, and the DMZ is trunked to allow for multiple server access - not to mention being at a DC, not in my office - but if all else fails, I'll give that a go.

I've logged a case with PAN support and provided PCAP's off the device along with a metric buttload of other stuff, so we'll see how that goes.

I have, however, conclusively proven it's the firewall - I moved the server (remember I said it's a VM?) onto an internal VLAN and repeated the netcat test mentioned above - I could leave it for 10 minutes and come back, and the telnet connection to the listening port was still working. Outside - leave it inactive for 5 seconds and it simply lost the flow through the firewall.

Was this issue ever resolved?  Having a similar problem here.

Thanks

Yes - not really satisfactorily, but enough to get it working.

The issue was a timeout *somewhere* - despite multiple packet captures and WebEx sessions we could never figure out *where* - but it appeared related to a custom app/app override I put in for this service.

Eventually, I stripped it all back to absolute basics - created two new services for the required TCP ports, disabled the app override and custom app, and created one rule which allowed any/any but ONLY on the two ports I created - and it worked.

I have *no* idea why it didn't work with the app override - the created app had the two ports, had extended timeouts, the whole deal - but it refused to work.

Hit me up privately if you want more detail (assuming you're using the Unreal streaming solution), and I'll be happy to pass it on.

So an 'any' app rule with specific services defined worked, while a custom app override didn't work? Weird. Really weird.

You're not doing anything like IPS/File blocking/DLP in the rule, are you? I've had that one bite me recently... the traffic from a SQL client could make it to the server, but specific responses from the SQL server were being dropped. That one was weird too.
