I wrote up some of the comments I received from PA earlier (see the post from 6 Jan), which might answer some of your questions.

As I understand it, the app-id cache problem is basically twofold (well, threefold depending on how you count). First, for an existing session that gets offloaded, the app-id cache value is a hint answering "how was this session identified the last time this box saw a packet belonging to this session?". Say, for example, that the app-id cache says "sip". Because of that value, the packet will be sent to the sip decoder. Here is the second problem: the sip decoder did not successfully identify that this packet is no longer a "sip packet". And since PA is an NGFW and not a proxy firewall, if nothing is identified as being wrong with the packet (and no rule says it should be dropped or similar), the packet will be forwarded in its original state. And here comes the third problem (which might not count, since it is outside the scope of the PaloAlto device): the server at dstip will interpret the bad packet on its own and handle it as an http request instead of a sip request.

I think one way to picture this is to imagine using ftp to upload a file which contains a full http header. Now assume that the http header from that file accidentally ends up in a packet of its own. If you only look at that packet by itself, you will only see a full http header. If PA did not have any form of app-id cache, you would most likely get a false positive where the NGFW identifies this particular packet as app-id:web-browsing (or similar) instead of app-id:ftp.

Now, regarding the ports: the PA will initially work (in its flow) as a regular SPI-based firewall, meaning that if srcip, srcport, dstip and dstport are not matched by any rule, the packet will be dropped. If you use "service:any", this check will never drop any packets on account of the port, compared to if you specify "service:application-default" or, for that matter, "service:TCP12345".
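To make the difference between the three service settings concrete, here is a minimal sketch of just the port part of that check. It is not how PAN-OS is implemented: the `Rule` class, the `service_allows` function, the "tcp/12345" notation and the contents of `APP_DEFAULT_PORTS` are all made up for illustration (the real default ports per app-id are whatever Applipedia lists).

```python
from dataclasses import dataclass

# Hypothetical table of default ports per app-id (illustrative values only;
# the authoritative list is whatever Applipedia says for each app-id).
APP_DEFAULT_PORTS = {
    "ssh": {("tcp", 22)},
    "sip": {("tcp", 5060)},
}


@dataclass(frozen=True)
class Rule:
    app: str
    service: str  # "any", "application-default", or e.g. "tcp/12345"


def service_allows(rule: Rule, proto: str, dstport: int) -> bool:
    """Return True if the rule's service setting permits this proto/dstport."""
    if rule.service == "any":
        # service:any never rejects a packet because of its port.
        return True
    if rule.service == "application-default":
        # Only the ports listed as default for the rule's app-id are allowed.
        return (proto, dstport) in APP_DEFAULT_PORTS.get(rule.app, set())
    # Otherwise treat the service as one explicit port, e.g. "tcp/12345".
    svc_proto, svc_port = rule.service.split("/")
    return proto == svc_proto and dstport == int(svc_port)
```

With this sketch, `service_allows(Rule("ssh", "any"), "tcp", 80)` is true, while the same call with `"application-default"` is false because only TCP22 is listed for ssh.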
This means that as long as you specify the dstport (called service in PA), you will in most cases fix the third problem described above. For example, if your rule allows appid:ssh with service:application-default (meaning only TCP22 will be allowed), the server at dstip would also have to understand http arriving on TCP22 for the bad packet to have any effect. This will of course not stop manual evasive behaviour (a user who sets up an http server at home that listens on TCP22 and accepts a few bad initial packets before it finds the http header to act on), but it will stop an attempt to first initiate a sip session towards Facebook's TCP80 (or TCP443) servers if your allowing rule only allows TCP5060, the port given by Applipedia (I was looking at appid:sip).

Apart from the above (knowing which decoder the packet should be sent to, which according to PA gives a performance boost of up to 5%), the app-id cache is also used for heuristics. I assume this means that if the first packet of a session is identified as dns, the second as http and the third as ssh, then there is a signature for this behaviour as well, and the PA could classify the session as app-id:xxx.

There is a description of the new app-id settings found in 5.0.2 (and soon to be in 4.0 and 4.1, as I have been told) that fix the app-id cache pollution.
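For what it is worth, the cache-as-hint behaviour (the ftp upload example and the "sip" pollution case) can be sketched in a few lines. This is only a toy model of my understanding, not PA's actual logic: `classify_by_content`, `classify`, the signature prefixes and the session names are all hypothetical, and the hinted decoder here deliberately never re-checks the packet, which is exactly the second problem.

```python
def classify_by_content(payload: bytes) -> str:
    """Toy per-packet signature match (hypothetical, not PA's signatures)."""
    if payload.startswith((b"GET ", b"POST ", b"HTTP/")):
        return "web-browsing"
    if payload.startswith((b"INVITE ", b"REGISTER ", b"SIP/2.0")):
        return "sip"
    return "unknown"


def classify(session: str, payload: bytes, cache: dict) -> str:
    """Classify a packet, trusting the app-id cache hint when one exists."""
    hinted = cache.get(session)
    if hinted is not None:
        # The packet goes to the hinted decoder. In this toy model that
        # decoder never notices a mismatch, so the hint always wins.
        return hinted
    # No hint yet: fall back to looking at the packet itself and remember it.
    app = classify_by_content(payload)
    cache[session] = app
    return app


# A packet that, looked at in isolation, is nothing but an http header.
http_packet = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Good case: the cache keeps an ftp upload from being mistaken for http.
ftp_cache = {"ftp-upload-session": "ftp"}
ftp_result = classify("ftp-upload-session", http_packet, ftp_cache)

# Bad case: a stale "sip" hint makes a real http request pass as sip.
polluted_cache = {"polluted-session": "sip"}
polluted_result = classify("polluted-session", http_packet, polluted_cache)
```

Here `ftp_result` is "ftp" even though the packet alone would match web-browsing, and `polluted_result` is "sip" for the same reason, which is the pollution case: the packet is forwarded unchanged and the server at dstip is left to treat it as http.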