We are seeing some strange behavior with VoIP traffic.
S/W Version: 4.1.9
VoIP Provider: Foehn IP Telephone systems.
Latest Application version.
A new VoIP system has been deployed which has SIP traffic passing through the PA-2020.
Application override policies are set up for incoming and outgoing SIP traffic, with applications set to unknown-tcp and unknown-udp (as per a PAN document/recommendations). VoIP traffic passes through the PA-2020 and reaches the server on the external side for communication.
Every time there is an internet outage or power loss (which is happening fairly regularly), we lose the connection to the VoIP server, which is expected. After the internet connection is restored, however, I am unable to contact the VoIP server, so no telephone connections are established and no SIP traffic passes through. At the very same time, if I bypass the Palo Alto, everything works absolutely fine.
Strangely, I've noticed that when the PA-2020 is rebooted immediately after the internet connection is restored, SIP traffic passes through the Palo Alto absolutely fine. If the PA is not rebooted, and traffic is left to bypass it for 30 minutes or so, then SIP traffic also works fine once the PA is plugged back in-line.
So every time we lose the internet connection, we have to either reboot the PA-2020 or completely bypass it for SIP traffic, because this is an educational environment and phones are critical. This is a very strange issue for me to troubleshoot.
Can anyone please shed some light on this or share their experiences with SIP traffic?
I still have the case open with support and they are currently researching the issue. I sent them over my techsupport files and they were able to reproduce the problem on their lab setup.
Currently, this is what I have suggested my colleagues do:
1. Every time they see an ISP failover, log in to a CLI session and issue the following:
a. show session all filter application sip
b. show session all filter application unknown-udp
If all your phones are already registered with your provider, they should all show up in the output of these two commands. It would be nice to have the firewall accurately detect the SIP traffic instead of classifying it as unknown-udp.
2. To clear the sessions, all you have to do is issue:
a. clear session all filter application sip
b. clear session all filter application unknown-udp
Regarding performance: this will depend heavily on the number of phones you have. Simply put, the firewall would have to process approximately 'N' new sessions every 'X' seconds, where 'N' is the total number of phones you have and 'X' is the SIP registration interval. I wouldn't try to do this either, since again it is not a scalable solution.
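To put rough numbers on the 'N' sessions every 'X' seconds estimate, here is a minimal Python sketch. The function name and the phone counts/intervals are hypothetical values chosen for illustration, not measurements from any deployment.

```python
def new_sessions_per_second(num_phones: int, registration_interval_s: float) -> float:
    """Average new-session rate if every phone re-registers once per interval."""
    if registration_interval_s <= 0:
        raise ValueError("registration interval must be positive")
    return num_phones / registration_interval_s


if __name__ == "__main__":
    # Hypothetical phone counts and SIP registration intervals (seconds).
    for phones, interval in [(50, 60), (500, 60), (500, 15)]:
        rate = new_sessions_per_second(phones, interval)
        print(f"{phones} phones, {interval}s interval -> {rate:.2f} new sessions/s")
```

The load grows linearly with the number of phones and inversely with the registration interval, which is why this does not scale well on larger sites.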
I am trying to tackle this from a different perspective. Not all customers who have VoIP phones and use PAN as their firewall are having this issue (if everyone were, there would be an uproar). So it must be something very specific tied to your VoIP provider or your phone manufacturer.
For example, we use Polycom phones with Vocalocity as our provider. Using Wireshark I observed that, when the phone is not currently on a call, a SIP registration packet is sent out approximately every 15 seconds. I also monitored this phone's session on the firewall itself and confirmed that the TTL is reset every 15 seconds approximately. Now, I logged in to the phone, and I have an option called NAT keep-alive under network settings that is set to 15 seconds. So, I am currently working with my VoIP provider to see if we can make changes to our phone configuration packages.
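One way to reason about the observation above: if each keep-alive packet resets the session's idle timer, a session only ages out when the gap between packets exceeds the timeout. The sketch below models that in Python. The 30-second idle timeout, the 45-second comparison interval, and the function name are assumptions for illustration; check the actual timeout configured on your firewall.

```python
from typing import Iterable, Optional


def expiry_time(packet_times_s: Iterable[float],
                idle_timeout_s: float,
                horizon_s: float) -> Optional[float]:
    """Return when the session would idle out, or None if it survives to horizon_s.

    Each packet resets the idle timer, matching the TTL reset seen on the firewall.
    """
    deadline = None
    for t in sorted(packet_times_s):
        if deadline is not None and t >= deadline:
            return deadline  # the timer ran out before this packet arrived
        deadline = t + idle_timeout_s
    if deadline is not None and deadline <= horizon_s:
        return deadline
    return None


if __name__ == "__main__":
    IDLE_TIMEOUT_S = 30.0  # assumed value, not taken from this thread
    keepalive_15 = [i * 15.0 for i in range(21)]  # packets at 0, 15, ..., 300
    keepalive_45 = [i * 45.0 for i in range(7)]   # packets at 0, 45, ..., 270
    print(expiry_time(keepalive_15, IDLE_TIMEOUT_S, 300.0))  # None: never idles out
    print(expiry_time(keepalive_45, IDLE_TIMEOUT_S, 300.0))  # 30.0
```

Under these assumptions, a 15-second keep-alive keeps the same session alive indefinitely, which may be why a stale session survives a failover unless it is cleared by hand.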
However, at the end of the day, PAN should clear the sessions if there is a failover. I cannot think of a reason why the firewall would not want to do that. I hope we will see a solution to this in the next release scheduled in February.
I have some news (bad or good, it depends).
My problem was finally recognized. In short:
There is a certain counter 'ctd_tdb_changed' that can be triggered during content / AV upgrade which will cause long lived SIP sessions to switch from 'layer7 processing : enabled' to 'layer7 processing : completed'. This can be viewed in 'show session id x' output for the sip session.
Once the SIP session is 'completed', the ALG/predict sessions will not function properly.
However, it may be fixed in PAN-OS 6.x (according to current information).
Please ask your local Sales SE to push for this fix to be made available in 5.0.x as well.
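If you save the text of 'show session id x' for your SIP sessions, a small script can flag the ones that have flipped to 'completed'. This is only a sketch: it assumes the output contains a line shaped like the 'layer7 processing : completed' string quoted above, and the function names and sample text are hypothetical.

```python
import re
from typing import Optional

# Matches the state on a line such as "layer7 processing : completed".
_L7_RE = re.compile(r"layer7 processing\s*:\s*(\w+)", re.IGNORECASE)


def layer7_state(show_session_output: str) -> Optional[str]:
    """Return the layer7 processing state (lowercased), or None if not found."""
    match = _L7_RE.search(show_session_output)
    return match.group(1).lower() if match else None


def alg_at_risk(show_session_output: str) -> bool:
    """True if the session reports 'completed', the state described as breaking ALG/predict."""
    return layer7_state(show_session_output) == "completed"


if __name__ == "__main__":
    # Hypothetical one-line excerpt; real output contains many more fields.
    sample = "        layer7 processing                    : completed\n"
    print(layer7_state(sample), alg_at_risk(sample))
```

Sessions flagged this way are the candidates to clear with the 'clear session' commands mentioned earlier in the thread.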