SIP Registration Timeout Settings for High Availability

When setting up a highly available telephony system, most people focus on the back end and make sure it behaves as they expect and require.  However, one highly visible user-facing issue I have seen is misconfigured registration timeouts on the connected SIP phones.  When these are set very high, the phone may not notice that the service has moved (via IP/DNS/etc. changes) during an HA switchover, and it can miss incoming calls until it registers again.  An outgoing call attempt will typically work, or at the very least trigger a registration attempt to the server the service has moved to.

For example, take a look at Aastra: the defaults on a few models I’ve seen are half an hour for a failed registration.

[Screenshot: Aastra default configuration, registration failed retry timer]

If the failed registration timeout is half an hour and a re-registration attempt fails, the phone will show an error or an unregistered status for the next half hour.  This can happen when the registration arrives just as a box is failing, or when a failover has happened and the configuration is still being written/updated as part of the switchover process.  A more reasonable set of values is shown in the following.

[Screenshot: Aastra configuration with lowered registration failed retry timers]

In this one I’ve lowered both the registration failed retry timer and the registration timeout retry timer.  With these values the SIP phone resolves a registration issue more quickly, because it retries more often than the defaults allow.  They could be lower still, depending on the situation.
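To put rough numbers on the difference, here is a minimal sketch of how long a phone stays unregistered after one failed attempt. It is a simplified model, not Aastra’s actual retry logic: it assumes the phone retries at a fixed interval and that the first retry landing after the service is back succeeds. The function name and the 60 and 10 second values are hypothetical, chosen only for illustration; the 1800 seconds is the half-hour default mentioned above.

```python
def unregistered_seconds(fail_offset: int, retry_timer: int, outage: int) -> int:
    """Seconds the phone shows as unregistered, counted from its first failed attempt.

    fail_offset: seconds into the outage at which the registration attempt failed
    retry_timer: seconds the phone waits between retries (assumed fixed)
    outage:      total seconds the service was unavailable during the switchover
    """
    if retry_timer <= 0:
        raise ValueError("retry_timer must be positive")
    if not 0 <= fail_offset < outage:
        raise ValueError("the failed attempt must fall inside the outage")
    remaining = outage - fail_offset          # outage time left after the failure
    retries = -(-remaining // retry_timer)    # integer ceiling division
    return retries * retry_timer              # the last retry is the one that succeeds


# A 20 second switchover, with the registration failing 5 seconds into it.
for timer in (1800, 60, 10):
    print(f"retry timer {timer:>4}s -> unregistered for {unregistered_seconds(5, timer, 20)}s")
```

Under these assumptions the length of the outage barely matters with the default timer: the phone sits in an error state for the full half hour even though the service was back within seconds.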

One precaution before everyone sets these very low.  Choose the values carefully when the SIP phone is off-site and there are protections in place, for example Fail2Ban, to block brute-force attacks.  Where the SIP phone is an app on a mobile device, the failed registration timeout should be set high enough that it does not trigger a lockout of a valid device.  If the devices are in-house or their IPs can be whitelisted, the values can be lower without worry.
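As a rough way to sanity-check a retry timer against a ban rule, the sketch below counts how many failed attempts can land inside one detection window. It uses Fail2Ban’s `findtime` and `maxretry` jail settings, but the function names and the example numbers are hypothetical, and it assumes each failed registration produces exactly one matching log line. Some servers log more than one line per attempt, so treat the result as optimistic.

```python
def failures_in_window(retry_timer: int, findtime: int) -> int:
    """Upper bound on failed attempts inside any findtime-second window
    when the phone retries every retry_timer seconds."""
    return findtime // retry_timer + 1


def would_trigger_ban(retry_timer: int, findtime: int, maxretry: int) -> bool:
    """True if repeated failures at this retry interval could reach maxretry."""
    return failures_in_window(retry_timer, findtime) >= maxretry


def min_safe_retry_timer(findtime: int, maxretry: int) -> int:
    """Smallest retry interval (seconds) that keeps failures below maxretry."""
    if maxretry < 2:
        raise ValueError("with maxretry below 2, any single failure triggers a ban")
    return findtime // (maxretry - 1) + 1


# Hypothetical jail: ban after 5 failures within 600 seconds.
print(would_trigger_ban(60, 600, 5))     # retrying every minute
print(would_trigger_ban(1800, 600, 5))   # retrying every half hour
print(min_safe_retry_timer(600, 5))      # lowest interval that stays under the limit
```

The point is the trade-off described above: the same low timer that recovers quickly after a switchover can look like a brute-force attempt when a phone keeps failing from an off-site address.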

2 Replies to “SIP Registration Timeout Settings for High Availability”

  1. In a few cases where it’s been infeasible to modify the settings on each phone, I’ve set qualify=yes on the extensions. That seems to work so far. Which method would be preferred, or are we indifferent, or is it fine to do both?

    1. Setting qualify=yes on the server enables periodic SIP OPTIONS requests that check whether the phone is online and reachable. It’s useful for mobile devices (a SIP phone on a cellphone or laptop, for example) and I would certainly recommend it in these cases. If the phone does not respond in time, it can, depending on configuration, be marked as unreachable on the server side.

      The Failed Retry timeout on the phone will still take effect if a registration/renewal attempt fails to reach a server. This could happen as service is switched to another server in an HA system due to failure, maintenance, etc. It will not happen to every phone on each switchover, but the larger the system, the higher the odds of a few phones catching it and showing an error.

      The two are related: the qualify setting is the server checking on the phone, whereas the phone settings from the article are the phone checking on the server. That means it’s fine to do both, and doing both is often required.
