SIP Phone Configuration the Easy Way

For large installations, configuring all the VoIP phones can be a pain. Perhaps you have Polycom SIP phones, which I have seen delay starting the web configuration interface until after the phone is ready to handle calls. This is great when the phone is in use, as the user can use the phone sooner without waiting on the processes that provide the configuration interface. When manually configuring a large number of phones, however, it causes delays. Even when the config interface comes up fast, it’s still a labour-intensive task to go through each phone and configure it by hand. During this manual process mistakes can and will be made, especially given a large enough set of phones, and some of them may not be obvious at first. For example, what if the DTMF mode of the 47th phone configured today is left at the default of RFC2833 but the system is using SIP INFO? That phone gets used for a while, but then the user goes to use an IVR, finds that DTMF isn’t functioning as expected, and has to file a trouble ticket to get it resolved.

The solution to this manual headache and time drain? Provisioning. This is where the phone system holds the configuration for each phone, and the phones download their configuration so they stay in sync with the phone system’s options.

Provisioning is usually accomplished via a TFTP server where the phones’ configuration files reside. At boot the phone gets an IP address from the local DHCP server, and the DHCP response carries an extra configuration setting, usually Option 66, to tell the phone where the TFTP server is. From the TFTP server the phone will generally request a general model configuration file first. This file can handle firmware upgrades and common settings specific to a model or make of phone. After that, a request is usually made for a file named after the phone’s MAC ID, which is what differentiates the phones. This MAC ID file is specific to that individual phone and contains the details of the SIP connections, extensions, and anything else meant just for that phone.
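To make the request order concrete, here is a minimal Python sketch of which files a phone might ask the TFTP server for. The file names (model.cfg and a lowercase MAC followed by .cfg) are assumptions for illustration only; each manufacturer has its own naming scheme, so check the phone's documentation.

```python
import re
from typing import List


def normalize_mac(mac: str) -> str:
    """Drop common separators and lowercase a MAC address."""
    digits = re.sub(r"[:.\-]", "", mac.strip())
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return digits.lower()


def provisioning_requests(mac: str, model_file: str = "model.cfg") -> List[str]:
    """Return the files a phone would request, in order.

    The model-wide file comes first, then the per-phone file named
    after the MAC ID. Both names are hypothetical placeholders.
    """
    return [model_file, f"{normalize_mac(mac)}.cfg"]


if __name__ == "__main__":
    for name in provisioning_requests("00:04:F2:AB:CD:EF"):
        print(name)
```

Normalizing the MAC up front matters in practice, because a file named with the wrong case or with separators left in will simply never be found by the phone.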

The Q-Suite supports provisioning a number of SIP phones from various manufacturers, with easy support via templating to extend to new models as needed. The administrator only needs to collect and input the MAC IDs and choose the proper template when configuring the extensions, and the generation of the config files will be done for them. With DHCP Option 66 set properly, boot up the phones and they’ll be functional and making calls.
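The general idea of template-driven generation can be sketched in a few lines of Python. This is not the Q-Suite's actual implementation or file format; the template keys, the render_configs helper and the pbx.example.com host are all made up for illustration.

```python
from string import Template

# Hypothetical template. Real phones each have their own config syntax,
# so a real system would keep one template per make or model.
PHONE_TEMPLATE = Template(
    "sip_server=$server\n"
    "extension=$extension\n"
    "password=$password\n"
    "dtmf_mode=$dtmf_mode\n"
)


def render_configs(phones, server, dtmf_mode="info"):
    """Map each phone's per-MAC file name to its rendered config text."""
    configs = {}
    for phone in phones:
        mac = phone["mac"].replace(":", "").replace("-", "").lower()
        configs[f"{mac}.cfg"] = PHONE_TEMPLATE.substitute(
            server=server,
            extension=phone["extension"],
            password=phone["password"],
            dtmf_mode=dtmf_mode,
        )
    return configs


if __name__ == "__main__":
    phones = [
        {"mac": "00:04:F2:AB:CD:EF", "extension": "1001", "password": "s3cret"},
    ]
    for name, text in render_configs(phones, "pbx.example.com").items():
        print(name)
        print(text)
```

Because the DTMF mode is set once for every generated file, the "47th phone left at the wrong default" kind of mistake cannot creep in one handset at a time.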

Which Telephony Interface for your Solution: VoIP Gateway VS PCI card?

What is your PBX or CTI solution without an interface to the outside world? One can use VoIP, but there are still a lot of systems going the traditional route and connecting to a telco via either Analog (POTS) or a T1/PRI/E1. With Asterisk the two main ways to do this are an internal PCI card or an external Gateway device. Both options will make and receive calls from the telco, but which one is better?

I’ll cut to the chase and say PCI cards should not be recommended for High Availability solutions. They can still have their place in a system without HA and where costs are a major factor, but the decision to use them should be made with their limitations in mind.

Using a Gateway device, such as Patton or Audiocodes, provides the following benefits over an internal card:

  • Multiple telephony servers can connect to a single gateway, which is important for the next two items.
  • With multiple servers connected to a single gateway in an HA solution, calls will be routed to the active server(s).
  • Load balancing can be done at the gateway level in high-volume centers to distribute calls across servers.
  • Independence from a single server. If a specific server needs to be rebooted or taken offline for maintenance, the gateway will keep working.
  • The location of the telco demarc can be independent of the telephony system. It can be in a different room, floor, building or even country; just be careful of lag causing issues. Given the proper connections, this can allow moving the IP PBX system into the cloud while still supporting traditional telco trunks.
  • In a mixed trunk environment of VoIP and traditional telco connections, the Gateway can abstract this so the IP PBX’s configuration is similar for all trunks.
  • Scaling up only requires adding a new gateway and a configuration change to the telephony system, which minimizes downtime and risk, mainly by avoiding the need to open the system to install new cards.
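The routing and load balancing points above can be pictured with a toy model in Python. It is only a sketch of the idea: real gateways from Patton, Audiocodes and others are configured through their own interfaces, and the Gateway class and server names here are hypothetical.

```python
class Gateway:
    """Toy model of a gateway spreading calls over healthy servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        """Stop sending calls to a server (failure or maintenance)."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Resume sending calls to a known server."""
        if server in self.servers:
            self.healthy.add(server)

    def route(self):
        """Return the next healthy server in round-robin order."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy telephony server available")


if __name__ == "__main__":
    gw = Gateway(["pbx1", "pbx2", "pbx3"])
    gw.mark_down("pbx2")  # e.g. rebooting pbx2 for maintenance
    print([gw.route() for _ in range(4)])
```

The point of the sketch is that the telco side never changes: taking pbx2 offline only changes which server the gateway hands the next call to.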

Considering the above it is hard to see the case for a PCI card, especially in an HA solution.  They may still have their place elsewhere but I’ll be recommending a VoIP Gateway going forward.

The Challenge of VoIP System Failures Not Addressed by Most High Availability Designs

Hardware or software can fail at any time and induce a system failure. It is not possible to reduce such failures to nil. When VoIP-based systems experience such failures, the result is the loss of on-going calls. High availability (HA) or redundant systems cannot address this unless they are capable of restoring an on-going call without either of the end-points re-initiating the call. Most high availability systems for Session Initiation Protocol (SIP) based VoIP calls, and their redundancy setups, deploy an immediate replacement of the failed component or sub-system to allow continued use of the system. That is good enough for many situations, but it might not be adequate for mission-critical applications when the HA solution cannot restore on-going calls.

Imagine a scenario where an outside caller initiates a call and it hits the demarcation point of the contact center installation. This could be a premise-based contact center or a Cloud setup offering virtual contact center services. Once the call setup reaches the intended peer and the conversation starts, it is possible that your system, whether Cloud-based or on-premise, could experience a failure. When the system detects the failure, its high availability and redundant setup will kick in and the system will be ready for future calls, but what happens to the on-going calls? They just die. This is the normal operating mode of traditional high availability systems, including most high availability solutions offered for Asterisk. The issue becomes more critical for large contact centers using automatic call distribution (ACD), which carry significant traffic at any given time.

With contact center ACD, going beyond traditional high availability is extremely important. Having the capability to keep calls alive through call survival is critical, because it allows the user to continue the phone conversation without re-initiating the call. It is a level of sophistication in redundancy that goes beyond recognizing the need to bring replacement software and hardware components into action. It adds the intelligence required to preserve all the on-going calls, which is essential for mission-critical systems.

SIP Registration Timeout Settings for High Availability

In setting up a highly available telephony system, most people worry about the back end and ensure it functions as they expect and require. However, one highly visible user issue I have seen is a misconfiguration of the connected SIP phones with regard to the registration timeouts. When these are very high on your SIP phone, it may not notice that a service has moved (via IP/DNS/etc. changes) due to an HA switchover, and it can potentially miss incoming calls until it does. Typically an outgoing call attempt will work, or at the very least cause a registration attempt to the new server the service has moved to.

For example, take a look at Aastra. Their defaults in a few models I’ve seen are set to half an hour for a failed registration.

[Image: Aastra default configuration, registration failed retry setting]

If the failed registration timeout is half an hour and the phone attempts to re-register and fails, your phone will show an error or appear unregistered for the next half hour. This can happen when the registration comes in as a box is failing, or when a failover has happened and the configuration is still being written or updated as part of the switchover process. A more reasonable set of values is shown in the following.

[Image: Aastra configuration with lowered registration failed retry setting]

In this one I’ve lowered the registration failed retry and also the timeout retry timers.  These will make the SIP phone resolve the registration issue quicker by retrying more often than the defaults.  They could be lower depending on the situation.
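To put rough numbers on the difference, here is a small Python sketch that works out when a phone gets back in after one failed registration during a switchover. It assumes a simplified model where the phone retries at a fixed interval and the first retry after the service is ready succeeds. The 1800-second value is the half-hour default mentioned above; the 30-second value and the 20-second switchover are assumptions for illustration, not recommended settings.

```python
def reregistered_at(first_failure_at, service_ready_at, retry_interval):
    """Return the time (in seconds) of the first retry that can succeed.

    The phone fails once at first_failure_at and then retries every
    retry_interval seconds; a retry succeeds once the service is ready.
    """
    if retry_interval <= 0:
        raise ValueError("retry_interval must be positive")
    if service_ready_at <= first_failure_at:
        return first_failure_at
    gap = service_ready_at - first_failure_at
    retries_needed = -(-gap // retry_interval)  # ceiling division on integers
    return first_failure_at + retries_needed * retry_interval


if __name__ == "__main__":
    # Registration fails at t=0; the new server is ready 20 seconds later.
    for interval in (1800, 30):
        t = reregistered_at(0, 20, interval)
        print(f"retry every {interval}s -> registered again at t={t}s")
```

In this model a 20 second switchover costs the phone a full half hour with the default, against half a minute with the shorter retry.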

One precaution before everyone sets these very low: the values need to be chosen with care when the SIP phone is off-site and there are protections in place, for example Fail2Ban, to block brute-force attacks. In cases where the SIP phone is an app on a mobile device, the failed registration timeout should be set high enough to not trigger a lockout of a valid device. If the devices are in-house or their IPs can be whitelisted, then the values can be lower without worry.
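A quick way to sanity check a retry value against a Fail2Ban jail is sketched below in Python. Fail2Ban bans a host once it has produced maxretry failures within findtime seconds. The model here is deliberately conservative and simplified: it assumes every retry fails and logs exactly one matching line, which may not hold for your SIP server's logs, and the example numbers are made up.

```python
def would_trigger_ban(retry_interval, findtime, maxretry):
    """True if failures spaced retry_interval apart can reach maxretry
    inside one findtime window (all values in seconds or counts)."""
    if retry_interval <= 0:
        raise ValueError("retry_interval must be positive")
    failures_in_window = findtime // retry_interval + 1
    return failures_in_window >= maxretry


def smallest_safe_interval(findtime, maxretry):
    """Smallest whole-second retry interval that stays under maxretry."""
    if maxretry <= 1:
        raise ValueError("with maxretry <= 1 any single failure bans")
    return findtime // (maxretry - 1) + 1


if __name__ == "__main__":
    # Hypothetical jail: 5 failures within 10 minutes leads to a ban.
    print(would_trigger_ban(30, findtime=600, maxretry=5))
    print(would_trigger_ban(300, findtime=600, maxretry=5))
    print(smallest_safe_interval(findtime=600, maxretry=5))
```

If a single failed registration produces more than one matching log line on your system, the safe interval needs to be scaled up accordingly.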