When adding capacity to an Asterisk-based ACD (Automatic Call Distributor) system, the goal is to increase throughput linearly. Choosing call center software that allows servers to be added for extra capacity is an essential step. However, you must also take steps to ensure that individual servers don’t become a performance choke point. This is where load balancing comes in.
When scaling call center software, there are a few limitations that can stop you in your tracks if you are not watching for them. Sometimes the effect is confined to a local system. Sometimes the effect is in the design of the entire installation.
Some examples of local system issues are:
- Limitations on I/O operations – this can be due to limitations on network traffic, hard disk read/write/seek speeds, or the sheer amount of data that can be moved per second
- CPU usage – on Asterisk this is usually due to transcoding, mixing audio, and similar processing. In virtual environments especially, the CPU may have extra work to do because there is no physical hardware available to offload data handling onto.
- Recordings – here the limiting factor is usually disk speed, although the speed at which Asterisk can mix audio and convert it to another audio format for recording can also be a bottleneck.
- Virtualization software – the type of virtualization in use can introduce performance chokepoints. For instance, Xen (which is used by Amazon) had a history of not allowing network interrupts to be shared across CPU cores, potentially pushing too much work onto one core and limiting the amount of work that can be done while other processing capacity sits idle. Other types of virtualization don’t take full advantage of hardware capacity. There are several other ways performance can be limited, but that should probably be its own series of posts.
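For the recording case above, a quick sanity check is to measure sequential disk write throughput directly on the server. This is an illustrative, standard-library-only sketch (the function name and default sizes are my own, not from any Asterisk tooling); it writes a temporary file with `fsync` after each chunk so the OS page cache doesn’t mask the real disk speed.

```python
import os
import tempfile
import time

def disk_write_throughput_mb_s(path=None, total_mb=64, chunk_mb=4):
    """Write total_mb of zeros in chunk_mb pieces, fsyncing each chunk,
    and return the observed sequential write throughput in MB/s."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    fd, tmp = tempfile.mkstemp(dir=path)
    try:
        start = time.monotonic()
        with os.fdopen(fd, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                f.write(chunk)
                f.flush()
                os.fsync(f.fileno())  # force the data to disk
        elapsed = time.monotonic() - start
        return total_mb / elapsed
    finally:
        os.remove(tmp)

print(f"Sequential write: {disk_write_throughput_mb_s():.1f} MB/s")
```

If the measured rate is close to the combined bitrate of your concurrent recordings, the disk, not Asterisk, is the chokepoint.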
No matter how well tuned a single server is, it still has finite capacity. At some point, one of the above categories is likely to be pushed to its maximum if servers aren’t added.
You can have all of your servers well-tuned and ready for peak performance and still not get full performance from your system. A common cause is a design in which one server becomes the limiting factor in capacity. Some examples we’ve seen of this are:
- Too many agents assigned to an individual server, which often results in too high a percentage of calls being routed through that particular server.
- A high-capacity trunk being set to register to a single server. If all calls must go through a single server, then the capacity of that server is your choke point.
- Too many users overall. For web servers, it’s sometimes the case that administrative users pull recordings and reports from a single server that is also shared by call center agents. If they download multiple recordings at a time, they can choke off system resources quite quickly.
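The first design issue above, too many agents on one server, is avoided by keeping the per-server agent count as even as possible. A minimal sketch of that idea, with hypothetical server and agent names, is a greedy assignment that always places the next agent on the least-loaded server:

```python
from collections import defaultdict

def balance_agents(agents, servers):
    """Greedily assign each agent to the server that currently has the
    fewest agents, keeping every server within one agent of even load."""
    assignments = defaultdict(list)
    for agent in agents:
        target = min(servers, key=lambda s: len(assignments[s]))
        assignments[target].append(agent)
    return dict(assignments)

# Hypothetical names for illustration only.
layout = balance_agents([f"agent{i}" for i in range(10)],
                        ["ast1", "ast2", "ast3"])
for server, agents in sorted(layout.items()):
    print(server, len(agents))
```

With 10 agents over 3 servers, no server ends up with more than 4 agents, so no single box carries a disproportionate share of the call load.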
For these reasons, we encourage evenly distributing workload through load balancing. For calls, Q-Suite offers a High Availability SIP proxy that lets the contact center specify which Asterisk servers calls can be load balanced across. When calls come in, a round-robin method is used to deliver them in a balanced manner. If a particular server is expected to carry a higher workload for other reasons, such as being shared among multiple services or acting as a registration server, it can be excluded from load balancing. Q-Suite also offers load balancing for agent web services.
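The round-robin-with-exclusion behavior described above can be sketched in a few lines. This is not Q-Suite’s implementation, just an illustration of the dispatch logic, with hypothetical server names: cycle through the eligible pool, skipping any server that has been excluded (for example, a dedicated registration server).

```python
import itertools

class RoundRobinDispatcher:
    """Cycle through a pool of Asterisk servers in round-robin order,
    skipping any servers marked as excluded from load balancing."""

    def __init__(self, servers, excluded=()):
        self.pool = [s for s in servers if s not in set(excluded)]
        if not self.pool:
            raise ValueError("no servers available for load balancing")
        self._cycle = itertools.cycle(self.pool)

    def next_server(self):
        """Return the server that should receive the next incoming call."""
        return next(self._cycle)

# "reg1" is a hypothetical registration server kept out of the rotation.
dispatcher = RoundRobinDispatcher(
    ["ast1", "ast2", "ast3", "reg1"], excluded=["reg1"])
print([dispatcher.next_server() for _ in range(6)])
# → ['ast1', 'ast2', 'ast3', 'ast1', 'ast2', 'ast3']
```

In production the dispatch decision would live in the SIP proxy rather than application code, but the effect is the same: each eligible server receives an equal share of new calls.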
In general, load balancing can prevent system design issues, because some of those routing decisions are made as calls come in. It can also cover shortcomings in local system capacity until they can be resolved, by spreading the load over multiple similar systems. In this way, additional servers can be provisioned in a cloud contact center environment so production continues without pushing any single system past its limits. In either case, load balancing lets your system reach its full potential.