Trust, But Verify: Simple Steps to Finding Your IVR Issues

The call came first thing in the morning.  Agents were logged in, callers could dial in, but callers were not getting connected to agents.  A foundational tenet of skills-based call routing is that calls get routed to agents, so the team leapt into action.  As support techs logged in and began gathering PCAPs (Packet Captures) and poring over logs to discover the cause, we also began test calls into their IVR (Interactive Voice Response).

One diagnostic technique that can occasionally clear up issues with an IVR, queue or agents on the floor is the test call.  I am occasionally surprised when a single test call reveals the smoking gun, and frequently surprised at how many call center managers and supervisors are so resistant to picking up an extension and dialing a number on their own system.

Here are a few cases I’ve seen that were quickly resolved with a test call:

* A newly launched center called to report that agents were receiving “ghost calls.”  That’s an annoyingly non-specific description that can cover a number of cases.  In this case, the agent would receive a call but could not hear the client.  They would disposition the call and move on, usually receiving a few ghost calls in a row.  With the administrators all conferenced in, we made a test call.  We could hear the agent, but the agent could not hear us.  When somebody was sent to the floor, it turned out that agents were sometimes accidentally turning down their headset volume while executing the transfer function.  Agents received instruction on this, and the problem did not recur.

* A new IVR was not setting the correct information for the agent to retrieve when the call arrived.  Senior users of the system pored over the IVR for far too long trying to find the error in the dialplan.  When we called into the assigned DID (Direct Inward Dial), we found that the IVR they thought they were using was not the IVR actually being hit.  A last-minute change had sent the DID to a different place.  Once that was known, the logic error in the IVR was quickly found and rectified.

* A very complex IVR had been copied and modified for a similar purpose.  After a few hours in production, the center noticed they weren’t getting any calls into a particular queue.  When we were asked to look into it, a single test call revealed that calls meeting the most common criteria were being directed to another queue entirely.  We discovered this by asking the agent what queue we had been routed to.  The error was then quickly rectified.

* A center complained that they were not getting calls into a particular queue, even though it was one of their busiest.  We called the assigned toll-free number and reached another call center.  The center’s own client controlled the DIDs at the telco level and had rerouted some of them because of issues upstream from our client’s call center.  Nobody had bothered to notify our client, but once the telco issues were resolved, calls were directed back to them.
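
The misrouting cases above share a pattern: the call landed somewhere other than where everyone assumed it would.  One way to make that visible is to write down what each test call was expected to reach and what it actually reached.  The following is only a minimal sketch in Python; the `CallResult` record, the `find_misroutes` helper, the phone numbers and the queue names are all hypothetical and are not part of any particular call center platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CallResult:
    """One manual test call: what was dialed and where it ended up (hypothetical record)."""
    dialed: str          # number the tester dialed
    expected_queue: str  # queue the call was supposed to reach
    observed_queue: str  # queue the answering agent reported


def find_misroutes(results: List[CallResult]) -> List[CallResult]:
    """Return the test calls whose observed queue differs from the expected one."""
    return [
        r for r in results
        # Compare case-insensitively and ignore stray whitespace in hand-typed notes.
        if r.observed_queue.strip().lower() != r.expected_queue.strip().lower()
    ]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    log = [
        CallResult("8005550100", "Sales", "Sales"),
        CallResult("8005550101", "Support", "Billing"),
    ]
    for miss in find_misroutes(log):
        print(f"{miss.dialed}: expected {miss.expected_queue}, got {miss.observed_queue}")
```

The observed queue in a log like this would come from the same place it did in the cases above: asking the agent who answers which queue the call arrived on.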

Other issues I have seen at various sites that are easily tested with a call:

* Poor music-on-hold quality.  Sometimes it sounds like you’re listening to static.
* Toll-free or other numbers not coming in on the expected DID.
* Numbers coming in on the expected DID, but prepended with a + or 1.
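
The last item is easy to check once you know what digits actually arrived on the test call.  Here is a rough sketch in Python of comparing a received number against the expected DID while tolerating a leading + or 1.  It assumes 10-digit North American numbers with country code 1; `normalize_did` and `matches_expected` are hypothetical helper names, and the sample numbers are made up.

```python
import re


def normalize_did(raw: str, country_code: str = "1", national_length: int = 10) -> str:
    """Reduce a dialed-number string to bare national digits.

    Assumption: 10-digit national numbers with country code "1".
    """
    # Drop everything that is not a digit: "+", spaces, dashes, parentheses.
    digits = re.sub(r"\D", "", raw)
    # Strip the country code only when the total length shows it is really there.
    if len(digits) == national_length + len(country_code) and digits.startswith(country_code):
        digits = digits[len(country_code):]
    return digits


def matches_expected(received: str, expected: str) -> bool:
    """True when both strings refer to the same national number."""
    return normalize_did(received) == normalize_did(expected)


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    for received in ("+18005550100", "18005550100", "8005550100"):
        print(received, matches_expected(received, "8005550100"))
```

If the comparison only passes after normalization, the digits are arriving in a different format than the dialplan was written for, which is the kind of mismatch a test call brings to the surface.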

Quite honestly, there are going to be any number of problems that can be diagnosed with a test call, and it’s a mystery to me why more call centers don’t do semi-regular call tests.  You can uncover issues and annoyances that may be affecting your SLA (Service Level Agreement) and abandon rates.

Back to the case I opened with: it was becoming clear that the number we were dialing did reach an IVR with similar periodic messages and music on hold, but our calls were not showing up in the captured packets or in the logs.  That could have indicated a problem with logging itself, but the smoking gun came when Asterisk was restarted on the system and my 15-minute call was not disconnected.  Relaying that information to the client allowed them to determine that an old clone had become active and was somehow sitting on the IP address the inbound trunk was connecting to.  With agents on one system and calls coming into another that the Q-Suite was unaware of, there was no chance of calls reaching agents.  The client’s IT department quickly rectified the issue, and they were back in business before their peak call times.