Quantitative evaluation is a vital part of an AI project, but there is little to no standardization on the topic today. In this post, we will look at the ways Conversational AI can be measured, how the approach varies between different roles and users, and what to do with all of this information.
Why measure the performance of a virtual assistant (VA)?
Developing a solution based on Conversational AI can introduce a lot of uncertainty when starting out. The early days of a VA can even be characterized as controlled chaos. One of the easiest ways to get your project onto solid ground right away is to track its performance and follow the progress over time. This data can then be analysed and applied to everyday work, so that the evolving artificial organism called a virtual assistant can adapt to changing surroundings.
The many sides of Conversational AI
Typically, there are many parties with unique interests and roles involved in a Conversational AI project. It is important to offer a smooth experience and manage the expectations of as many parties as possible, but failing to deliver to every side of the equation right away should not mean that the project is a failure. When starting out with a VA, it is important to clearly specify its main goal. If you are not sure, think about your business KPIs, so that the initial work is directed towards your main goals.
For example, the head of customer service evaluates the AI by its capability to make customers happy while decreasing the workload on their customer support team. The customer support agents on that team see the value in their artificial colleague when it enables them to save time and focus on more meaningful tasks.
The finance department reasons through numerical data – how much money went into the project and how much money is saved or made from it. In addition to that, there can be salespeople involved. They might measure success by the number of leads and closed deals.
And on top of that, there are the main users of the VA – the customers. Customers evaluate the bot by their experience and its overall helpfulness. This is ultimately one of the harshest but most useful metrics of all.
What should you measure?
There is by no means a “one size fits all” solution for every VA project. However, some metrics are more commonly tracked than others. Getting started with them will give your virtual assistant project a solid analytics foundation.
Natural Language Understanding (NLU) rate – Measures the algorithm’s ability to match user messages with the correct intent. When starting out, the AI does not know much about your business, so keeping an eye on this number right from the start shows how the AI adapts to your topics. After some time, one can expect an approximate NLU rate of 80%.
Autonomous solve rate – Measures the VA’s ability to solve user enquiries on its own. After answering, the VA asks users to confirm whether they found the answer useful. This metric illustrates the virtual assistant’s ability to replace the human workforce. As previously mentioned, customer input is often one of the harshest but most useful metrics for the virtual assistant. Getting the autonomous solve rate up means that you are doing a good job of improving the user experience of your most valuable asset – the customers. More sophisticated virtual assistants with numerous options can autonomously solve about 20% of all user enquiries, but with fewer variables in the mix (meaning fewer but more focused options for the user), the solve rate can rise even higher.
User satisfaction with the VA – A different way of finding out whether the end user is happy with the solution or whether something needs to be improved. When the conversation with the VA is over, the user is asked to leave a rating and a comment. This metric can vary a lot, as customer feedback might be fuelled more by momentary emotions than by actual arguments. In general, if the solution is solid, 75% of users rate their chat experience 5 out of 5 stars. The key to finding improvements in a case like this is to go through the feedback of the remaining users, whose ratings were lower.
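To make the three metrics above more concrete, here is a minimal sketch of how they could be computed from conversation logs. The record layout and all field names are hypothetical, and the definitions are our assumptions (for example, counting an enquiry as autonomously solved only when no human took over and the user confirmed the answer was useful). Adapt both to whatever your own platform actually logs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Conversation:
    # All field names are hypothetical; adapt them to your own logs.
    messages_total: int               # user messages in the conversation
    messages_matched: int             # messages matched to the correct intent
    handed_to_human: bool             # True if a human agent took over
    confirmed_useful: Optional[bool]  # user's answer to "was this useful?"
    rating: Optional[int]             # 1-5 stars, None if no rating was left


def nlu_rate(convs: List[Conversation]) -> float:
    """Share of user messages matched to the correct intent."""
    total = sum(c.messages_total for c in convs)
    matched = sum(c.messages_matched for c in convs)
    return matched / total if total else 0.0


def autonomous_solve_rate(convs: List[Conversation]) -> float:
    """Share of conversations solved without a human and confirmed useful."""
    if not convs:
        return 0.0
    solved = sum(
        1 for c in convs
        if not c.handed_to_human and c.confirmed_useful is True
    )
    return solved / len(convs)


def five_star_share(convs: List[Conversation]) -> float:
    """Share of rated conversations that received 5 out of 5 stars."""
    ratings = [c.rating for c in convs if c.rating is not None]
    if not ratings:
        return 0.0
    return sum(1 for r in ratings if r == 5) / len(ratings)


# Made-up sample data, for illustration only.
sample = [
    Conversation(4, 4, False, True, 5),
    Conversation(3, 2, True, None, 3),
    Conversation(3, 2, False, False, None),
]

print(f"NLU rate: {nlu_rate(sample):.0%}")
print(f"Autonomous solve rate: {autonomous_solve_rate(sample):.0%}")
print(f"5-star share: {five_star_share(sample):.0%}")
```

Note that the satisfaction figure here is calculated only over users who left a rating. Unrated conversations are ignored rather than counted as unhappy, which is a design choice worth stating whenever the number is reported.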
Moving on to more sales-oriented metrics:
Number of leads generated through VA – Shows how effective the VA’s conversational flows are at capturing the contact information of the user. For example, one could place an introductory message “let’s get to know each other” at the start of a conversational flow. Sometimes, it is a good idea to use a call to action in the outro message, just before the user closes the window. Furthermore, one could program the VA to trigger a call-to-action message when the user has navigated to a certain URL or section of the website.
Revenue generated through the use of VAs – VAs can be programmed to trigger personalized sales messages that are based on the recipient’s profile or behavior. This can be done through authentication and can be a significant boost to total revenue, as there are no extra staff costs related to the process.
Reduced churn – According to Forrester, it costs 5 times more to acquire a new customer than it does to keep an existing one. Conversational AI can help to reduce churn by treating the leaving customer right. For example, one could ease the process of leaving by letting the AI hand out information about the process. In addition to that, the VA can message customers with a promotion that might change their mind. Furthermore, the VA can ask the customer to describe why they feel they no longer need the services. Approaching the matter in a form close to a human-to-human dialogue (which can be further enhanced by input from real customer support agents) creates a more personalized and intimate experience. And even if the customer is determined to leave, the feedback from the conversation provides significant insight into why the churn occurs in the first place.
When there is a need for an even closer look, all of the previously mentioned metrics can be analyzed at a narrower, topic-specific level. We believe in modularity and flexibility. Equipping the user with a variety of analytics tools ensures that even the most demanding analysts can access the information in the way that they find most fitting.
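As a rough illustration of what a topic-specific breakdown could look like, the sketch below groups enquiries by topic and calculates the autonomous solve rate for each one. The topic names and the record format are made up for the example and do not represent any particular analytics tool.

```python
from collections import defaultdict

# Hypothetical records: (topic, solved_autonomously)
enquiries = [
    ("billing", True),
    ("billing", False),
    ("delivery", True),
    ("delivery", True),
    ("returns", False),
]


def solve_rate_by_topic(records):
    """Return a dict mapping each topic to its autonomous solve rate."""
    counts = defaultdict(lambda: [0, 0])  # topic -> [solved, total]
    for topic, solved in records:
        counts[topic][1] += 1
        if solved:
            counts[topic][0] += 1
    return {topic: solved / total for topic, (solved, total) in counts.items()}


rates = solve_rate_by_topic(enquiries)
for topic, rate in sorted(rates.items()):
    print(f"{topic}: {rate:.0%}")
```

A breakdown like this makes it easier to see which topics pull the overall number down and where new training data or a reworked conversational flow would pay off first.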