The confidence score indicates how certain the Virtual Agent is that the expression it received matches an Intent.
You can see this in your Conversation Logs: click on any message to see which Intent it was recognized as and the confidence percentage for that match.
You can choose the confidence threshold yourself: how sure does the Virtual Agent need to be in order to send an Intent reply?
We set the default confidence threshold for the bot to 60%, and we find that most of our users set the threshold somewhere between 50% and 70%, the sweet spot that provides the most value when training the bot. But you might be asking yourself: how do you know which confidence threshold is best for you?
To decide this, you need to know what happens when you lower or raise the threshold. What happens when the bot receives an expression on either side of the threshold?
In this article we will cover:
Setting the Confidence Threshold
In the Beginning
One approach we can recommend is to start with a higher confidence threshold at the beginning. Once you feel confident in your Intent Replies, have added more expressions and training to your Virtual Agent, and have factored in the feedback from your users, you can experiment with lowering it and see whether it impacts other metrics.
Setting the Confidence Threshold
The confidence threshold of your bot can be found in Settings > General > Confidence Threshold.
Note: based on the confidence of an expression, you can send different replies using conditional blocks with the confidence_score parameter.
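To illustrate the idea behind confidence-based replies, here is a minimal Python sketch. It is not the conditional block syntax of the platform: the function name, the reply labels, and the 85 cut-off are all hypothetical, and only the 60 default and the confidence_score name come from this article.

```python
# Illustrative only: the tier boundary of 85 and the reply labels are assumptions.
DEFAULT_THRESHOLD = 60   # default threshold mentioned in this article
HIGH_CONFIDENCE = 85     # hypothetical cut-off for answering without confirmation

def pick_reply(confidence_score: float) -> str:
    """Choose a reply style for a matched Intent from its confidence (0-100)."""
    if confidence_score >= HIGH_CONFIDENCE:
        return "direct_answer"      # answer straight away
    if confidence_score >= DEFAULT_THRESHOLD:
        return "confirm_first"      # e.g. "Did you mean ...?" before answering
    return "default_reply"          # below the threshold

for score in (92, 70, 45):
    print(score, pick_reply(score))
```

A softer reply in the middle tier is one way to reduce the impact of a wrong Intent without sending more Default Replies.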
Confidence Scenarios
When a message comes in, the AI compares its content against all of the Intents you have created, and its understanding of them, to see whether it can find a match. Based on this analysis, it calculates a confidence level: how closely the message matches the representation the AI has built of an Intent. Confidence values range from 0 to 100. If you keep the confidence threshold at the default of 60, every message scoring 60 or above triggers the Intent it was matched to, along with its subsequent actions and replies.
- If it is at or above the threshold, that Intent is triggered, which could be correct or incorrect. The lower you set the threshold, the higher the likelihood that a wrong Intent is triggered.
- If it is below the threshold, a Default Reply will be sent.
- For ticket automation, if it's below the threshold, no reply is sent unless you have configured it otherwise.
For chat automation, the customer might then be escalated, depending on how you have designed your Default Reply.
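The scenarios above can be summarized in a short Python sketch. The function and channel names are hypothetical and only model the behavior described in this article; they are not part of the product.

```python
from typing import Optional

DEFAULT_THRESHOLD = 60  # default threshold mentioned in this article

def route(intent: str, confidence: float, channel: str,
          threshold: float = DEFAULT_THRESHOLD) -> Optional[str]:
    """Model what happens to a message given its best-matching Intent.

    channel is "chat" or "ticket" (hypothetical labels for this sketch).
    """
    if confidence >= threshold:
        # At or above the threshold: the matched Intent is triggered,
        # whether or not the match is actually correct.
        return f"intent:{intent}"
    if channel == "ticket":
        # Ticket automation: no reply unless configured otherwise.
        return None
    # Chat automation: the Default Reply is sent (it may escalate).
    return "default_reply"

print(route("Refund Inquiry", 60, "chat"))
print(route("Refund Inquiry", 59, "chat"))
print(route("Refund Inquiry", 59, "ticket"))
```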
The higher the confidence threshold, the more accurate the bot will be, but more Default Replies will be sent. You therefore need to consider 5 things:
- The benefits you are looking to achieve with the Virtual Agent
- What does the Virtual Agent being wrong cost you as a business and your customers?
- What does the Virtual Agent sending the Default Reply cost you as a business and your customers?
- What does doing nothing or escalating cost you as a business and your customers?
- The Conversation Design on your Default Reply
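To see the trade-off between accuracy and Default Replies for your own bot, one option is to replay reviewed messages against several candidate thresholds. The sketch below assumes you have a list of (confidence, was_correct) pairs from manually reviewed conversations; the sample values are invented for illustration and are not real results.

```python
def sweep(logs, thresholds):
    """For each threshold, count triggered Intents, their accuracy, and Default Replies.

    logs: list of (confidence, was_correct) pairs from reviewed messages.
    """
    rows = []
    for t in thresholds:
        triggered = [ok for conf, ok in logs if conf >= t]
        accuracy = sum(triggered) / len(triggered) if triggered else None
        rows.append({
            "threshold": t,
            "triggered": len(triggered),
            "accuracy": accuracy,
            "default_replies": len(logs) - len(triggered),
        })
    return rows

# Invented sample data, for illustration only.
sample_logs = [(92, True), (81, True), (74, True), (66, False),
               (63, True), (58, False), (55, True), (41, False)]

for row in sweep(sample_logs, [50, 60, 70]):
    print(row)
```

With this sample, raising the threshold increases the accuracy of the triggered Intents while the number of Default Replies goes up, which is the pattern described above.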
Considerations
To determine the confidence threshold, you can run a full analysis. However, if you are looking for a simpler way to do this, we can ask you an either-or question.
In 100 messages, which is better? | |
50 potentially incorrect answers + 50 correct answers | 100 Default Replies / No action taken |
40 potentially incorrect answers + 60 correct answers | 100 Default Replies / No action taken |
30 potentially incorrect answers + 70 correct answers | 100 Default Replies / No action taken |
20 potentially incorrect answers + 80 correct answers | 100 Default Replies / No action taken |
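One way to answer the either-or question is to put rough numbers on each outcome. The sketch below is a simplification under stated assumptions: the three weights (the value of a correct answer, the cost of an incorrect one, and the cost of a Default Reply or no action) are placeholders you would replace with your own estimates.

```python
def better_option(correct, incorrect, value_correct, cost_incorrect, cost_default):
    """Compare answering every message with sending only Default Replies."""
    answering = correct * value_correct - incorrect * cost_incorrect
    defaulting = -(correct + incorrect) * cost_default
    return "answer" if answering > defaulting else "default"

# (correct, incorrect) pairs from the table above.
scenarios = [(50, 50), (60, 40), (70, 30), (80, 20)]

# Placeholder weights: a wrong answer hurts twice as much as a correct one helps,
# and a Default Reply costs a quarter of a correct answer's value.
for correct, incorrect in scenarios:
    choice = better_option(correct, incorrect,
                           value_correct=1.0, cost_incorrect=2.0, cost_default=0.25)
    print(f"{incorrect} incorrect + {correct} correct -> {choice}")
```

The outcome depends entirely on the weights: the more expensive a wrong answer is for your business, the more correct answers you need before answering beats the Default Reply.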
Now, the answer will depend on how you have built your Intent Replies.
Are the conversations designed flexibly?
Do customers have the option to change direction, especially on similar Intents such as Return Process vs Refund Inquiry? These share a lot of common words around returns, so the likelihood of confusion is higher, but if you can get customers back on track on the first message, a wrong Intent being triggered has a lower negative impact.
How good is the training?
Do you have a lot of confusion between Intents? By analyzing the Confusion Matrix, you can try to reduce confusion, which will have a positive impact on confidence.
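If you export the counts behind a confusion matrix, a few lines of Python can surface the Intent pairs that are mixed up most often. This is a sketch, assuming the matrix is available as nested counts; the data layout, the "Order Status" Intent, and all numbers are invented for illustration.

```python
def most_confused(matrix, top=3):
    """Return the most frequent (count, true_intent, predicted_intent) mix-ups.

    matrix[true_intent][predicted_intent] = number of messages.
    """
    pairs = []
    for true_intent, row in matrix.items():
        for predicted, count in row.items():
            if predicted != true_intent and count > 0:
                pairs.append((count, true_intent, predicted))
    pairs.sort(reverse=True)  # largest counts first
    return pairs[:top]

# Invented sample counts, for illustration only.
sample_matrix = {
    "Return Process": {"Return Process": 40, "Refund Inquiry": 9, "Order Status": 1},
    "Refund Inquiry": {"Return Process": 6, "Refund Inquiry": 43, "Order Status": 1},
    "Order Status":   {"Return Process": 0, "Refund Inquiry": 2, "Order Status": 48},
}

for count, true_intent, predicted in most_confused(sample_matrix, top=2):
    print(f"{true_intent} recognized as {predicted}: {count} messages")
```

The pairs at the top of this list are good candidates for additional or clearer training expressions.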