SUMMARY
On March 17, 2022, we declared a Service Incident for an issue affecting Explore Enterprise customers, who saw incorrect data in the feature release announced on March 15 [Announcing live agent status metrics and drill in]. The issue affected customers from 14:38 UTC on March 15, 2022 to 14:15 UTC on March 17, 2022.
Timeline
14:23 UTC | 07:23 PT
We have identified an issue affecting Explore Enterprise Agent Status Live Reporting data. We have reverted to a previous version. We will share an update shortly.
15:09 UTC | 08:09 PT
We have confirmed the rollback of the Live Pre-Canned Dashboard was successful and we are no longer seeing any data issues in this previous version. Explore Enterprise customers can visit https://support.zendesk.com/hc/en-us/articles/4465527502746 where we’ll provide more updates on this release.
POST-MORTEM
Root Cause Analysis
We discovered an issue with one of our tools, which connects to a database containing deleted, downgraded, and suspended agents for the Support product. The tool was wrongly counting these agents as Online. Because the Live Explore Dashboard receives its data from this tool, it reported an inaccurate number of Online agents.
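As an illustration only, the class of bug described above can be sketched in a few lines. This is not our production code: the record shape, the entitlement values, and the function names below are all hypothetical and were chosen just to show how counting by status alone can include agents who are no longer entitled.

```python
from dataclasses import dataclass

# Hypothetical entitlement value; the real schema is not described in this report.
ACTIVE = "active"


@dataclass(frozen=True)
class AgentRecord:
    agent_id: int
    entitlement: str  # assumed values: "active", "deleted", "downgraded", "suspended"
    status: str       # last recorded status, e.g. "online", "away", "offline"


def count_online_naive(records):
    """Counts every record whose last status is online, regardless of entitlement."""
    return sum(1 for r in records if r.status == "online")


def count_online(records):
    """Counts only agents who are online and still hold an active entitlement."""
    return sum(
        1 for r in records
        if r.status == "online" and r.entitlement == ACTIVE
    )


# Illustrative data: two deprovisioned agents still carry a stale "online" status.
sample = [
    AgentRecord(1, "active", "online"),
    AgentRecord(2, "active", "offline"),
    AgentRecord(3, "deleted", "online"),
    AgentRecord(4, "suspended", "online"),
    AgentRecord(5, "active", "online"),
]
```

With this sample, the naive count includes the deleted and suspended agents, while the filtered count does not. A check of this shape is also one way the automated testing listed under Remediation Items could be approached.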
Resolution
To fix this issue, our engineers rolled back the recent deploy that caused the issue to surface in Explore. Once the rollback was complete, the issue was resolved.
Remediation Items
- Add automated testing to see the impact of agent entitlement changes. [To Do]
- Clean out any incorrect or problematic data from the databases storing these statuses. [In Progress]
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.