On November 16, 2020, from 10:04 UTC to 15:30 UTC, some Zendesk Support customers on Pod 17 may have experienced server errors when using Side Conversations.
10:26 UTC | 02:26 PT
We are aware of an issue causing some customers on Pod 17 to receive a server error when using Side Conversations. We are working to understand the cause and will keep you updated as we find out more.
10:59 UTC | 02:59 PT
We’re continuing to investigate the cause of the server error when using Side Conversations. We will continue to provide updates.
12:08 UTC | 04:08 PT
Our team is still working to resolve the error for customers on Pod 17 when using Side Conversations. We will keep you updated as we find out more.
13:10 UTC | 05:10 PT
We're continuing to investigate the server error for customers on Pod 17 when using Side Conversations. We will share an update once we have additional information.
15:48 UTC | 07:48 PT
Our team continues to work to resolve the Side Conversations issue for customers on Pod 17. We will provide updates as soon as we have further information.
17:18 UTC | 09:18 PT
We’re happy to report the issues affecting Side Conversations for customers on Pod 17 have now been resolved. Thank you for your patience and please let us know if you see any further issues. Post-mortem to follow: https://zdsk.co/3f2DnqX
Root Cause Analysis
This incident was caused by a caching mechanism failure in a notification service related to the Side Conversations feature. An expensive user notification event bypassed the caching mechanism, which resulted in a high number of queries that overwhelmed a datastore cluster.
To fix this issue, our team helped ease the load on the cluster by temporarily disabling some low-priority jobs, which allowed the datastore cluster to quickly resume processing notification events.
Remediation Items
- Configure caching for user notifications in our Side Conversations notification service [Completed].
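For readers unfamiliar with the pattern, the following is a minimal sketch of how a read-through cache keeps a repeated, expensive lookup from reaching the datastore on every event. It is not the actual Side Conversations implementation: the class `TTLCache`, the function `load_notification_settings`, the 60-second expiry, and the returned data are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Minimal read-through cache with per-entry expiry (illustrative only)."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # called only on a miss or after expiry
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # served from cache, no datastore query
        value = self._loader(key)      # the expensive path
        self._entries[key] = (now + self._ttl, value)
        return value


query_count = 0


def load_notification_settings(user_id: str) -> dict:
    """Hypothetical stand-in for the expensive datastore query."""
    global query_count
    query_count += 1
    return {"user_id": user_id, "notify": True}


# Assumed 60-second expiry; 1000 notification events for the same user.
cache = TTLCache(load_notification_settings, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("user-1")
```

In this sketch, the 1000 lookups result in a single call to the loader. Without the cache, every lookup would reach the datastore, which is the kind of query volume the root cause analysis describes.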
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.