SUMMARY
On November 22, 2022, from 07:10 UTC to 12:42 UTC, Zendesk Support customers on Pod 17 who use Side Conversations and its integrations, such as Slack, experienced delays in the delivery of Side Conversation messages.
Timeline
13:44 UTC | 05:44 PT
We have been investigating reports, received since earlier today (UTC), from Support customers on Pod 17 regarding deliverability issues with Side Conversations. Work on a full fix is still ongoing and we appreciate your patience. We will share more information in 1 hour.
14:34 UTC | 06:34 PT
We’re happy to confirm that the queue of unprocessed Side Conversation messages for customers on Pod 17 has cleared and emails are being successfully delivered without further issues. We thank you for your patience.
POST-MORTEM
Root Cause Analysis
This incident was caused by the service becoming overloaded, with degraded processing throughput: it was receiving new outbound tasks faster than it could process them, so the queue grew and sends were delayed.
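To illustrate the mechanism described above, here is a minimal sketch of how a backlog builds when tasks arrive faster than they can be processed, and how it drains once arrivals fall back below capacity. The function name `simulate_backlog` and all numbers are hypothetical and chosen only for illustration; they are not measurements from this incident.

```python
def simulate_backlog(arrivals_per_min, capacity_per_min):
    """Return the queue depth at the end of each minute.

    arrivals_per_min: tasks arriving in each minute (illustrative values).
    capacity_per_min: fixed number of tasks the service can process per minute.
    """
    depth = 0
    history = []
    for arriving in arrivals_per_min:
        # The queue grows by whatever arrives beyond capacity and
        # shrinks (never below zero) when arrivals are under capacity.
        depth = max(0, depth + arriving - capacity_per_min)
        history.append(depth)
    return history


# Hypothetical traffic: three minutes of flood, then normal load.
arrivals = [150, 150, 150, 50, 50, 50, 50]
print(simulate_backlog(arrivals, capacity_per_min=100))
# prints [50, 100, 150, 100, 50, 0, 0]
```

In this sketch the queue only starts to shrink after the input rate drops below capacity, and it takes as long to drain as the spare capacity allows.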
Resolution
Several remediation steps were taken according to the playbooks, but the system was already working normally and at full capacity, so none of them relieved the pressure. Once the flood of incoming traffic stopped, the system caught up with the backlog and the queue returned to an empty state.
Remediation Items
- Reinforce rate limits in the side-conversations API. [Done]
- Document the expected sequence of events for this type of situation. [To Do]
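As a general illustration of what rate limiting at an API boundary can look like, the following is a minimal token-bucket sketch. It is not the implementation used in the side-conversations API, whose details are not described here; the class name `TokenBucket` and the parameters `rate_per_sec`, `burst`, and `clock` are assumptions made for this example.

```python
import time


class TokenBucket:
    """Illustrative token-bucket limiter (hypothetical, not the actual API code)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.burst = burst            # maximum tokens the bucket can hold
        self.clock = clock            # injectable clock, useful for testing
        self.tokens = float(burst)    # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed, consuming one token."""
        now = self.clock()
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage: allow bursts of up to 3 requests, refilling 2 tokens per second.
limiter = TokenBucket(rate_per_sec=2, burst=3)
print([limiter.allow() for _ in range(3)])
```

A limiter of this kind rejects or defers excess requests at the point of entry, so a sudden flood is bounded before it reaches the processing queue.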
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.
Post-mortem published November 30, 2022.