Summary
On December 14, 2022, from 16:03 UTC to 17:53 UTC, Zendesk Talk customers on Pods 19 and 23 encountered Talk application errors in Support as a result of call failures.
Timeline
17:30 UTC | 09:30 PT
We are investigating reports of application errors using Talk on Pods 19 and 23. We will provide additional information shortly.
17:47 UTC | 09:47 PT
We have confirmed an issue causing application errors for customers using Talk on Pods 19 and 23. More updates to follow.
18:31 UTC | 10:31 PT
We are happy to report that the issue causing Talk application errors has been resolved. Thank you for your patience during our investigation.
Root Cause Analysis
This incident was caused by a spike in outbound call traffic at our service provider, which triggered an autoscaling failure on their shared call routing infrastructure.
Resolution
To fix this issue, our service provider manually increased capacity by scaling up the affected infrastructure.
Remediation Items
- [Service provider] Increased capacity headroom and reconfigured autoscaling features.
- [Service provider] Scheduled monitoring and alerting improvements.
- [Zendesk] Reviewed time taken to alert service provider.
- [Zendesk] Update runbooks to improve partner issue investigation.
For More Information
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.
Post-mortem published on January 17, 2023.