SUMMARY
On April 26, 2023, from 12:38 UTC to 15:15 UTC, customers across Pods 19, 20, 27, and 29 experienced intermittent 5xx errors across multiple products.
Timeline
14:55 UTC | 07:55 PT
We are currently investigating reports of 500 errors for customers across multiple products in Pods 19, 20, 27, 29. We will follow up with another update once we learn more.
15:09 UTC | 08:09 PT
We have confirmed an issue causing 5xx level errors across all products in Pods 19, 20, 27, and 29. We are investigating and will update you again shortly.
15:34 UTC | 08:34 PT
We are happy to report that the issues affecting Pods 19, 20, 27, and 29 are now resolved. Thank you for your patience during our investigation.
POST-MORTEM
Root Cause Analysis
This incident was caused by a deployed change that produced unexpected errors. As a result, customers experienced a slightly increased error rate and intermittent loss of functionality across Zendesk features during the incident window.
Resolution
To fix this issue, we reverted the deploy. Traffic then returned to normal levels and normal functionality was restored.
Remediation Items
- Investigate and improve the monitoring method for 5xx errors.
- Improve detection time by cross-correlating low-level errors affecting multiple services that may indicate broader platform-level issues.
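As an illustration of the cross-correlation idea in the second item, the following is a minimal sketch and not Zendesk's actual tooling. It assumes a hypothetical ErrorEvent record, a fixed 60-second window, and a threshold of three distinct services; all of these names and values are assumptions made for the example. It flags any pod where several services report 5xx errors in the same window, which may point to a platform-level issue rather than a fault in one service.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEvent:
    """One observed 5xx response (hypothetical record shape)."""
    timestamp: float  # seconds since epoch
    service: str      # name of the service that returned the error
    pod: int          # pod where the error occurred


def find_platform_level_windows(events, window_seconds=60, min_services=3):
    """Return (window_start, pod, services) tuples for every time window
    in which at least `min_services` distinct services reported errors
    in the same pod. The threshold values are illustrative assumptions."""
    services_by_bucket = defaultdict(set)
    for event in events:
        # Align each event to the start of its fixed-size time window.
        window_start = int(event.timestamp // window_seconds) * window_seconds
        services_by_bucket[(window_start, event.pod)].add(event.service)

    flagged = [
        (window_start, pod, sorted(services))
        for (window_start, pod), services in services_by_bucket.items()
        if len(services) >= min_services
    ]
    return sorted(flagged)
```

A single service erroring in a pod would not be flagged under these assumptions, while errors from three or more services in the same pod and window would be, which is the signal the remediation item describes.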
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.