11:27 UTC | 04:27 PT
Issues affecting performance on POD13 have been resolved. Thanks for your patience.
11:19 UTC | 04:19 PT
Performance on POD13 is now stable. Thank you for your patience while we continue our remediation work.
10:59 UTC | 03:59 PT
Performance is stable on POD13; however, we are working to resolve a potential underlying performance issue. More info to follow.
10:39 UTC | 03:39 PT
We have identified some potential causes for the issues on POD13. More info to follow.
10:24 UTC | 03:24 PT
We are still investigating the performance issues on POD13. More info to follow.
10:08 UTC | 03:08 PT
We are currently experiencing performance issues on POD13. Updates to follow.
POST-MORTEM SUMMARY
CPU utilization on the database slaves in a POD13 cluster spiked due to an expensive organizations query, rendering both slaves unable to serve applications until the offending queries were killed and service was restored. The issue recurred 24 hours later. To prevent this from happening again, we will add conditional rate limiting, improve internal database monitoring and alerting for the affected pod, and improve the performance of the organizations query.
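As an illustration, the conditional rate limiting described above could take the form of a token bucket applied only to query classes known to be expensive, leaving normal traffic untouched. The sketch below is a minimal example under that assumption; the names (QueryRateLimiter, EXPENSIVE_QUERY_CLASSES, run_query) are hypothetical and do not reflect Zendesk's actual implementation.

```python
import time
import threading

# Hypothetical set of query classes expensive enough to throttle.
EXPENSIVE_QUERY_CLASSES = {"organizations_search"}

class QueryRateLimiter:
    """Token-bucket limiter applied only to designated query classes."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec        # tokens refilled per second
        self.capacity = burst           # maximum burst size
        self.tokens = float(burst)
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow(self, query_class: str) -> bool:
        # Only limit known-expensive classes; everything else passes
        # through untouched -- the "conditional" part of the scheme.
        if query_class not in EXPENSIVE_QUERY_CLASSES:
            return True
        with self.lock:
            now = time.monotonic()
            # Refill tokens based on elapsed time, capped at burst size.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last_refill) * self.rate)
            self.last_refill = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False

# Usage: allow roughly 2 expensive organizations queries per second,
# with a burst of 5, per application server (illustrative numbers).
limiter = QueryRateLimiter(rate_per_sec=2.0, burst=5)

def run_query(query_class: str, sql: str):
    if not limiter.allow(query_class):
        raise RuntimeError(f"rate limit exceeded for {query_class}; try again later")
    # ... execute the query against a database replica here ...
```

Throttling only the expensive class caps the replicas' exposure to runaway CPU load without penalizing the ordinary queries that make up the bulk of traffic.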
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.