Summary
On February 18, 2022, from 18:13 UTC to 18:51 UTC, Zendesk Support customers on Pod 13 experienced delays in ticket creation and comment updates via the email channel.
Timeline
18:43 UTC | 10:43 PT
We are investigating reports of email ticket creation delays on Pod 13. More updates to follow.
18:58 UTC | 10:58 PT
We are happy to report that we have recovered from the email ticket creation delays affecting Pod 13 and any backlogged emails have been processed.
Root Cause Analysis
This incident was caused by a schema change that we rolled out to our production environment. The change introduced a race condition in our email processing service, which prevented inbound emails from being processed into tickets and their associated ticket comments. A contributing factor to the impact on Pod 13 was that the automatic rollback of the deployment failed, which required manual intervention by our engineering team.
Resolution
To fix this issue, our engineering team manually rolled back the schema change. Once the rollback was complete, the email processing service resumed processing emails and creating tickets.
Remediation Items
- Investigate why the defect did not manifest (no exception was raised) during testing or in earlier stages of deployment [DONE]
- Investigate why automated rollback failed on Pod 13 [IN PROGRESS]
- Investigate lowering the time taken for a manual rollback [TO BE SCHEDULED]
- Investigate lowering the time taken for automated detection of such an issue in a pod [TO BE SCHEDULED]
- Enforce schema consistency across testing and production environments [TO BE SCHEDULED]
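As an illustration of the last item only, the sketch below shows one way a schema consistency check between two environments could work. It is not a description of our tooling: the function name `schema_diff`, the snapshot format (table name mapped to column name mapped to column type), and the sample tables are all hypothetical, and a real check would read the snapshots from each environment's database.

```python
from typing import Dict, List

# Hypothetical snapshot format: table name -> column name -> column type.
Schema = Dict[str, Dict[str, str]]


def schema_diff(testing: Schema, production: Schema) -> List[str]:
    """Return a sorted, human-readable list of differences between two snapshots."""
    problems: List[str] = []
    for table in sorted(set(testing) | set(production)):
        if table not in testing:
            problems.append(f"table '{table}' exists only in production")
            continue
        if table not in production:
            problems.append(f"table '{table}' exists only in testing")
            continue
        t_cols, p_cols = testing[table], production[table]
        for col in sorted(set(t_cols) | set(p_cols)):
            if col not in t_cols:
                problems.append(f"column '{table}.{col}' exists only in production")
            elif col not in p_cols:
                problems.append(f"column '{table}.{col}' exists only in testing")
            elif t_cols[col] != p_cols[col]:
                problems.append(
                    f"column '{table}.{col}' differs: "
                    f"testing={t_cols[col]}, production={p_cols[col]}"
                )
    return problems


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    testing = {"tickets": {"id": "bigint", "subject": "varchar(255)"}}
    production = {
        "tickets": {"id": "bigint", "subject": "text"},
        "comments": {"id": "bigint"},
    }
    for line in schema_diff(testing, production):
        print(line)
```

A check of this kind could run as a gate before a deployment proceeds, failing the rollout whenever the returned list is not empty.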
For More Information
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.