SUMMARY
On February 1, 2023, between 17:40 UTC and 18:54 UTC, customers experienced server errors when attempting to load Zendesk products, delays in sending and receiving messages in Sunshine Conversations, and delays in Chat & Messaging ticket creation.
Timeline
18:24 UTC | 10:24 PT
We are receiving reports of some server errors and delays in sending and receiving messages in Sunshine Conversations. Additional information will be posted shortly.
18:33 UTC | 10:33 PT
We have confirmed an issue causing server errors, delays in sending and receiving messages on Sunshine Conversations, and delays in Chat/Messaging ticket creation. Our team is investigating and we will post updates as soon as they're available.
18:48 UTC | 10:48 PT
Due to the errors affecting Sunshine Conversations, the Zendesk ZBot Widget to contact Zendesk Support has been disabled, and customers will be automatically directed to the web form.
18:57 UTC | 10:57 PT
We are beginning to see some improvement in the issue causing server errors, delays in sending and receiving messages in Sunshine Conversations, and delays in Chat/Messaging ticket creation. Our team will monitor until full resolution.
19:19 UTC | 11:19 PT
We are happy to report that the issue causing server errors, delays in Sunshine Conversations, and delays in Messaging/Chat ticket creation has been resolved, and the Zendesk ZBot Widget to contact Zendesk Support has been re-enabled.
POST-MORTEM
Root Cause Analysis
This incident was caused by insufficient memory provisioned on recently implemented Sunshine Conversations proxy servers. These servers were introduced as part of a change intended to increase Sunshine Conversations traffic efficiency and observability.
Resolution
To fix this issue, we rolled back the change.
Remediation Items
- Increase memory allocation for these Sunshine Conversations proxy servers.
- Modify deploy process to include extra review for similar traffic routing changes.
- Implement a more comprehensive runbook for similar resource allocation issues.
- Investigate opportunities for additional upstream alerting for failures.
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.
Postmortem published February 8, 2023.