SUMMARY
On March 11, from 14:04 UTC to 15:32 UTC, SunCo customers across all Pods experienced latency on Sunshine Conversations and Messaging.
Timeline
15:18 UTC | 08:18 PT
We have identified an issue that is causing increased latency on Sunshine Conversations and Messaging for some customers globally. Our teams are investigating and we will provide further updates when we have new information to share.
16:18 UTC | 09:18 PT
Our engineering team is still investigating the issue causing latency on Sunshine Conversations and Messaging for some customers globally. We will continue to post updates as the investigation progresses.
16:25 UTC | 09:25 PT
We are beginning to see improvement and stabilization in the issue causing latency on Sunshine Conversations and Messaging for some customers globally. Our team will continue to monitor the situation to ensure full recovery.
17:47 UTC | 10:47 PT
We are happy to report that the issue causing latency on Sunshine Conversations and Messaging for some customers globally has been resolved. Thank you for your patience during our investigation.
POST-MORTEM
Root Cause Analysis
Unintended connection pooling behaviour caused elevated connection saturation, which led to latency.
Resolution
To reconnect to the database, the workloads were recycled, which stabilized the system. This in turn caused error rates to drop quickly and messages to be processed.
Remediation Items
- Set database routing to connect directly to the shard [Done]
- Improve pool management [In Progress]
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.