SUMMARY
On January 8, 2024, from 17:38 to 17:56 UTC, Support customers on Pods 13 and 20 may have experienced latency, delays, and green screen errors when attempting to load Support and/or tickets.
Timeline
18:07 UTC | 10:07 PT
We are investigating reports of slowness and delays in Pods 13 and 20. We will provide additional updates soon.
18:19 UTC | 10:19 PT
We have confirmed an issue causing latency, delays, errors, and, in some cases, an inability to log in for customers on Pods 13 and 20. Our team is investigating, and we will continue to provide updates as we learn more.
18:29 UTC | 10:29 PT
We are working with our content delivery network (CDN) provider to mitigate the latency, delays, and errors seen by customers on Pods 13 and 20. We will provide any new information as the investigation progresses.
18:44 UTC | 10:44 PT
We have re-routed much of the traffic bound for Pods 13 and 20, and the rate of errors has reduced significantly. We have mitigated the immediate impact and are continuing to work with our CDN provider to begin recovery. Please let us know if you continue to experience any issues.
19:22 UTC | 11:22 PT
Our CDN provider has implemented a fix for the issue causing latency, delays, and errors for customers on Pods 13 and 20, and we are monitoring the results. Please let us know if you experience any resurgence of delays or related issues.
19:40 UTC | 11:40 PT
The fix implemented by our CDN provider has proven to be effective and the issue causing latency, delays, and errors for customers on Pods 13 and 20 has been resolved. Thank you for your patience during our investigation.
POST-MORTEM
Root Cause Analysis
This incident was caused by our content delivery network (CDN) provider experiencing network congestion across multiple locations in the US.
Resolution
To fix this issue, we failed over to a backup until our CDN provider was able to implement a fix for the congestion.
Remediation Items
- Investigate automatic failover response mechanism to ensure that it activates in all geo-locations experiencing similar issues.
- Explore potential adjustments to error thresholds for automatic failover response.
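As an illustration of the two remediation items above, the following is a minimal sketch of a per-location, threshold-based failover check. It does not reflect the actual failover mechanism, which this report does not describe. The names LocationStats and locations_to_fail_over, the 5% error-rate threshold, and the minimum request count are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class LocationStats:
    """Request and error counts for one geo-location over a sampling window."""
    requests: int
    errors: int


def locations_to_fail_over(stats, error_rate_threshold=0.05, min_requests=100):
    """Return the sorted names of locations whose error rate meets the threshold.

    Every location is evaluated independently, so a failover decision in one
    location does not depend on another location also being unhealthy.
    Locations with fewer than min_requests samples are skipped to avoid
    reacting to noise. Both defaults are hypothetical.
    """
    flagged = []
    for name, s in stats.items():
        if s.requests < min_requests:
            continue  # too few samples to judge this location
        if s.errors / s.requests >= error_rate_threshold:
            flagged.append(name)
    return sorted(flagged)


# Example with made-up location names and counts.
window = {
    "us-east": LocationStats(requests=1000, errors=80),
    "us-west": LocationStats(requests=1000, errors=10),
    "us-central": LocationStats(requests=50, errors=40),
}
print(locations_to_fail_over(window))
```

Lowering error_rate_threshold makes failover trigger earlier at the cost of more false positives, and lowering min_requests lets low-traffic locations be flagged. These are the kinds of adjustments the second remediation item refers to.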
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.