On January 8, 2020, from 10:23 UTC to 10:26 UTC, customers on Zendesk Support and Talk on Pod 17 experienced dropped calls on Talk and degraded performance or the inability to load Support.
11:20 UTC | 03:20 PT
We have concluded our investigation into the service degradation that impacted some Support & Talk customers on Pod 17. Service is operating normally.
10:54 UTC | 02:54 PT
Our teams are investigating a service degradation that occurred in Pod 17 between 10:22 and 10:26 UTC. We will provide an update shortly.
Root Cause Analysis
This incident was caused by database restarts during an AWS service issue between 10:21 and 10:25 UTC on January 8th. A network fiber path connecting two Availability Zones in the EU-WEST-1 Region saw elevated error rates and response latencies when traffic was prematurely moved to an alternate path. The premature move congested the alternate path and led to packet loss.
To resolve the issue, the AWS automated procedure finished provisioning bandwidth on the alternate path, and traffic was then fully moved to all devices on that path.
We continue to investigate potential improvements to handle increased packet loss within our infrastructure.
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.