SUMMARY
On February 19, 2024, between 12:10 and 12:35 UTC, a subset of customers across all pods, mainly based in India, may have experienced connectivity delays and increased server errors while trying to load Zendesk products.
Timeline
13:18 UTC | 05:18 PT
We are aware of customer reports about regional connectivity issues for agents based in India. We can see that this has mostly recovered, but we’ll continue to monitor this closely until full resolution.
14:55 UTC | 06:55 PT
Customers across all pods, but only those based in the Chennai region in India, may have encountered connectivity delays and increased 4xx/5xx HTTP errors while trying to load Zendesk. The issue affected our CDN provider between 12:10 and 12:35 UTC today and is now considered to be fully resolved. We appreciate your patience.
POST-MORTEM
Root Cause Analysis
This incident was caused by regional Internet Service Provider (ISP) complications within India that disrupted the network routing capabilities of the CDN Provider's Chennai data center (colo). This disruption prevented successful communication with our infrastructure, leading to the errors experienced by users. A secondary factor made the situation worse: temporary remediation measures from a previous incident were still in place. They were intended to prevent automatic failover by the Internet Resilience project for any CDN Provider colo in India, but they also disabled our ability to perform a manual failover during the outage.
Resolution
To resolve the issue, our CDN Provider rerouted the affected traffic through the Chennai colo at approximately 12:33 UTC, which restored service by 12:35 UTC. After service was restored, no further 522 HTTP errors were reported. Traffic was also rerouted in the Mumbai colo as a precaution, and no adverse effects were observed from this action.
Remediation Items
- Adjust our monitoring systems to ensure that they can provide alerts even when specific remediation measures are in place, allowing for quicker detection and response.
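As a rough illustration of the remediation item above, the sketch below keeps alerting and manual failover independent of a flag that suppresses automatic failover. It is not a description of the actual Internet Resilience project or monitoring stack. Every name in it (ColoState, ALERT_THRESHOLD, should_alert, may_failover, evaluate) and the 5% threshold are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold: alert when 5% or more of requests return HTTP 522.
ALERT_THRESHOLD = 0.05


@dataclass
class ColoState:
    name: str
    error_rate_522: float           # fraction of requests returning HTTP 522
    auto_failover_suppressed: bool  # temporary remediation flag (assumed)


def should_alert(colo: ColoState) -> bool:
    # Alerting depends only on the observed error rate,
    # never on whether a remediation flag is set.
    return colo.error_rate_522 >= ALERT_THRESHOLD


def may_failover(colo: ColoState, manual: bool) -> bool:
    # A manual failover requested by an operator is always permitted.
    if manual:
        return True
    # Automatic failover needs an alerting condition and no suppression flag.
    return should_alert(colo) and not colo.auto_failover_suppressed


def evaluate(colo: ColoState) -> dict:
    # Summarize the three independent decisions for one colo.
    return {
        "alert": should_alert(colo),
        "auto_failover": may_failover(colo, manual=False),
        "manual_failover_allowed": may_failover(colo, manual=True),
    }


if __name__ == "__main__":
    chennai = ColoState("Chennai", error_rate_522=0.20, auto_failover_suppressed=True)
    print(evaluate(chennai))
```

In this sketch the suppression flag blocks only the automatic path, so an elevated 522 rate still raises an alert and an operator can still trigger a failover by hand.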
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.