SUMMARY
On July 30, 2021, from 08:32 UTC to 09:39 UTC, Asia-Pacific region customers across multiple Pods experienced connectivity issues & errors when loading the Zendesk Platform.
Timeline
09:17 UTC | 02:17 PT
Our teams are investigating connection issues impacting some customers in the APAC region. We will provide further information & additional scope shortly.
09:39 UTC | 02:39 PT
Our CDN provider experienced issues in the APAC region, causing timeout errors, platform latency, and unavailable accounts for some of our customers. The issue appears to be stable now that they have partially re-routed the connections. Our engineering team continues to monitor.
10:32 UTC | 03:32 PT
We are happy to report that the network issues impacting our CDN provider, which caused slowness or prevented our software from loading for some of our APAC-based customers, have now been resolved. Thank you for your patience!
POST-MORTEM
Root Cause Analysis
This incident was caused by our CDN provider putting a router in Singapore into production with an erroneous network change.
Resolution
To fix this issue, the CDN provider disabled the capacity and the router that contained the network configuration that was causing traffic to fail.
Remediation Items
To be done by the CDN provider:
- Improve default origin routing to ensure that high-loss paths are not preferred by their traffic-routing engine.
- Improve static route generation on routers so that static routes do not propagate globally.
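As an illustration of the first remediation item, the idea of not preferring high-loss paths can be sketched as below. This is only a minimal sketch under stated assumptions, not the CDN provider's actual routing logic: the `Path` type, the `pick_path` function, the path names, and the 2% loss threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str          # hypothetical label for a route to the origin
    latency_ms: float  # measured round-trip latency in milliseconds
    loss_rate: float   # fraction of probes lost, from 0.0 to 1.0


def pick_path(paths, max_loss=0.02):
    """Prefer the lowest-latency path whose loss is at or below max_loss.

    If every path exceeds the threshold, fall back to the least lossy
    path, using latency only as a tie-breaker.
    """
    if not paths:
        raise ValueError("no candidate paths")
    healthy = [p for p in paths if p.loss_rate <= max_loss]
    if healthy:
        return min(healthy, key=lambda p: p.latency_ms)
    return min(paths, key=lambda p: (p.loss_rate, p.latency_ms))


# Hypothetical measurements: the fastest path is also the lossiest.
candidates = [
    Path("via-sin", latency_ms=20.0, loss_rate=0.35),
    Path("via-hkg", latency_ms=45.0, loss_rate=0.001),
    Path("via-nrt", latency_ms=70.0, loss_rate=0.0),
]
print(pick_path(candidates).name)  # prints "via-hkg"
```

In this sketch, a path with the best latency is skipped when its loss rate is above the threshold, so a misconfigured but nominally fast route would not attract traffic by default.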
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.