SUMMARY
On March 10, 2022, from 13:11 UTC to 13:52 UTC, Explore customers on US-hosted Pods experienced errors when trying to access and load data in their dashboards and queries, and were unable to retrieve any results during that period.
Timeline
13:34 UTC | 05:34 PT
We are currently experiencing a service interruption affecting Explore customers in the US region. An investigation is underway.
13:47 UTC | 05:47 PT
We are continuing to work on a fix for the issue causing the service interruption affecting Explore customers in US Pod-based regions. We’ll keep you updated as we find out more.
14:11 UTC | 06:11 PT
We have applied a fix and we’re seeing improvements in query data load. Explore customers in US Pods should start seeing processed data. Please refresh your browser. We appreciate your patience while we continue working toward full resolution.
14:38 UTC | 06:38 PT
We’re happy to report that the issues impacting Explore accounts based in US Pods have been resolved. Thank you for staying in touch!
POST-MORTEM
Root Cause Analysis
This incident was caused by the Explore engine container failing to start when deployed to production in US data centers. As a result, customers in the US region were unable to access Explore. The failure was triggered by a backend change: some updates had been made that were not reflected in the production environment.
Resolution
To fix this issue, our engineers reverted the change.
Remediations
1. Roll back the change
2. Add additional alerting
3. Remove the dependency that caused the issue
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.