SUMMARY
On May 26th, 2021, from 09:13 UTC to 09:40 UTC, customers using Sell experienced 503 console errors and UI messages such as “Sorry. Something went wrong.” and “This page isn’t working” when trying to log in.
Timeline
10:08 UTC | 03:08 PT
Sell customers experienced elevated 500 error rates when trying to access the platform from 09:13 to 09:40 UTC today. A fix has been deployed and service is back to normal.
POST-MORTEM
Root Cause Analysis
This incident was caused by a deployment that did not update certain local network data points, because they were not included in the library routing configuration. One particular local network for private traffic was not being added or updated in every deployment. As a result, new application workers were not added and existing ones were not replaced.
Resolution
To fix this issue, the local network data points were correctly included in the library local network list. The application workers were then added and replaced as expected, and customers stopped seeing errors.
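A gap of this kind could in principle be caught by a pre-deployment check that compares the networks a deployment relies on against the library local network list. The following is a minimal sketch of that idea and not the actual Sell tooling: the function names `find_missing_networks` and `check_deployment`, and the representation of networks as plain string identifiers, are assumptions made for illustration.

```python
def find_missing_networks(deployed_networks, routing_list):
    """Return networks used by a deployment that the routing list lacks.

    Both arguments are iterables of network identifiers (hypothetical
    plain strings here; the real data format is not described).
    """
    return sorted(set(deployed_networks) - set(routing_list))


def check_deployment(deployed_networks, routing_list):
    """Raise before rollout if any network is missing from the routing list."""
    missing = find_missing_networks(deployed_networks, routing_list)
    if missing:
        raise ValueError(f"networks missing from routing list: {missing}")


# Hypothetical example: one private network was never added to the list.
deployed = ["public-a", "private-b"]
routing = ["public-a"]
print(find_missing_networks(deployed, routing))
```

Failing the deployment on a non-empty result, rather than logging a warning, is one possible design choice; it trades deployment convenience for an earlier signal.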
Remediation Items
- Define error budget and SLO for Sell Routing.
- Simplify the process of including the information in the library local network list, and remove its manual steps.
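To illustrate what an error budget for Sell Routing might look like, the sketch below computes the allowed downtime for an availability SLO and how much of it an outage of this length would consume. The 99.9% target and the 30-day window are hypothetical values chosen for the example, not figures from this report; only the 27-minute duration (09:13 to 09:40 UTC) comes from the incident.

```python
def error_budget_minutes(slo_target, window_days):
    """Minutes of allowed unavailability in the window for a given SLO."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1.0 - slo_target)


def budget_consumed(outage_minutes, slo_target, window_days):
    """Fraction of the error budget used up by a single outage."""
    return outage_minutes / error_budget_minutes(slo_target, window_days)


# Hypothetical SLO: 99.9% availability over a 30-day window.
budget = error_budget_minutes(0.999, 30)
consumed = budget_consumed(27, 0.999, 30)
print(f"budget: {budget:.1f} min, consumed: {consumed:.1%}")
```

Under these assumed values the budget is about 43.2 minutes, so a 27-minute outage would use roughly 62.5% of it.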
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.