On August 13, 2019, from 00:52 UTC to 03:25 UTC, apps making secure requests through the Zendesk Proxy may have timed out with an HTTP 524 error. Our team has rolled back the offending deploy to fix the issue. A post-mortem will be posted here when ready. Thanks for your patience while we resolved this issue.
On August 13, 2019 at 00:52 UTC, a change to Zendesk Apps Proxy was rolled out to production. The change was intended to improve our observability into the health and behavior of Zendesk Apps Proxy by sending additional instrumentation data to our monitoring system.
At 01:17 UTC, a customer reported that their users were unable to use their Zendesk App. The API calls made from the app via Zendesk Apps Proxy were failing.
The Zendesk Apps team was notified and started investigating at 03:01 UTC. We rolled back the deployment at 03:25 UTC, which resolved the issue.
The root cause was identified after the rollback. The new instrumentation library added extra headers, in an unexpected format, to app requests going through Zendesk Apps Proxy. These headers caused a runtime exception, specifically for app requests that use secure app settings.
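As a generic illustration of this failure class, and not the actual Zendesk Apps Proxy code (which is not published here), the following minimal Python sketch shows one way a proxy could check headers before forwarding a request, so that an unexpectedly formatted header is dropped and reported instead of raising at runtime. The function name `sanitize_headers` and the validation rules are assumptions made for this example.

```python
import re

# Characters permitted in an HTTP header field name (the "token" rule in RFC 7230).
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def sanitize_headers(headers):
    """Hypothetical helper: split headers into those safe to forward and those rejected.

    Returns (clean, dropped), where clean is a dict of well-formed headers and
    dropped is a list of the header names that were rejected.
    """
    clean, dropped = {}, []
    for name, value in headers.items():
        ok = (
            isinstance(name, str)
            and isinstance(value, str)
            and _TOKEN.fullmatch(name) is not None
            # Reject embedded line breaks, which would corrupt the request.
            and "\r" not in value
            and "\n" not in value
        )
        if ok:
            clean[name] = value
        else:
            dropped.append(str(name))
    return clean, dropped
```

In a sketch like this, the caller would forward `clean` and log `dropped`, so a malformed header added by a new library shows up as a warning in monitoring rather than as a failed app request.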
- Ensure that extensive testing and the full Quality Assurance process for Zendesk Apps Proxy are completed before reintroducing the change to production
- Update the documentation for our testing and deployment processes, and improve tooling to enforce these processes as part of our engineering workflow
- Improve our monitoring system and alarm thresholds to give us earlier notification
- Ensure that a change management process is in place to provide additional review of future deployments
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.