Zendesk provides business-critical functions for our customers. When these services are interrupted, causing disruptions known as service incidents, Zendesk initiates an investigation and remediation process.
This process includes detection, reporting, analysis, and mitigation of incidents, as well as documentation and remediation steps to ensure that we learn from them. Zendesk seeks to restore the full function of services quickly and thoroughly to provide a trusted and reliable experience for customers.
The service incident management process has four main goals:
- Restore normal operations of Zendesk services as quickly as possible
- Give customers meaningful information during an incident to help mitigate impact where possible, along with updates on remediation status
- Once service is restored, perform a detailed root cause analysis, identify permanent fixes, and share the analysis with customers to maintain trust in Zendesk services
- Share lessons learned across engineering teams and track incident causes and remediations
This guide describes the Zendesk incident management process in the following parts: