SUMMARY
On August 11, 2022, from 10:43 UTC to 12:21 UTC, a subset of customers serving chats in Agent Workspace were presented with “Chat disconnected” and “Chat Server Error” growl notifications in the UI while trying to go online or serve incoming chats.
Timeline
11:45 UTC | 04:45 PT
We are investigating reports of Chat disconnections and server error messages affecting a subset of customers in South America (more specifically, Brazil). More information to follow.
12:03 UTC | 05:03 PT
We continue to investigate Chat disconnections and server error messages affecting a subset of Agent Workspace customers in South America. Please make sure you refresh your browser and, if needed, are connected to a VPN. We appreciate your patience.
12:22 UTC | 05:22 PT
We have received a few reports from our Agent Workspace customers across multiple Pods, not only in South America, confirming they no longer see Chat disconnections or server error messages. More information will follow in the next 30 minutes or when we have additional details.
12:51 UTC | 05:51 PT
Our team has seen the errors related to Chat disconnections subsiding on the backend for Agent Workspace customers across multiple Pods. Please make sure you update and refresh your browser, and let us know if you have any further issues.
13:44 UTC | 06:44 PT
We are happy to confirm the issues with Chat disconnection and the server error messages across multiple Pods have been resolved. We appreciate your patience while we worked through this.
POST-MORTEM
Root Cause Analysis
This incident appears to have been caused by a region-specific internet service provider (ISP) outage, as evidenced by network logs and graphs showing timeouts and disconnections mostly for customers in Brazil. No issue was found with Zendesk or its vendors, and the incident resolved on its own.
Resolution
No action was taken by Zendesk to resolve this issue.
Remediation Items
- Ensure monitoring alerts are set up to notify the responsible team of a surge in disconnections. [To Do]
- Adjust the wording for the error message in the UI to better reflect the cause of the error. [To Do]
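As an illustration of the first remediation item, a surge in disconnections can be detected by counting events in a sliding time window and alerting when the count crosses a threshold. The sketch below is a minimal example under stated assumptions, not a description of Zendesk's monitoring: the class name, the window length, and the threshold are all hypothetical.

```python
from collections import deque


class DisconnectSurgeDetector:
    """Hypothetical detector: flags a surge when the number of disconnects
    seen within a sliding time window exceeds a fixed threshold."""

    def __init__(self, window_seconds: float = 300.0, threshold: int = 100):
        self.window_seconds = window_seconds  # assumed window length
        self.threshold = threshold            # assumed alert threshold
        self._events = deque()                # timestamps of recent disconnects

    def record(self, timestamp: float) -> bool:
        """Record one disconnect at `timestamp` (seconds, non-decreasing).

        Returns True if the count inside the window now exceeds the threshold.
        """
        self._events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.threshold


# Usage: with a 60-second window and a threshold of 3,
# the fourth disconnect inside the window triggers the alert.
detector = DisconnectSurgeDetector(window_seconds=60, threshold=3)
alerts = [detector.record(t) for t in (0, 10, 20, 30)]
```

A fixed threshold is the simplest choice. In practice the threshold would more likely be derived from a baseline disconnect rate, possibly per region, so that a localized outage like this one stands out.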
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.