Summary
From October 24, 2025, 15:12 UTC to October 27, 2025, 14:34 UTC, some customers in the EU and LATAM regions experienced difficulties accessing and signing into the AI Agents dashboard and admin pages. Users faced delays, error messages, and occasional permission issues that impacted their ability to use the service smoothly.
Timeline
October 24, 2025 15:39 UTC | October 24, 2025 08:39 AM PT
We have received reports of customers encountering 5XX errors when accessing the AI Agents core services. Our engineers have observed some recovery but continue to actively investigate the root cause. We apologize for the inconvenience and will provide updates as they become available.
October 24, 2025 16:54 UTC | October 24, 2025 09:54 AM PT
We are observing significant improvements in the 5XX errors affecting the AI Agents core services. Our team continues to monitor system activity to ensure full stability. We appreciate your patience as we work to resolve this issue promptly and will provide updates as they become available.
October 24, 2025 17:52 UTC | October 24, 2025 10:54 AM PT
We have continued to observe sustained stabilization in 5XX errors affecting the AI Agents core services. If you are seeing any additional issues, please reach out so we can investigate. Thank you for your patience during this disruption.
Root Cause Analysis
This incident was caused by an issue introduced during a recent update. A part of the system was triggered multiple times unnecessarily, which put extra strain on the servers and caused the AI Agents dashboard and admin pages to become slow and unresponsive. As a result, users experienced errors and delays.
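As an illustration only, one common way to keep the same action from firing repeatedly is to suppress repeat triggers for the same key within a short time window. The sketch below is a generic example of that idea and does not describe the actual change or fix. The class name `TriggerGuard`, its parameters, and the key values are hypothetical.

```python
import time
from typing import Callable, Dict, Hashable


class TriggerGuard:
    """Suppress repeat triggers for the same key within a time window.

    Hypothetical helper for illustration; not the actual system component.
    """

    def __init__(self, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the guard can be tested without sleeping
        self._last_run: Dict[Hashable, float] = {}

    def should_run(self, key: Hashable) -> bool:
        """Return True if the action for `key` may run now, False if it is a repeat."""
        now = self._clock()
        last = self._last_run.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # same key fired too recently: skip the redundant work
        self._last_run[key] = now
        return True
```

A caller would wrap the expensive step, for example `if guard.should_run(account_id): refresh(account_id)`, where `account_id` and `refresh` are placeholders. In a multi-server deployment the last-run record would need to live in shared storage rather than in process memory.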
Resolution
To fix this issue, the team temporarily increased system capacity to improve performance while investigating the cause. Once they identified the source of the problem, they reversed the recent change that led to it, which quickly eased the strain on the system and restored normal service.
Remediation Items
Put in place tools to quickly spot unusual increases in system usage and slow performance.
Set up notifications to alert the team when resource limits are being reached.
Implement a permanent fix to prevent the repeated actions that caused the problem.
Add tools to monitor and track messages and system activity for better oversight.
Review and improve how updates are scheduled and monitored, including considering different regions and customer groups.
Use a standard process for reviewing and approving changes that includes assessing risks and planning for quick fixes if needed.
Introduce additional alerts to catch rises in errors early and prevent similar issues.
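To make the alerting items above more concrete, the following is a minimal sketch of an error-rate alert over a sliding window of recent responses. It is an assumption about how such an alert could work, not a description of the monitoring actually deployed. The class name `ErrorRateAlert`, the window size, the threshold, and the minimum sample count are all hypothetical values.

```python
from collections import deque
from typing import Deque


class ErrorRateAlert:
    """Flag when the share of 5XX responses in a sliding window exceeds a threshold.

    Hypothetical example; all default values are illustrative.
    """

    def __init__(self, window_size: int = 100, threshold: float = 0.05,
                 min_samples: int = 20):
        self.threshold = threshold
        self.min_samples = min_samples
        # deque with maxlen drops the oldest entry automatically once full
        self._window: Deque[bool] = deque(maxlen=window_size)

    def record(self, status_code: int) -> bool:
        """Record one response and return True if the alert should fire."""
        self._window.append(500 <= status_code <= 599)
        if len(self._window) < self.min_samples:
            return False  # too little data to judge the error rate
        error_rate = sum(self._window) / len(self._window)
        return error_rate > self.threshold
```

Requiring a minimum number of samples avoids firing on a single failed request right after startup. A production alert would typically also be keyed by region or customer group so that a localized rise in errors is not averaged away.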
For More Information
For current system status information about Zendesk and specific impacts to your account, visit our system status page. You can follow this article to be notified when our post-mortem report is published. If you have additional questions about this incident, contact Zendesk customer support.