Summary

On July 21, 2025, from 12:10 UTC to 13:48 UTC, customers in Pod 18 experienced delays and interruptions in both inbound and outbound email processing.

Timeline

July 21, 2025 16:20 UTC | 09:20 AM PT

We are pleased to inform you that the backlog of delayed emails has been processed. Thank you for your patience and understanding during this incident.

Root Cause Analysis

The incident occurred when a trial account exceeded user limits, creating an unusually high volume of emails that overwhelmed the email servers and temporarily halted processing. A delay in alert notifications also contributed to a slower response.

Resolution

The customer’s trial account was immediately suspended to stop the creation of new emails. The technical team then increased the memory capacity on the email processing servers to handle the backed-up queue, allowing the system to clear the email backlog and resume normal service.

Remediation Items

  1. Implement restrictions to prevent trial customers from exceeding user seat limits via bulk user imports.

  2. Introduce limits on email queue size, particularly on outbound emails, to prevent any one cluster from being overwhelmed.

  3. Review and improve alerting configurations to ensure critical notifications reach the on-call team without fail.

  4. Rebalance memory allocations for email processing servers to accommodate sudden spikes in demand.

  5. Investigate and develop auto-scaling capabilities for the email service infrastructure to dynamically adjust resources during high load.
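
As an illustration of items 1 and 2 only, the following is a minimal sketch of how a seat-limit check on bulk imports and a cap on outbound queue depth might look. It is not Zendesk's implementation: the class names (Account, OutboundQueue), the per-account scope of the queue cap, and all limit values are hypothetical and chosen purely for the example.

```python
from collections import deque
from dataclasses import dataclass, field


class SeatLimitExceeded(Exception):
    """Raised when a bulk import would push an account past its seat limit."""


class QueueFull(Exception):
    """Raised when an account's outbound queue is already at capacity."""


@dataclass
class Account:
    name: str
    seat_limit: int                      # hypothetical per-plan limit
    users: set = field(default_factory=set)

    def bulk_import(self, new_users):
        # Validate the whole batch before adding anyone, so a rejected
        # import leaves the account unchanged.
        combined = self.users | set(new_users)
        if len(combined) > self.seat_limit:
            raise SeatLimitExceeded(
                f"{self.name}: {len(combined)} users exceeds limit {self.seat_limit}"
            )
        self.users = combined


class OutboundQueue:
    """Outbound email queue with a hypothetical per-account depth cap."""

    def __init__(self, max_per_account):
        self.max_per_account = max_per_account
        self._queues = {}

    def enqueue(self, account_name, message):
        q = self._queues.setdefault(account_name, deque())
        # Reject new mail instead of letting one account grow without bound.
        if len(q) >= self.max_per_account:
            raise QueueFull(account_name)
        q.append(message)

    def depth(self, account_name):
        return len(self._queues.get(account_name, ()))
```

In this sketch, a rejected import or enqueue raises an exception that the caller can surface to the customer or log for the on-call team. A production system would more likely defer or throttle messages than drop them, and might enforce the cap per cluster rather than per account.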

For More Information

For current system status information about Zendesk and specific impacts to your account, visit our system status page. You can follow this article to be notified when our post-mortem report is published. If you have additional questions about this incident, contact Zendesk customer support.
