SUMMARY
On September 13, 2022, we investigated continuing reports of missing emails related to the Service Incident on September 6, 2022. During the investigation, our engineers found an outstanding issue that caused a subset of additional emails to fail to create or update tickets from September 6 16:47 UTC to September 13 16:12 UTC. Once the issue was identified, a fix was applied and the remaining unprocessed emails were processed during a 30-minute timeframe between Sep 13 16:00-16:30 UTC.
September 14 16:59 UTC | 09:59 PT
One final update. The unprocessed emails from September 6 16:47 UTC to September 13 16:12 UTC were processed yesterday during a 30-minute timeframe between Sep 13 16:00-16:30 UTC.
If you would like to find all email tickets that were created or updated during that timeframe, you can use the following search queries. Please note that the results will also include tickets that were processed on time, without delay, during that 30-minute timeframe.
created>2022-09-13T16:00:00Z created<2022-09-13T16:30:00Z via:mail
updated>2022-09-13T16:00:00Z updated<2022-09-13T16:30:00Z via:mail
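If you prefer to run these searches programmatically, the following is a minimal Python sketch that rebuilds the two query strings above and URL-encodes them for the Search API. The function names and the "example" subdomain are placeholders chosen for illustration, and the sketch only builds the request URLs; authentication and paging through results are not shown.

```python
from urllib.parse import urlencode

# Reprocessing window stated in the update above (UTC).
WINDOW_START = "2022-09-13T16:00:00Z"
WINDOW_END = "2022-09-13T16:30:00Z"


def build_query(field: str, start: str = WINDOW_START, end: str = WINDOW_END) -> str:
    """Return a search string for email tickets whose `field` timestamp is inside the window."""
    if field not in ("created", "updated"):
        raise ValueError("field must be 'created' or 'updated'")
    return f"{field}>{start} {field}<{end} via:mail"


def build_search_url(subdomain: str, query: str) -> str:
    """Return a Search API URL for `query`. `subdomain` is a placeholder for your own account."""
    base = f"https://{subdomain}.zendesk.com/api/v2/search.json"
    return base + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # "example" is a hypothetical subdomain; replace it with your own.
    for field in ("created", "updated"):
        print(build_search_url("example", build_query(field)))
```

As with the manual searches, the results will include email tickets that were processed normally during the same 30 minutes.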
September 13 21:41 UTC | 14:41 PT
Thank you for your patience as we took a closer look into reports of missing emails persisting after the incident on September 6, 2022. Our engineers found an outstanding issue which caused a subset of additional emails to fail to create tickets. We have put a fix in place to correct this behavior going forward, and emails missed due to this issue will create tickets as a result of the fix.
Today our email engineers also attempted to re-process the subset of unprocessed emails that failed to create or update tickets between September 6 16:47 UTC and September 13 16:12 UTC, which is why you may be seeing these email tickets and updates being created only now.
We apologize for the interruption and confusion this may be causing you and your team. Please let us know if you have any questions or concerns.
POST-MORTEM
Root Cause Analysis
A Uhaul package update caused emails to fail to create or update tickets. This was the result of a misconfiguration on the Uhaul work hosts.
Resolution
Once the issue was identified, the work hosts were rolled back to a previous version and the unprocessed emails were then processed.
Remediation Items
- Update Uhaul work hosts version
- Process unprocessed emails
- Improve email monitoring and alerting
- Update Uhaul playbook
- Additional smoke testing
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.