SUMMARY
On March 26, 2024, from 10:54 UTC to 13:00 UTC, a significant number of Tymeshift customers across multiple Pods experienced difficulties logging in to their accounts. The issue manifested as either access errors or a failure to load the login page.
Timeline
12:15 UTC | 05:15 PT
We are aware of an issue that is preventing Tymeshift customers from logging in. Investigation is underway and we will provide another update shortly.
12:33 UTC | 05:33 PT
We are continuing to investigate why customers are unable to log in to Tymeshift. We will provide more information in 30 minutes.
13:00 UTC | 06:00 PT
We are currently testing a possible fix, a revert of a recent change, for the issue affecting Tymeshift customers' ability to log in across multiple Pods. We will provide more information in 30 minutes.
13:07 UTC | 06:07 PT
We have now rolled out a fix to all Pods to resolve the login issue for Tymeshift customers. Please try logging in to your account as usual and let us know if you encounter any further problems.
13:25 UTC | 06:25 PT
We no longer observe login errors in the backend, and customers have confirmed that they can successfully access their accounts. With this, we consider the incident resolved. Thank you.
POST-MORTEM
Root Cause Analysis
The incident was caused by an incorrect configuration setting following a recent update to our sign-in process. A discrepancy in the naming conventions used within our system led to the use of an outdated method for verifying user identities, which was not compatible with our current system. This resulted in users facing errors when attempting to log in, as the system failed to recognize their credentials.
Resolution
To fix this issue, the incorrect configuration was reverted to a previous version that did not include the problematic change. This rollback restored the system's ability to authenticate users correctly.
Remediation Items
- Improve our existing implementation tooling and establish a clear, step-by-step plan for updating the login process, so that all parts of the system remain compatible with each other.
- Create additional smoke tests.
- Ensure that updates are promoted from our test environment to production without introducing errors.
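
As an illustration only, a smoke test of the kind listed above could check a sign-in configuration before it is promoted. The sketch below is hypothetical: the key name `auth_method`, the method names `token_v2` and `token_v1`, and both functions are assumptions made for the example and do not describe the actual Tymeshift configuration or code. It shows one way a naming mismatch could be caught up front instead of silently falling back to an outdated verification method.

```python
# Hypothetical smoke test for a sign-in configuration.
# All key names and method names below are illustrative assumptions.

# Assumed set of identity-verification methods the current system accepts.
SUPPORTED_AUTH_METHODS = {"token_v2"}


def resolve_auth_method(config: dict) -> str:
    """Return the configured sign-in method.

    Fails loudly when the expected key is absent, rather than falling
    back to a default, so a misnamed key cannot go unnoticed.
    """
    try:
        return config["auth_method"]
    except KeyError as exc:
        raise ValueError("config is missing required key 'auth_method'") from exc


def smoke_test_login_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config passes."""
    try:
        method = resolve_auth_method(config)
    except ValueError as exc:
        return [str(exc)]
    if method not in SUPPORTED_AUTH_METHODS:
        return [f"unsupported auth method: {method!r}"]
    return []
```

Run against a candidate configuration before rollout, a non-empty result would block the promotion. For example, a config that spells the key `authMethod` instead of `auth_method` is reported as missing the required key, and a config that selects `token_v1` is reported as unsupported.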
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.