SUMMARY
From August 30, 2024 12:16 PM UTC to September 2, 2024 09:15 AM UTC, customers using Zendesk Talk for SMS and calls encountered an issue where newly created SMS end users were displayed with an invalid phone number because an additional "1" was erroneously appended to the end of the number. As a result, agents were unable to send messages or initiate calls to these end users.
Timeline
September 04, 2024 01:05 PM UTC | September 04, 2024 06:05 AM PT
Between 12:16 PM UTC on August 30th and 09:15 AM UTC on September 2nd, customers using Talk and Support reported an extra “1” being added to end-user phone numbers. This issue impacted their ability to make calls and required manual corrections. Any customer who received SMS messages in the last five days would have been affected. On Monday, we identified the root cause and rectified the issue. Unfortunately, the previously deployed fix was accidentally rolled back yesterday. Our engineers have since re-deployed the fix, and the issue is now fully resolved. User identities created while the fix is active do not have the rogue “1” appended, and with the temporary rollback corrected, no new user identities are being created with the erroneous “1”.
POST-MORTEM
Root Cause Analysis
This incident was an unintended consequence of a cleanup pull request (PR) that reactivated an old rollout process from 2018. This process automatically updated new SMS end users' phone numbers, adding an "sms_capability" attribute with a value of "1". The extension model fetched this information, resulting in an erroneous "1" being appended to the phone numbers of new SMS end users.
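The failure mode described above can be illustrated with a minimal sketch. It is not Zendesk's code: the `PhoneAttribute` class, the `"extension"` attribute name, and both functions are hypothetical, and the sketch simply assumes that the displayed number is built by appending an attribute value to the stored number.

```python
from dataclasses import dataclass


@dataclass
class PhoneAttribute:
    # Hypothetical stand-in for a row of user phone attributes.
    name: str
    value: str


def display_number_unfiltered(number: str, attributes: list[PhoneAttribute]) -> str:
    """Append the first attribute value found, whatever its name (assumed bug)."""
    for attr in attributes:
        return number + attr.value
    return number


def display_number_filtered(number: str, attributes: list[PhoneAttribute]) -> str:
    """Append a value only when the attribute really is an extension."""
    for attr in attributes:
        if attr.name == "extension":  # hypothetical attribute name
            return number + attr.value
    return number


# The reactivated rollout process adds this attribute to new SMS end users.
attrs = [PhoneAttribute(name="sms_capability", value="1")]
unfiltered = display_number_unfiltered("+15550100123", attrs)
filtered = display_number_filtered("+15550100123", attrs)
```

Under these assumptions, the unfiltered lookup yields a number with a trailing "1", while the filtered lookup leaves the number unchanged.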
Resolution
To resolve this issue, the problematic code responsible for adding the "1" to new SMS end users' phone numbers was identified and disabled, ensuring that the sms_capability attribute is no longer incorrectly applied to phone numbers.
Remediation Items
- Fix the User Phone Extension model so that it picks only the appropriate records when adding phone number extensions to end-user numbers. [Done]
- Eliminate unused phone identity update code from the SMS system, as it was discovered to be outdated and malfunctioning. [To do]
- Clean up any existing user_phone_attributes that were incorrectly created with the sms_capability attribute. [To do]
- Implement an automated test to ensure that sending a message to a Zendesk Text number results in a correctly formatted ticket and a newly created end user. This test should simulate calling the end user to ensure their phone number is valid and does not have an erroneous digit appended. [To do]
- Develop a more sophisticated alerting system that can differentiate between truly invalid numbers and false positives to minimize unnecessary alerts while still catching legitimate issues. [To do]
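As an illustration of the kind of check the automated test and alerting items above could include, the sketch below compares the number stored on a newly created end user with the number the inbound SMS came from. The function name and the result labels are hypothetical, and the format check is a simplified E.164 pattern (a leading "+", then up to 15 digits, the first of them non-zero), not a full validity check.

```python
import re

# Simplified E.164 shape: "+", a non-zero digit, then 1 to 14 more digits.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def check_stored_number(sender: str, stored: str) -> str:
    """Classify the number stored on a new end user against the SMS sender."""
    if stored == sender:
        return "ok"
    # The stored number is the sender's number plus trailing digits,
    # which is the pattern seen in this incident.
    if stored.startswith(sender) and stored[len(sender):].isdigit():
        return "extra_digits"
    if not E164.match(stored):
        return "invalid_format"
    return "mismatch"


result_ok = check_stored_number("+15550100123", "+15550100123")
result_extra = check_stored_number("+15550100123", "+155501001231")
result_invalid = check_stored_number("+15550100123", "5550100")
```

Comparing against the sender's number matters here because a number with one extra digit can still match the E.164 shape, so a format-only check would not have caught it.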
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, contact Zendesk customer support.
Jessica G.
Post-mortem published September 9, 2024.