Billy Macken
Joined Apr 14, 2021 · Last activity Feb 13, 2025
Following
0
Followers
2
Total activity
123
Votes
0
Subscriptions
87
ACTIVITY OVERVIEW
Latest activity by Billy Macken
Billy Macken created an article,
SUMMARY
February 13, 2025 07:56 PM UTC | February 13, 2025 11:56 AM PT
We are happy to report that the issue preventing new Guide article updates from being published in pod 13 has been resolved, and Guide article updates are correctly being reflected in the Help Center. Thank you for your patience during our investigation.
February 13, 2025 07:46 PM UTC | February 13, 2025 11:46 AM PT
We have confirmed an issue preventing new Guide article updates from being published on pod 13. Our team is investigating and we will provide new information as soon as it's available.
February 13, 2025 07:31 PM UTC | February 13, 2025 11:31 AM PT
We are receiving reports of issues publishing changes to Guide articles on Pod 13. We will provide further updates shortly.
POST-MORTEM
TBD
FOR MORE INFORMATION
For current system status information about Zendesk and specific impacts to your account, visit our system status page. You can follow this article to be notified when our post-mortem report is published. If you have additional questions about this incident, contact Zendesk customer support.
Edited Feb 13, 2025 · Billy Macken
0
Followers
3
Votes
0
Comments
Billy Macken created an article,
SUMMARY
February 13, 2025 05:37 PM UTC | February 13, 2025 09:37 AM PT
We are happy to report that we have resolved the issue affecting some AI Agents causing technical errors, and as such will be restoring the Z2 Widget to contact Zendesk Support. Thank you for your patience during our investigation.
February 13, 2025 04:14 PM UTC | February 13, 2025 08:14 AM PT
We are aware of an issue affecting a subset of AI Agents, including those on our Z2 Widget to contact Zendesk Support (https://status.openai.com/). While we investigate, we will be failing over to the web form for any requests to Zendesk Support.
POST-MORTEM
TBD
Edited Feb 13, 2025 · Billy Macken
1
Follower
2
Votes
0
Comments
Billy Macken created an article,
SUMMARY
On January 31, 2025, from 13:15 UTC to 17:41 UTC, some agents and end-users appeared as suspended in the UI when they should not have been. These users were not actually suspended, and access was fully restored after we rolled back an update and customers cleared their browser cache.
TIMELINE
January 31, 2025 06:15 PM UTC | January 31, 2025 10:15 AM PT
We are happy to report that we have resolved the issue causing some users to be suspended unexpectedly. Thank you for your patience during our investigation.
January 31, 2025 05:52 PM UTC | January 31, 2025 09:52 AM PT
We have rolled back an update which we believe is responsible for the unexpected agent and end-user suspensions, and after clearing cache and refreshing your browser, user access should be restored at this time. Please let us know if you continue to experience any issues.
January 31, 2025 05:48 PM UTC | January 31, 2025 09:48 AM PT
We are still investigating the issue causing unexpected user suspensions across multiple pods. We will provide additional updates in the next hour or when we have new information to share.
January 31, 2025 05:23 PM UTC | January 31, 2025 09:23 AM PT
Our team continues to investigate the issue causing unexpected agent and end-user suspensions across multiple pods. We will post additional updates within the next 30 minutes or as we learn more.
January 31, 2025 04:59 PM UTC | January 31, 2025 08:59 AM PT
We have confirmed an issue causing unexpected agent and end-user suspensions across multiple pods, and our team is investigating. Further updates will be posted within the next 30 minutes.
January 31, 2025 04:46 PM UTC | January 31, 2025 08:46 AM PT
We are receiving reports of unexpected agent and end-user suspensions, and our team is investigating. More information will be posted shortly.
POST-MORTEM
Root Cause Analysis
This incident was caused by a deployment that replaced an implementation on User Settings. The new implementation failed to correctly handle NULL values, leading the system to incorrectly identify users as suspended. The previous system converted NULL to an empty string, which was recognized as a false value, while the new implementation did not perform this conversion, resulting in a true return for suspended status when it should have been false.
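The NULL-handling difference described above can be illustrated with a minimal sketch. The actual Zendesk implementation is not disclosed in this report, so every name below (`is_suspended_legacy`, `is_suspended_regressed`, `is_suspended_fixed`) and the string encoding of the setting are assumptions made purely for illustration.

```python
from typing import Optional


def is_suspended_legacy(raw: Optional[str]) -> bool:
    """Hypothetical legacy behavior: NULL is converted to an empty
    string, and an empty string is treated as false."""
    normalized = "" if raw is None else raw
    return normalized.strip().lower() == "true"


def is_suspended_regressed(raw: Optional[str]) -> bool:
    """Hypothetical regression: anything that is not an explicit
    false value counts as suspended, so NULL (None) returns True."""
    return raw not in ("false", "")


def is_suspended_fixed(raw: Optional[str]) -> bool:
    """One possible fix (an assumption): handle NULL explicitly
    before interpreting the stored value."""
    if raw is None:
        return False
    return raw.strip().lower() == "true"


if __name__ == "__main__":
    for value in (None, "", "false", "true"):
        print(
            repr(value),
            is_suspended_legacy(value),
            is_suspended_regressed(value),
            is_suspended_fixed(value),
        )
```

In this sketch the legacy and fixed functions agree on all four inputs, while the regressed version differs only for `None`, which is the kind of case a unit test covering NULL values would be expected to catch.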
Resolution
To resolve the issue, the team initiated a rollback of the problematic deployment. This action restored normal functionality, allowing users to regain access to their accounts. Additionally, a browser cache refresh was conducted to ensure the fix was effective.
Remediation Items
- Data Migration: Conduct a migration to correct NULL values in the suspended settings to prevent similar issues in the future.
- Deployment Context: Ensure that future deployment attempts include the proper context and learnings from this incident.
- Unit Test Review: Add unit tests to cover scenarios involving NULL values to ensure robust validation in future changes.
- Data Coverage Check: Review data to ensure all instance values are adequately covered and accounted for.
Edited Feb 12, 2025 · Billy Macken
0
Followers
4
Votes
1
Comment
Billy Macken created an article,
Zendesk will perform critical maintenance which will impact performance for customers on Pods 13, 20, and 25 on February 10-11, 2025.
Affected products: Support, Talk, & Guide
| Date | Pod | Start Time | End Time |
|---|---|---|---|
| February 10, 2025 | 25 | 18:00 UTC / 10:00 PST | 18:30 UTC / 10:30 PST |
| February 11, 2025 | 20 | 02:00 UTC / 18:00 PST (Feb 10) | 02:30 UTC / 18:30 PST (Feb 10) |
| February 11, 2025 | 13 | 02:00 UTC / 18:00 PST (Feb 10) | 02:30 UTC / 18:30 PST (Feb 10) |
Expected behavior: The Support and Guide interfaces may be unreachable for 30 seconds or less, and your agents may experience server error screens, sluggish response times, dropped Talk calls, and issues bulk updating tickets or refreshing ticket views.
Please note that backend processes will still occur without issue, so you can expect email processes, API requests, and other such requests to function properly during these maintenance windows.
Why we're doing this: The Zendesk Relational Storage team is making configuration changes to update the database infrastructure.
Edited Jan 15, 2025 · Billy Macken
1
Follower
2
Votes
0
Comments
Billy Macken created an article,
Summary
On December 11, 2024 from 18:06 UTC to 22:57 UTC, Sunshine Conversations customers experienced message delivery issues in WhatsApp, Instagram and Messenger communication channels.
Timeline
December 11, 2024 06:15 PM UTC | December 11, 2024 10:14 AM PT
We are aware of a partner provider outage affecting WhatsApp, Messenger, and Instagram. We are working with the provider to restore services and will provide further updates soon.
December 11, 2024 06:35 PM UTC | December 11, 2024 10:35 AM PT
Our team continues their work with our partner provider to restore service to WhatsApp, Messenger, and Instagram. We will provide another update in one hour or when we have new information to share.
December 11, 2024 07:37 PM UTC | December 11, 2024 11:37 AM PT
We are still working with our partner provider to restore service to WhatsApp, Messenger, and Instagram channels. We will provide additional updates when we have new information to share.
December 11, 2024 11:01 PM UTC | December 11, 2024 3:01 PM PT
We are beginning to see some recovery from the provider outage affecting WhatsApp, Messenger, and Instagram. Our partner provider is still completing their work, and as such we will continue to monitor our systems to ensure that we have fully recovered. Please let us know if you continue to experience any issues.
December 12, 2024 02:00 AM UTC | December 11, 2024 6:00 PM PT
We are happy to report that the provider outage affecting WhatsApp, Messenger, and Instagram has been resolved. Thanks for your patience while we worked through today's issue.
Root Cause Analysis
This incident was caused by an issue with Meta's Cloud API services impacting Facebook, Messenger, WhatsApp, and Instagram. The exact root cause was not disclosed by Meta, but they acknowledged the disruption and worked on restoring services. During this time, Sunshine Conversations' monitoring systems detected a high occurrence of errors when attempting to send messages through the affected channels.
Resolution
The resolution of the outage was entirely dependent on Meta's recovery efforts. Sunshine Conversations engineers monitored the situation closely and the Zendesk incident team provided updates to customers as they became available. Once Meta announced that services were restored, we verified that outbound traffic resumed to normal levels and that messages were being successfully delivered through the social messaging channels.
Remediation Items
- Review and improve Meta edge case monitoring.
- Update Meta escalation procedures.
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, contact Zendesk customer support.
Edited Dec 19, 2024 · Billy Macken
0
Followers
3
Votes
1
Comment
Billy Macken commented,
Postmortem published December 5, 2024.
View comment · Posted Dec 05, 2024 · Billy Macken
0
Followers
0
Votes
0
Comments
Billy Macken created an article,
SUMMARY
December 3, 2024 04:24 PM UTC | December 3, 2024 8:24 AM PT
Between 15:09 and 15:37 UTC today, December 3, 2024, Explore customers in the EU experienced 5xx-level errors and access issues when reaching dashboards and reports. These issues have been resolved and access has been restored.
POST-MORTEM
Root Cause Analysis
The incident was caused by a mistake in a technical command during a system update. This error led to a misdirection of traffic, causing issues for our European customers trying to access their Explore dashboards.
Resolution
Our technical team quickly corrected the issue by redirecting the traffic to the appropriate system, which restored access to the Explore dashboards for all affected customers.
Remediation Items
- Review Procedures: We will re-evaluate our operational procedures to ensure that all technical settings are correct before making changes.
- Improve Technical Tools: Enhancements will be made to our traffic management tools to prevent similar errors in the future.
- Reduce Single Points of Failure: We are investigating ways to ensure that our systems are more resilient and do not rely on a single component, improving overall reliability.
- Strengthen Review Processes: We will implement more thorough checks for our technical procedures to catch potential issues before they affect our customers.
- Enhance Service Reliability: We will review our service performance standards to ensure we consistently meet high expectations for uptime and reliability.
Edited Dec 09, 2024 · Billy Macken
0
Followers
2
Votes
1
Comment
Billy Macken commented,
Postmortem published November 13, 2024.
View comment · Posted Nov 13, 2024 · Billy Macken
0
Followers
0
Votes
0
Comments
Billy Macken created an article,
Summary
On October 28, 2024 from 17:37 UTC to 19:19 UTC, a small subset of Zendesk Explore customers on Pods 13, 15, 19, 20, 23, 25, 26 and 27 experienced various errors in the product when performing tasks such as generating reports and loading dashboards.
Timeline
October 28, 2024 06:51 PM UTC | October 28, 2024 11:51 AM PT
We are receiving reports of issues loading reports and dashboards in Explore across multiple pods and our team is investigating. More updates to follow shortly.
October 28, 2024 07:02 PM UTC | October 28, 2024 12:02 PM PT
We have confirmed an issue affecting Explore customers causing 502 errors and latency when attempting to load default and custom dashboards and reports. Our team is investigating and we will provide further updates within the next 30 minutes.
October 28, 2024 07:21 PM UTC | October 28, 2024 12:21 PM PT
We have rolled back a recent update and are beginning to see improvement in the issue affecting Explore customers, causing 502 errors and latency when attempting to load dashboards and reports. We will continue to monitor until full recovery. Please let us know if you continue to experience any issues.
October 28, 2024 07:38 PM UTC | October 28, 2024 12:38 PM PT
We are happy to report that the issue affecting Explore dashboard and report loading has been resolved. Thank you for your patience during our investigation.
Root Cause Analysis
This incident was caused by a network configuration error that resulted in connectivity timeouts between network infrastructure components. This led to customers receiving HTTP request errors in the Explore product.
Resolution
To fix this issue, our team rolled back the network configuration changes on the affected nodes, which restored the service.
Remediation Items
The following work items have been scheduled:
- Update Explore startup / readiness probes to prevent rollouts in some scenarios.
- Investigate automating some elements of the affected network configuration paths.
- Review monitoring and alerting.
- Update infrastructure change runbooks.
Edited Nov 14, 2024 · Billy Macken
1
Follower
3
Votes
1
Comment
Billy Macken created an article,
SUMMARY
On October 14, 2024, from 13:49 UTC to 15:40 UTC, customers using Explore in the AMER region experienced "download failed" errors when attempting to export or schedule dashboards and reports.
TIMELINE
October 14, 2024 04:17 PM UTC | October 14, 2024 09:17 AM PT
We are happy to report that we have resolved the issue affecting Explore customers in the Americas, causing "download failed" errors when attempting to export or schedule dashboards and reports. Thank you for your patience during our investigation.
October 14, 2024 04:01 PM UTC | October 14, 2024 09:01 AM PT
We have found a root cause for the issue affecting US Explore customers causing "download failed" errors when attempting to download or schedule dashboards or reports; however, there is a backlog of requests that need to be processed and some delays may be experienced. We will monitor to ensure full resolution. Please let us know if you continue to experience any issues.
October 14, 2024 03:40 PM UTC | October 14, 2024 08:40 AM PT
We have confirmed an issue affecting US Explore customers causing "download failed" errors when attempting to download or schedule dashboards or reports. Our team is investigating and we will post further updates in the next 30 minutes.
October 14, 2024 03:26 PM UTC | October 14, 2024 08:26 AM PT
We are receiving reports of "download failed" errors for US Explore customers when attempting to download or schedule dashboards or reports. We will post additional information shortly.
POST-MORTEM
Root Cause Analysis
This incident was caused by the inadvertent deletion of a secret which was needed for services to authenticate within Explore. The deletion occurred during the cleanup process of Explore resources, where it was mistakenly assumed that the secret was no longer needed since it was available in a new version of the service.
Resolution
To fix this issue, the missing secret was recreated, allowing the service to start successfully again. This involved manual intervention to reapply the secret definitions through the codebase, ensuring that all necessary components were functioning as intended.
Remediation Items
- Increase the required number of reviewers to two on the relevant repository to enhance oversight on changes.
- Document the process for validating whether a secret on our previous version is still in use by other services.
- Develop a documented process for validating risky infrastructure changes using the staging environment and end-to-end tests.
- Establish guidelines for rolling out risky infrastructure changes to production, including appropriate soaking time.
- Investigate and address memory issues related to the Explore services to prevent future occurrences of similar incidents.
Edited Oct 30, 2024 · Billy Macken
0
Followers
2
Votes
1
Comment