SUMMARY
On March 5, 2024, from 15:30 UTC to 18:25 UTC, Sunshine Conversations customers across all Pods using Social Messaging channels may have experienced a significant disruption in service due to an outage at Meta. The outage primarily affected WhatsApp, with substantial impact on Facebook Messenger and Instagram as well, and led to a partial interruption in outbound traffic through these channels.
Timeline
16:30 UTC | 08:30 PT
We have confirmed an issue impacting all social messaging channels powered by Meta, including Facebook Messenger, WhatsApp, and Instagram. We will provide another update when our partner provider has resolved the behavior.
17:51 UTC | 09:51 PT
We are beginning to see recovery from the issue impacting use of all social messaging channels powered by Meta. We will continue to work with our partner provider and monitor the situation until full resolution.
19:15 UTC | 11:15 PT
We are happy to report that the issue causing delays in incoming and outgoing messages on all social messaging channels powered by Meta has been resolved, and these channels are processing messages as expected at this time. Thank you for your patience as we worked with our partner provider.
POST-MORTEM
Root Cause Analysis
This incident was caused by an issue with Meta's services, specifically impacting the WhatsApp Cloud API. Meta did not disclose the exact root cause, but it acknowledged the disruption and worked to restore services. During this time, Sunshine Conversations' monitoring systems detected a high rate of errors when attempting to send messages through the affected channels.
Resolution
The resolution of the outage was entirely dependent on Meta's recovery efforts. Sunshine Conversations monitored the situation closely and provided updates to customers as they became available. Once Meta announced that services were restored, Sunshine Conversations verified that outbound traffic had returned to normal levels and that messages were being delivered successfully through the social messaging channels.
Remediation Items
- Improve monitoring for outbound traffic health.
- Review and update the incident response plan to include alternative communication channels with Meta in case their primary support portal is unavailable.
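As an illustration of the first remediation item, the following is a minimal sketch of a sliding-window error-rate check for outbound sends on a single channel. It is not Sunshine Conversations' actual monitoring; the class name, the parameters (window_seconds, threshold, min_samples), and their default values are all hypothetical and chosen only for the example.

```python
from collections import deque


class OutboundErrorRateMonitor:
    """Hypothetical sketch: tracks recent send outcomes for one channel
    over a sliding time window and flags a sustained high error rate."""

    def __init__(self, window_seconds=300.0, threshold=0.5, min_samples=20):
        # All defaults are illustrative assumptions, not real production values.
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_samples = min_samples
        self._events = deque()  # (timestamp, succeeded) pairs, oldest first
        self._failures = 0      # number of failed sends currently in the window

    def record(self, timestamp, succeeded):
        """Record the outcome of one outbound send attempt."""
        self._events.append((timestamp, succeeded))
        if not succeeded:
            self._failures += 1
        self._evict(timestamp)

    def _evict(self, now):
        """Drop events older than the window and keep the failure count in sync."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            _, succeeded = self._events.popleft()
            if not succeeded:
                self._failures -= 1

    def error_rate(self, now):
        """Fraction of failed sends within the window (0.0 if no traffic)."""
        self._evict(now)
        if not self._events:
            return 0.0
        return self._failures / len(self._events)

    def is_unhealthy(self, now):
        """True when there is enough traffic and the error rate exceeds the threshold."""
        self._evict(now)
        enough_traffic = len(self._events) >= self.min_samples
        return enough_traffic and self.error_rate(now) > self.threshold
```

A monitor like this would be kept per channel (for example, one each for WhatsApp, Messenger, and Instagram), so that a provider-side outage on one channel can be told apart from a platform-wide problem. The min_samples guard is there to avoid alerting on a handful of failures during low-traffic periods.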
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us via ZBot Messaging within the Widget.