On December 16, 2019, from 20:43 UTC to 21:45 UTC, customers using Zendesk Guide on Pod 18 may have seen server errors in Help Center. Around 15% of total Help Center traffic experienced server errors during this time.
22:28 UTC | 14:28 PT
We're happy to report that the issue affecting end-users' ability to access Guide on some Pod 18 accounts has been resolved. Our team will continue to monitor as they remediate the root cause.
21:57 UTC | 13:57 PT
We're still working to resolve the issues affecting access to Guide articles for some customers. More details as our investigation continues.
20:55 UTC | 12:55 PT
We're currently investigating reports of some users on Pod 18 accounts unable to access Guide.
Root Cause Analysis
This incident was caused by a configuration error that set the Nginx logs to rotate daily rather than hourly. With the less frequent rotation, the logs grew until the partition volumes became full. At that point, Nginx on the Guide Proxies was unable to write temporary files to disk before sending the response back to the client, which caused the 502 Bad Gateway server errors.
To fix this issue, Nginx logs were manually rotated and the capacity of the partition volume was increased.
- Created monitors for the new host group
- Improve proxy host configuration to ensure sufficient capacity
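As an illustration of the kind of capacity monitor described in the items above, the following is a minimal sketch of a disk-usage check for a log partition. It is not Zendesk's actual monitoring code. The function names, the example path, and the 80% and 90% thresholds are all assumptions chosen for the example.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the volume containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify(percent_used: float, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Map a usage percentage to an alert level (thresholds are hypothetical)."""
    if percent_used >= crit_at:
        return "CRITICAL"
    if percent_used >= warn_at:
        return "WARNING"
    return "OK"


def check_volume(path: str, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Check the volume containing `path` and return its alert level."""
    return classify(disk_usage_percent(path), warn_at, crit_at)


if __name__ == "__main__":
    # "/var/log/nginx" is an assumed log location; adjust for the real host.
    log_path = "/var/log/nginx"
    try:
        print(f"{log_path}: {check_volume(log_path)}")
    except FileNotFoundError:
        print(f"{log_path}: path not found on this host")
```

A check like this, run on a schedule and wired to alerting, would flag a filling partition before Nginx loses the ability to write temporary files.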
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.