Issues Accessing Kustomer (Prod 2)
Incident Report for Kustomer
Postmortem

Summary

On October 14th, 2021, from 8:30AM to 8:40AM EDT, customers in the Prod2 Pod were unable to access the Kustomer platform due to a failed deployment. During this period the platform was inaccessible, and any inbound messages sent to affected customers were dropped.

Root Cause

A failed deployment of one of our core services, which handles all traffic originating outside our servers, left that service unavailable during this period. Because every request made from the web platform originates outside our servers, the platform was inaccessible.

Timeline

10/13 11:33AM - A deployment to the core service fails. A notification appears in Slack but is missed. Operations remain normal because the pre-existing healthy instances of the service continue to serve traffic.
10/13 - 10/14 - Overnight, the instances are scheduled for routine replacement. Prod2 Pod instances fail to be replaced because of the failed deployment.
10/14 8:30AM - Engineering team is alerted to Prod2 Pod core service errors.
10/14 8:30AM - 8:40AM - With no instances available to accept traffic, the platform is unusable. Our engineers identify the problem and roll back the change, after which the platform is fully available and functional.
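
The sequence above (a failed deployment that stays invisible until the healthy instances are recycled) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a description of Kustomer's actual infrastructure: the instance count, the one-instance-per-step recycling, and every name in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical model of a service whose instances are recycled over time."""
    healthy_instances: int
    deployment_ok: bool  # False once the most recent deployment has failed


def recycle_one(state: ServiceState) -> ServiceState:
    """Retire one healthy instance and try to launch a replacement.

    The replacement only becomes healthy if the latest deployment succeeded.
    """
    if state.healthy_instances == 0:
        return state
    replacement = 1 if state.deployment_ok else 0
    return ServiceState(
        healthy_instances=state.healthy_instances - 1 + replacement,
        deployment_ok=state.deployment_ok,
    )


def steps_until_outage(state: ServiceState, max_steps: int = 100):
    """Return the number of recycle steps before no instance can serve traffic,
    or None if the service is still up after max_steps."""
    for step in range(1, max_steps + 1):
        state = recycle_one(state)
        if state.healthy_instances == 0:
            return step
    return None
```

In this model, a service with three healthy instances and a failed deployment keeps serving traffic through the first two recycle steps and only goes down on the third, which is why the failure was not visible at the time of the deployment itself.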

Lessons/Improvements

  • Engineers will look to improve Prod2 Pod monitoring.
  • Engineers will look into adding better alerting for failed deployments, which may have helped prevent this issue.
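
One possible shape for the failed-deployment alerting mentioned above is an escalation rule: a failed deployment first produces a chat notification, and pages the on-call engineer if nobody acknowledges it within a set window. The sketch below is illustrative only. The `DeploymentEvent` type, the `alert_level` function, the level names, and the 30-minute window are all assumptions, not Kustomer's actual alerting design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DeploymentEvent:
    """Hypothetical record of a finished deployment."""
    service: str
    succeeded: bool
    finished_at: datetime
    acknowledged_at: Optional[datetime] = None


def alert_level(
    event: DeploymentEvent,
    now: datetime,
    ack_window: timedelta = timedelta(minutes=30),  # assumed threshold
) -> str:
    """Decide how loudly to alert about a deployment.

    "none"         -> the deployment succeeded
    "acknowledged" -> it failed, but someone has already picked it up
    "notify"       -> it failed recently; a chat notification is enough for now
    "page"         -> it failed and has gone unacknowledged past the window
    """
    if event.succeeded:
        return "none"
    if event.acknowledged_at is not None:
        return "acknowledged"
    if now - event.finished_at >= ack_window:
        return "page"
    return "notify"


# Example: a failure that nobody acknowledges escalates from "notify" to "page".
failed = DeploymentEvent("core-service", False, datetime(2021, 10, 13, 11, 33))
print(alert_level(failed, datetime(2021, 10, 13, 11, 40)))  # notify
print(alert_level(failed, datetime(2021, 10, 13, 12, 10)))  # page
```

The point of the escalation step is that a missed chat message no longer ends the alerting path; an unacknowledged failure keeps getting louder until someone responds.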
Posted Oct 18, 2021 - 09:17 EDT

Resolved
The issues with access to the platform for Prod2 have been resolved.

Please reach out to our Support team with any additional questions. You can reach us by going to https://help.kustomer.com/ and clicking "Contact Support" at the top of the page.
Posted Oct 14, 2021 - 08:48 EDT
Investigating
Kustomer is currently experiencing issues with platform access for clients in Prod2. We are working to resolve the issue as quickly as possible. During this time you may have issues accessing Kustomer on the web.

Please reach out to our Support team with any additional questions. You can reach us by going to https://help.kustomer.com/ and clicking "Contact Support" at the top of the page.
Posted Oct 14, 2021 - 08:43 EDT
This incident affected: Prod2 (EU) (Analytics, API, Bulk Jobs, Channel - Chat, Channel - Email, Channel - Facebook, Channel - Instagram, Channel - SMS, Channel - Twitter, Channel - WhatsApp, CSAT, Events / Audit Log, Exports, Notifications, Registration, Search, Tracking, Web Client, Web/Email/Form Hooks, Workflow, Knowledge base).