Chat Conversational Assistant showing Latency [PROD 1]
Incident Report for Kustomer
Postmortem

Summary

On May 18, 2022, we observed elevated latency and error rates in the conversational assistant and workflows. Scaling up related services allowed workflows to function as expected and the conversational assistant to operate at degraded capacity. Once the root cause was resolved, traffic on all related services quickly dropped and the services returned to normal capacity.

Root Cause

At 1:35 PM EDT, an assisted conversation entered a bad state that elevated traffic in the conversational assistant and workflows. Related services were scaled up soon after to match the increase in traffic, which stabilized the platform. Further investigation indicated that the error rates had not been fully resolved. The assistant involved was then modified to alleviate the issue, and latencies on the affected services quickly dropped.

Lessons/Improvements

  • Add safeguards to prevent infinitely running cycles in our system.
  • Add advanced detection of these infinite cycles to allow us to recover from them more quickly.
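
To illustrate the kind of safeguard described above, here is a minimal sketch of a per-conversation step budget over a sliding time window. It is not Kustomer's implementation: the class name `CycleGuard`, its parameters, and the thresholds are all hypothetical and chosen only for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class CycleGuard:
    """Hypothetical guard: trips when one conversation runs too many steps in a window."""

    def __init__(self, max_steps: int, window_seconds: float) -> None:
        self.max_steps = max_steps
        self.window_seconds = window_seconds
        # Timestamps of recent steps, kept separately for each conversation.
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, conversation_id: str, now: float) -> bool:
        """Record one step at time `now`; return False once the budget is used up."""
        events = self._events[conversation_id]
        # Drop timestamps that have aged out of the sliding window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_steps:
            # Budget exhausted: the caller should stop the cycle and raise an alert.
            return False
        events.append(now)
        return True
```

A caller would check `guard.allow(conversation_id, time.monotonic())` before running each assistant or workflow step and halt the conversation when it returns `False`. The same rejection could also feed an alert, which would cover the detection item as well as the prevention item.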
Posted Jun 06, 2022 - 12:40 EDT

Resolved
Kustomer has resolved an event affecting Chat Conversational Assistant processing latency. After careful monitoring, our team has found that all affected areas are fully restored. Please reach out to support if you have additional questions or concerns.
Posted May 18, 2022 - 16:35 EDT
Update
We are beginning to see errors subside, but are still working to resolve potential latency for Conversational Assistant processing.
Posted May 18, 2022 - 15:30 EDT
Identified
Kustomer has identified an event affecting the Conversational Assistant for Chat that may cause latency or errors. Our team is currently working to design and implement a resolution. Please expect further updates as we continue to investigate, and reach out to support if you have additional questions or concerns.
Posted May 18, 2022 - 14:11 EDT
This incident affected: Prod1 (US) (Channel - Chat).