But what if the user interrupts?
When the user interrupts mid-response, the webhook request that was generating the assistant’s reply is abruptly terminated. Unless we’ve already written something to memory, the assistant’s partial message could be lost. In practice, this happens a lot with voice agents: users cut off the model to ask something new before the previous response finishes.
If we don’t handle this carefully, our in-memory state drifts out of sync with what actually happened in the conversation. And you might not even realize it; you’ll just think the LLM is being a silly billy.
So what do I need to do?
When a new user webhook arrives, persist in this order (a code sketch follows in the next section):

- Store the user message right away so the turn is anchored in history.
- Insert the assistant placeholder before you start streaming tokens back.
- If the turn completes normally, remove the placeholder and append the final messages with the same `turn_id`.
- If the user interrupts, the placeholder remains, capturing the interrupted turn. The next `message` webhook includes `interruption_context`, which tells us which `assistant_turn_id` was cut off, so you can reconcile by marking that entry as interrupted.
Example Interruption Handling
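Here’s a minimal sketch of the full flow. The `turn_id`, `conversation_id`, and `interruption_context.assistant_turn_id` fields follow the webhook shape described above; the `text` field, the `MessageStore` interface, and the `streamReply` helper are assumptions standing in for your own persistence layer and LLM streaming code:

```typescript
// Hypothetical stored-message shape; only turn_id / conversation_id
// come from the webhook payload, the rest is up to your schema.
interface StoredMessage {
  conversationId: string;
  turnId: string;
  role: "user" | "assistant";
  content: string;
  status: "pending" | "complete" | "interrupted";
}

// Hypothetical persistence interface; swap in your database.
interface MessageStore {
  append(msg: StoredMessage): Promise<void>;
  // Replaces the pending placeholder for this turn with final content.
  finalize(turnId: string, content: string): Promise<void>;
  markInterrupted(turnId: string): Promise<void>;
}

async function handleUserMessageWebhook(
  store: MessageStore,
  payload: {
    conversation_id: string;
    turn_id: string;
    text: string;
    interruption_context?: { assistant_turn_id: string };
  },
  streamReply: (text: string) => AsyncIterable<string>
): Promise<void> {
  // Reconcile first: if the previous assistant turn was cut off,
  // mark its surviving placeholder as interrupted.
  if (payload.interruption_context) {
    await store.markInterrupted(payload.interruption_context.assistant_turn_id);
  }

  // 1. Store the user message immediately so the turn is anchored in history.
  await store.append({
    conversationId: payload.conversation_id,
    turnId: payload.turn_id,
    role: "user",
    content: payload.text,
    status: "complete",
  });

  // 2. Insert the assistant placeholder before streaming begins.
  await store.append({
    conversationId: payload.conversation_id,
    turnId: payload.turn_id,
    role: "assistant",
    content: "",
    status: "pending",
  });

  // 3. Stream the reply. If the user interrupts, this request is
  //    terminated mid-loop and the "pending" placeholder survives.
  let reply = "";
  for await (const token of streamReply(payload.text)) {
    reply += token;
  }

  // 4. On normal completion, replace the placeholder with the final
  //    assistant message under the same turn_id.
  await store.finalize(payload.turn_id, reply);
}
```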
Stored Message Shape and turn_id
Every stored message (user and assistant) includes a `turn_id` corresponding to the webhook event that created it, for example:
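For illustration, a stored record might look like this; apart from `turn_id` and `conversation_id`, the field names are assumptions, not a Layercode-defined schema:

```typescript
// Hypothetical stored record; only turn_id and conversation_id are
// taken from the webhook event described above.
const storedMessage = {
  conversation_id: "conv_123", // stable for the whole conversation
  turn_id: "turn_456",         // from the webhook event that created it
  role: "assistant",
  content: "Sure, let me check that for you",
  status: "interrupted",       // "pending" | "complete" | "interrupted"
};
```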
Persistence Notes
- There is no deduplication or idempotency handling yet in Layercode, so you will need to write your own logic to filter duplicate deliveries (see the sketch below).
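A minimal sketch of what that filter could look like, assuming `conversation_id` plus `turn_id` plus role uniquely identifies a delivery; in production you’d back this with a database unique constraint rather than an in-memory set:

```typescript
// In-memory idempotency guard; replace the Set with a unique index
// in your database for anything beyond a single process.
const seenDeliveries = new Set<string>();

function isDuplicateDelivery(
  conversationId: string,
  turnId: string,
  role: "user" | "assistant"
): boolean {
  const key = `${conversationId}:${turnId}:${role}`;
  if (seenDeliveries.has(key)) return true;
  seenDeliveries.add(key);
  return false;
}
```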
TL;DR
✅ Always store user messages immediately.
✅ Add a placeholder assistant message before streaming.
✅ Replace or mark the placeholder when the turn finishes or is interrupted.
✅ Never rely on the webhook completing — it might abort anytime.
✅ Keep `turn_id` and `conversation_id` consistent for reconciliation.