But what if the user interrupts?
When the user interrupts mid-response, the webhook request that was generating the assistant’s reply is abruptly terminated. Unless we’ve already written something to memory, the assistant’s partial message could be lost. In practice, this happens a lot with voice agents: users cut off the model to ask something new before the previous response finishes.
If we don’t handle this carefully, our in-memory state drifts out of sync with what actually happened in the conversation. And you might not even realize it; you’ll just think the LLM is being a silly billy.
So what do I need to do?
When a new user webhook arrives, persist in this order (a code sketch follows in the next section):

- Store the user message right away so the turn is anchored in history.
- Insert the assistant placeholder before you start streaming tokens back.
- If the turn completes normally, remove the placeholder and append the final messages with the same `turn_id`.
- If the user interrupts, the placeholder remains, capturing the interrupted turn. The next `message` webhook includes `interruption_context`, which tells us which `assistant_turn_id` was cut off, so you can reconcile by marking that entry as interrupted.
Example Interruption Handling
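Here’s a minimal sketch of the full flow. The `turn_id`, `conversation_id`, and `interruption_context.assistant_turn_id` fields follow the webhook shape described above; the `text` field, the `MessageStore` interface, and the `streamReply` helper are assumptions standing in for your own persistence layer and LLM streaming code:

```typescript
// Hypothetical stored-message shape; only turn_id / conversation_id
// come from the webhook payload, the rest is up to your schema.
interface StoredMessage {
  conversationId: string;
  turnId: string;
  role: "user" | "assistant";
  content: string;
  status: "pending" | "complete" | "interrupted";
}

// Hypothetical persistence interface; swap in your database.
interface MessageStore {
  append(msg: StoredMessage): Promise<void>;
  // Replaces the pending placeholder for this turn with final content.
  finalize(turnId: string, content: string): Promise<void>;
  markInterrupted(turnId: string): Promise<void>;
}

async function handleUserMessageWebhook(
  store: MessageStore,
  payload: {
    conversation_id: string;
    turn_id: string;
    text: string;
    interruption_context?: { assistant_turn_id: string };
  },
  streamReply: (text: string) => AsyncIterable<string>
): Promise<void> {
  // Reconcile first: if the previous assistant turn was cut off,
  // mark its surviving placeholder as interrupted.
  if (payload.interruption_context) {
    await store.markInterrupted(payload.interruption_context.assistant_turn_id);
  }

  // 1. Store the user message immediately so the turn is anchored in history.
  await store.append({
    conversationId: payload.conversation_id,
    turnId: payload.turn_id,
    role: "user",
    content: payload.text,
    status: "complete",
  });

  // 2. Insert the assistant placeholder before streaming begins.
  await store.append({
    conversationId: payload.conversation_id,
    turnId: payload.turn_id,
    role: "assistant",
    content: "",
    status: "pending",
  });

  // 3. Stream the reply. If the user interrupts, this request is
  //    terminated mid-loop and the "pending" placeholder survives.
  let reply = "";
  for await (const token of streamReply(payload.text)) {
    reply += token;
  }

  // 4. On normal completion, replace the placeholder with the final
  //    assistant message under the same turn_id.
  await store.finalize(payload.turn_id, reply);
}
```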
Stored Message Shape and turn_id
Every stored message (user and assistant) includes a `turn_id` corresponding to the webhook event that created it, for example:
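For illustration, a stored record might look like this; apart from `turn_id` and `conversation_id`, the field names are assumptions, not a Layercode-defined schema:

```typescript
// Hypothetical stored record; only turn_id and conversation_id are
// taken from the webhook event described above.
const storedMessage = {
  conversation_id: "conv_123", // stable for the whole conversation
  turn_id: "turn_456",         // from the webhook event that created it
  role: "assistant",
  content: "Sure, let me check that for you",
  status: "interrupted",       // "pending" | "complete" | "interrupted"
};
```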
Persistence Notes
- There is no deduplication or idempotency handling yet in Layercode, so you will need to write your own logic to filter duplicate deliveries (see the sketch below).
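A minimal sketch of what that filter could look like, assuming `conversation_id` plus `turn_id` plus role uniquely identifies a delivery; in production you’d back this with a database unique constraint rather than an in-memory set:

```typescript
// In-memory idempotency guard; replace the Set with a unique index
// in your database for anything beyond a single process.
const seenDeliveries = new Set<string>();

function isDuplicateDelivery(
  conversationId: string,
  turnId: string,
  role: "user" | "assistant"
): boolean {
  const key = `${conversationId}:${turnId}:${role}`;
  if (seenDeliveries.has(key)) return true;
  seenDeliveries.add(key);
  return false;
}
```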
TL;DR
✅ Always store user messages immediately.
✅ Add a placeholder assistant message before streaming.
✅ Replace or mark the placeholder when the turn finishes or is interrupted.
✅ Never rely on the webhook completing — it might abort anytime.
✅ Keep `turn_id` and `conversation_id` consistent for reconciliation.