By default, Layercode agents use automatic turn taking. If you prefer explicit control—press and hold to speak—enable push-to-talk in your agent and wire up the callbacks in your UI.

1) Enable push-to-talk in the dashboard

In your agent panel at https://dash.layercode.com/, go to Transcriber → Settings, set Turn Taking to Push to Talk, and save your changes.

2) Use the React SDK callbacks

When using push-to-talk, call triggerUserTurnStarted() when the user begins speaking (pressing the button), and triggerUserTurnFinished() when they stop (releasing the button).
app/ui/VoiceAgentPushToTalk.tsx
'use client';
import { useLayercodeAgent } from '@layercode/react-sdk';

export default function VoiceAgentPushToTalk() {
  const { status, triggerUserTurnStarted, triggerUserTurnFinished } = useLayercodeAgent({
    agentId: process.env.NEXT_PUBLIC_LAYERCODE_AGENT_ID!,
    authorizeSessionEndpoint: '/api/authorize',
  });

  // Pressing the button starts the user's turn; releasing it (or leaving the button) ends it.
  return (
    <button
      className="h-12 px-4 rounded-full flex items-center gap-2 justify-center bg-black text-white"
      onMouseDown={triggerUserTurnStarted}
      onMouseUp={triggerUserTurnFinished}
      onMouseLeave={triggerUserTurnFinished}
      onTouchStart={triggerUserTurnStarted}
      onTouchEnd={triggerUserTurnFinished}
    >
      {status === 'connected' ? 'Hold to Speak' : 'Connecting…'}
    </button>
  );
}
Turn taking is explained conceptually in our Turn taking guide.
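
On touch devices, browsers typically fire emulated mouse events after the touch events, and onMouseLeave can fire when the button was never pressed, so the handlers above may be called more than once per press. Whether that matters for your agent is something to verify; if it does, one option is a small guard around the two callbacks. The sketch below is a hypothetical helper (createPushToTalkGuard is not part of @layercode/react-sdk) that forwards only the first press and the first matching release:

```typescript
// Hypothetical helper, not part of the Layercode SDK.
// Wraps start/finish callbacks so each press produces exactly one start
// and one finish, even if several DOM events fire for the same gesture.
type TurnCallback = () => void;

function createPushToTalkGuard(start: TurnCallback, finish: TurnCallback) {
  let pressed = false; // true between a forwarded start and its finish

  return {
    press(): void {
      if (pressed) return; // ignore duplicate press events
      pressed = true;
      start();
    },
    release(): void {
      if (!pressed) return; // ignore release without a prior press
      pressed = false;
      finish();
    },
    isPressed(): boolean {
      return pressed;
    },
  };
}
```

In the component, you would pass triggerUserTurnStarted and triggerUserTurnFinished to the guard and attach press and release to the button's handlers instead of the raw callbacks. The guard holds state, so in React it should be created once (for example with useMemo or useRef) rather than on every render.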

What gets sent when you press the button

The React SDK continuously captures microphone audio into a short rolling buffer even while the button is idle. When you call triggerUserTurnStarted(), we immediately flush roughly one second of pre-roll audio along with anything you speak while the button stays down. This keeps the start of the utterance intact, so agents hear the full word instead of a clipped syllable. You can fine-tune the pre-roll length with the vad.buffer_frames agent setting. Each frame represents about 100 ms of audio, so lowering the value shortens the buffer and raising it adds more context before the press.
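
To make the pre-roll idea concrete, here is a minimal sketch of a rolling frame buffer. It is an illustration only, not the SDK's actual implementation: PreRollBuffer and preRollMs are hypothetical names, and the 100 ms frame length is taken from the approximate figure quoted above.

```typescript
// Illustrative sketch only; the real SDK internals may differ.
// Assumption from the text above: one frame is about 100 ms of audio.
const FRAME_MS = 100;

// Approximate pre-roll duration for a given vad.buffer_frames value.
function preRollMs(bufferFrames: number): number {
  return bufferFrames * FRAME_MS;
}

// Keeps only the most recent `capacity` frames while the button is idle.
class PreRollBuffer<T> {
  private frames: T[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Called for every captured frame; drops the oldest frame when full.
  push(frame: T): void {
    this.frames.push(frame);
    if (this.frames.length > this.capacity) {
      this.frames.shift();
    }
  }

  // Called when the turn starts: returns the buffered frames and resets.
  flush(): T[] {
    const out = this.frames;
    this.frames = [];
    return out;
  }
}
```

With a capacity of 3, pushing five frames and then flushing returns only the last three, which is the audio captured just before the press. By the same arithmetic, a buffer of 10 frames would hold about 1000 ms and a buffer of 5 frames about 500 ms.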