# CLI Reference

Source: https://docs.layercode.com/api-reference/cli

Layercode CLI command reference and usage guide.

## Installation

You’ll need **npm** installed to use the CLI.\
We recommend running commands with `npx` instead of installing globally.

```bash theme={null}
npx @layercode/cli
```

***

## Commands

### `login`

```bash theme={null}
npx @layercode/cli login
```

Opens a browser window to log in and link your terminal to your Layercode account.

***

### `init`

```bash theme={null}
npx @layercode/cli init [--agent <agent-id>]
```

Initializes Layercode locally, creating an example project and linking an agent.

**Flags**

* `--agent <agent-id>`: (optional) Link an existing agent.\
  If not provided, a new agent will be created.

***

### `tunnel`

```bash theme={null}
npx @layercode/cli tunnel [--agent-id <agent-id>] [--path <path>] [--port <port>] [--tail]
```

Runs your local project with a cloudflared tunnel and updates your agent’s webhook URL in the Layercode dashboard.

**Flags**

* `--agent-id=<agent-id>` Specify the unique identifier of the agent. If no agent ID is provided, the CLI looks for an environment variable ending in `LAYERCODE_AGENT_ID` in your .env file. If none is found, the command fails.
* `--path=<path>` \[default: /api/agent] Set the API path to append for the agent endpoint.
* `--port=<port>` \[default: 3000] Port number to run the tunnel on.
* `--tail` Continuously stream logs, including CLI messages.

**Equivalent to:**

```bash theme={null}
cloudflared tunnel --url http://localhost:<port>
```

***

## Example Usage

```bash theme={null}
# Log in
npx @layercode/cli login

# Initialize a new local setup
npx @layercode/cli init

# Start a tunnel for agent abc123 on port 5173 with the API found at /api/voice-agent
npx @layercode/cli tunnel --agent-id=abc123 --port=5173 --path=/api/voice-agent --tail
```

***

## Troubleshooting

If you encounter issues:

* Ensure **npm** and **Node.js** are installed and up-to-date.
* Try logging out and back in with `npx @layercode/cli login`.
* By default, the tunnel sets your webhook URL path to `/api/agent`. Update this with the `--path` flag to match where your webhook endpoint lives inside your application, e.g. `/api/agent`, the root `/`, or `/voice-agent`. See our [webhooks guide](/explanations/webhooks) for more details.

# Frontend WebSocket API

Source: https://docs.layercode.com/api-reference/frontend-ws-api

Layercode WebSocket API for browser and mobile based voice agent experiences.

The Layercode Frontend WebSocket API is used to create browser and mobile based voice agent experiences. The client browser streams chunks of base64 microphone audio down the WebSocket. In response, the server returns audio chunks of the assistant's response to be played to the user. Additional trigger and data event types allow control of turns and UI updates.

For most use cases, we recommend using our SDKs for React ([React Guide](/tutorials/react)) or Vanilla JS ([Vanilla JS Guide](/tutorials/vanilla_js)). This API reference is intended for advanced users who need to implement the WebSocket protocol directly.

# Connecting to the WebSocket

The client browser connects to the Layercode WebSocket API at the following URL:

```
wss://api.layercode.com/v1/agents/web/websocket
```

## Authorizing the WebSocket Connection

When establishing the WebSocket connection, the following query parameter must be included in the request URL:

* `client_session_key`: A unique session key obtained from the Layercode REST API `/authorize` endpoint.
Example full connection URL:

```
wss://api.layercode.com/v1/agents/web/websocket?client_session_key=your_client_session_key
```

To obtain a client\_session\_key, you must first create a new session for the user by calling the [Layercode REST API /authorize](/api-reference/rest-api#authorize) endpoint. This endpoint returns a client\_session\_key which must be included in the WebSocket connection parameters.

This API call should be made from your backend server, not the client browser. This ensures your LAYERCODE\_API\_KEY is never exposed to the client, and allows you to do any additional user authorization checks required by your application.

# WebSocket Events

## Client → Server Messages

### Client Ready

When the client has established the WebSocket connection and is ready to begin streaming audio, it should send a ready message:

```json theme={null}
{ "type": "client.ready" }
```

### Audio Streaming

Once the WebSocket connection is established, the client should continuously send chunks of the user's microphone audio. The content must be in the following format:

* Base64 encoded
* 16-bit PCM audio data
* 8000 Hz sample rate
* Mono channel

See the [Vanilla JS SDK code](https://github.com/layercodedev/packages-and-docs/tree/main/packages/layercode-js-sdk/src) for an example of how browser microphone audio is correctly encoded to base64.

```json theme={null}
{ "type": "client.audio", "content": "base64audio" }
```

### Voice Activity Detection Events

The client can send Voice Activity Detection (VAD) events to inform the server about speech detection. This will improve the speed and accuracy of automatic turn taking. Note: the client is responsible for stopping any in-progress assistant audio playback when the user interrupts.

VAD detects voice activity:

```json theme={null}
{ "type": "vad_events", "event": "vad_start" }
```

Detected voice activity ends:

```json theme={null}
{ "type": "vad_events", "event": "vad_end" }
```

Client could not load the VAD model, so VAD events won't be sent:

```json theme={null}
{ "type": "vad_events", "event": "vad_model_failed" }
```

### Response Audio Replay Finished

The client will receive audio chunks of the assistant's response (see [Audio Response](#audio-response)). When the client has finished replaying all assistant audio chunks in its buffer, it must reply with a `trigger.response.audio.replay_finished` event.

Note that the assistant webhook can return response.tts events (which are turned into speech and received by the client as response.audio events) at any point during a long response (in between other text or json events), so the client must handle situations where it has played all the audio in its buffer, but then receives more to play. This will result in the client sending multiple `trigger.response.audio.replay_finished` completed events over a single turn.

```json theme={null}
{ "type": "trigger.response.audio.replay_finished", "reason": "completed", "turn_id": "UUID of assistant response" }
```

### Push-to-Talk Control (Optional)

In push-to-talk mode (read more about [Turn Taking](/explanations/turn-taking)), the client must send the following events to start and end a user turn to speak. This is typically connected to a button which is held down for the user to speak. In this mode, the client can also preemptively halt the assistant's audio playback when the user interrupts. Instead of waiting to receive a `turn.start` event (which indicates a turn change), send a `trigger.response.audio.replay_finished` event when the user interrupts the assistant.
Start user turn (user has pressed the button):

```json theme={null}
{ "type": "trigger.turn.start", "role": "user" }
```

End user turn (user has released the button):

```json theme={null}
{ "type": "trigger.turn.end", "role": "user" }
```

### Send Text Messages (Optional)

To enable your users to send text messages (as an alternative to voice), send a text user message from your frontend in the `client.response.text` event. Layercode will send the user text message to your agent backend in the same format as a regular user transcript message.

```json theme={null}
{ "type": "client.response.text", "content": "Text input from the user" }
```

* `content`: The full user message. Empty or whitespace-only payloads are ignored.

## Server → Client Messages

The client will receive the following events from Layercode:

### Turn Management

When the server detects the start of the user's turn:

```json theme={null}
{ "type": "turn.start", "role": "user", "turn_id": "UUID of user turn" }
```

When it's the assistant's turn:

```json theme={null}
{ "type": "turn.start", "role": "assistant", "turn_id": "UUID of assistant turn" }
```

### Audio Response

The client will receive audio chunks of the assistant's response, which should be buffered and played immediately. The content will be audio in the following format:

* Base64 encoded
* 16-bit PCM audio data
* 16000 Hz sample rate
* Mono channel

See the [Vanilla JS SDK code](https://github.com/layercodedev/packages-and-docs/tree/main/packages/layercode-js-sdk/src) for an example of how to play the audio chunks.

```json theme={null}
{ "type": "response.audio", "content": "base64audio", "delta_id": "UUID unique to each delta msg", "turn_id": "UUID of assistant response turn" }
```

### Text Response

The client will receive text chunks of the assistant's response for display or processing:

```json theme={null}
{ "type": "response.text", "content": "Text content from assistant", "turn_id": "UUID of assistant response turn" }
```

### User Transcript Updates

Layercode streams back transcription updates for the user's speech so you can render the live transcript in your UI.

#### Interim Transcript Delta

Interim updates refine the current transcript in place as the speech recognizer gains confidence. Each `user.transcript.interim_delta` replaces the previous one (with a matching delta\_counter) until a `user.transcript.delta` arrives (with a matching delta\_counter). Subsequent `user.transcript.interim_delta` events will have an incremented delta\_counter and should then be appended to the previously finalized `user.transcript.delta` text.

```json theme={null}
{ "type": "user.transcript.interim_delta", "content": "Partial user text", "turn_id": "user-UUID of the speaking turn", "delta_counter": 6 }
```

* `content`: Latest partial text heard for the in-progress user utterance.
* `turn_id`: The user turn identifier (prefixed with the role for clarity).
* `delta_counter`: Monotonic counter forwarded from the underlying transcription `delta.counter` to help you discard out-of-order updates.

#### Transcript Delta

Once the recognizer finalizes a span of text, it is emitted as a `user.transcript.delta`. Any subsequent `user.transcript.interim_delta` events start a new span until the next finalized delta arrives.

```json theme={null}
{ "type": "user.transcript.delta", "content": "Stabilized transcript segment", "turn_id": "user-UUID of the speaking turn", "delta_counter": 6 }
```

* `content`: Stabilized transcript segment that should replace the previous interim text.
* `turn_id`: The user turn identifier (prefixed with the role for clarity).
* `delta_counter`: Monotonic counter forwarded from the underlying transcription `delta.counter` so you can detect missed or out-of-order deltas.

#### Final Transcript

Once the user's turn has been deemed complete, a final transcript is emitted. This contains the full text of the user's turn.

```json theme={null}
{ "type": "user.transcript", "content": "Complete transcript of user turn", "turn_id": "user-UUID of the speaking turn" }
```

### Data and State Updates

Your webhook can return response.data SSE events, which will be forwarded to the browser client. This is ideal for updating UI and state in the browser. If you want to pass text or JSON deltas instead of full objects, you can simply pass a JSON object like `{ "delta": "text delta..." }` and accumulate and render the delta in the client browser.

```json theme={null}
{ "type": "response.data", "content": { "json": "object" }, "turn_id": "UUID of assistant response" }
```

# Introduction

Source: https://docs.layercode.com/api-reference/introduction

Layercode API Reference

* **[Frontend WebSocket API](/api-reference/frontend-ws-api) (for building web and mobile voice AI applications):** Enables seamless connection between your frontend applications and Layercode's real-time agents. Use this API with our Frontend SDKs to stream audio and receive responses.
* **[Webhook SSE API](/api-reference/webhook-sse-api) (for connecting your own backend to Layercode):** This is a webhook endpoint you implement in your backend, to receive transcriptions from the user, then respond with SSE messages containing text to be converted to speech and spoken to the user.

# REST API

Source: https://docs.layercode.com/api-reference/rest-api

API reference for the Layercode REST API.

## Authorize Client Session

To connect a client (browser or mobile app) to a Layercode voice agent, you must first authorize the session. This is done by calling the Layercode REST API endpoint below from your backend.

**How the authorization flow works:**

When using a Layercode frontend SDK (such as `@layercode/react-sdk` or `@layercode/js-sdk`), the SDK will automatically make a POST request to the `authorizeSessionEndpoint` URL that you specify in your frontend code. This `authorizeSessionEndpoint` should be an endpoint on **your own backend** (not Layercode's). Your backend receives this request from the frontend, then securely calls the Layercode REST API (`https://api.layercode.com/v1/agents/web/authorize_session`) using your `LAYERCODE_API_KEY`. Your backend then returns the `client_session_key` to the frontend.

Scheduled change: Monday 1 September at 12:00 UTC — the response body will return conversation\_id instead of session\_id. Until then, you will continue to receive session\_id. Plan your upgrade accordingly.

Your Layercode API key should never be exposed to the frontend. Always call this endpoint from your backend, then return the client\_session\_key to your frontend.

### Endpoint

```http theme={null}
POST https://api.layercode.com/v1/agents/web/authorize_session
```

### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.
* `Content-Type`: Must be application/json.

### Request Body

* `agent_id`: The ID of the Layercode agent the client should connect to.
* `conversation_id` (optional): The conversation ID to resume an existing conversation. If not provided, a new conversation will be created.

### Response

* `client_session_key`: The key your frontend uses to connect to the Layercode WebSocket API.
* `conversation_id`: The unique conversation ID.
* `config` (optional): Configuration for this session used by the frontend SDK. When present, it can include:
transcription.trigger, transcription.automatic, transcription.can\_interrupt, and VAD settings such as vad.enabled, vad.gate\_audio, vad.buffer\_frames, vad.model, vad.positive\_speech\_threshold, vad.negative\_speech\_threshold, vad.redemption\_frames, vad.min\_speech\_frames, vad.pre\_speech\_pad\_frames, vad.frame\_samples.
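For illustration, a `config` object might look like the following sketch (the values shown are assumptions for illustration, not documented defaults):

```json theme={null}
{
  "transcription": { "trigger": "automatic", "can_interrupt": true },
  "vad": { "enabled": true, "positive_speech_threshold": 0.85, "redemption_frames": 8 }
}
```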
### Example Request

```bash theme={null}
# Example with only agent_id (creates a new session)
curl -X POST https://api.layercode.com/v1/agents/web/authorize_session \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "ag-123456"}'

# Example with agent_id and conversation_id (resumes an existing conversation)
curl -X POST https://api.layercode.com/v1/agents/web/authorize_session \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "ag-123456", "conversation_id": "lc_conv_abc123..."}'
```

### Example Response

```json theme={null}
{
  "client_session_key": "lc_sesskey_abc123...",
  "conversation_id": "lc_conv_abc123..."
}
```

### Error Responses

* `error`: Error message describing the problem.

**Possible error cases:**

* `400` – Invalid or missing bearer token, invalid agent ID, missing or invalid conversation ID.
* `402` – Insufficient balance for the organization.

**Example error response:**

```json theme={null}
{ "error": "insufficient balance" }
```

### Example: Backend Endpoint (Next.js)

Here's how you might implement an authorization endpoint in your backend (Next.js example):

```ts Next.js app/api/authorize/route.ts [expandable] theme={null}
export const dynamic = "force-dynamic";
import { NextResponse } from "next/server";

export const POST = async (request: Request) => {
  // Here you could do any user authorization checks you need for your app
  const endpoint = "https://api.layercode.com/v1/agents/web/authorize_session";
  const apiKey = process.env.LAYERCODE_API_KEY;
  if (!apiKey) {
    throw new Error("LAYERCODE_API_KEY is not set.");
  }
  const requestBody = await request.json();
  if (!requestBody || !requestBody.agent_id) {
    throw new Error("Missing agent_id in request body.");
  }
  try {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(requestBody),
    });
    if (!response.ok) {
      const text = await response.text();
      throw new Error(text || response.statusText);
    }
    return NextResponse.json(await response.json());
  } catch (error: any) {
    console.error("Layercode authorize session response error:", error.message);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
};
```

For other backend frameworks (Express, FastAPI, etc.), the logic is the same: receive a request from your frontend, call the Layercode authorize\_session endpoint with your API key, and return the client\_session\_key to your frontend.

## Agents

### List Agents

```http theme={null}
GET https://api.layercode.com/v1/agents
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.

#### Response

Returns all agents. Each agent object includes id, name, type, agent\_template\_id, created\_at, updated\_at, and assigned\_phone\_numbers (array of phone number assignments with phone\_number, twilio\_sid, friendly\_name, assigned\_at).
#### Example

```bash theme={null}
curl -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  https://api.layercode.com/v1/agents
```

```json theme={null}
{
  "agents": [
    {
      "id": "ag-123456",
      "name": "My Agent ag-123456",
      "type": "voice",
      "agent_template_id": "tmpl_default",
      "created_at": "2024-04-01T12:00:00.000Z",
      "updated_at": "2024-04-08T16:30:16.000Z",
      "assigned_phone_numbers": [
        {
          "phone_number": "+15551234567",
          "twilio_sid": "PNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
          "friendly_name": "Support Line",
          "assigned_at": "2024-04-02T09:21:00.000Z"
        }
      ]
    }
  ]
}
```

### Create Agent From Template

```http theme={null}
POST https://api.layercode.com/v1/agents
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.
* `Content-Type`: Must be application/json.

#### Request Body

* `template_id` (optional): Template ID to initialize the agent configuration. If omitted, the default recommended template is used.

#### Response

Returns the newly created agent record, including:

* `id`: Unique identifier for the agent.
* `name`: Human-friendly name assigned by Layercode.
* `type`: Agent type (currently voice).
* The full pipeline configuration cloned from the template.
* The webhook secret used to validate incoming webhooks.
* `agent_template_id`: ID of the template used to create the agent.

```bash theme={null}
curl -X POST https://api.layercode.com/v1/agents \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "template_id": "tmpl_sales" }'
```

### Get Agent Details

```http theme={null}
GET https://api.layercode.com/v1/agents/{agent_id}
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.

#### Path Parameters

* `agent_id`: The ID of the agent.

#### Response

Returns the agent, including:

* `id`: Agent ID.
* `name`: Agent display name.
* The current pipeline configuration.
* `assigned_phone_numbers`: Array of phone number assignments for this agent.

```bash theme={null}
curl -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  https://api.layercode.com/v1/agents/ag-123456
```

### Update Agent Configuration

```http theme={null}
POST https://api.layercode.com/v1/agents/{agent_id}
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.
* `Content-Type`: Must be application/json.

#### Path Parameters

* `agent_id`: The ID of the agent to update.

#### Request Body

* `webhook_url`: URL for production webhooks. When provided, demo\_mode is automatically disabled.

#### Response

Returns the updated agent record with the new configuration.

```bash theme={null}
curl -X POST https://api.layercode.com/v1/agents/ag-123456 \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "webhook_url": "https://example.com/layercode-webhook" }'
```

## Sessions

### Get Session Details

```http theme={null}
GET https://api.layercode.com/v1/agents/{agent_id}/sessions/{session_id}
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.

#### Path Parameters

* `agent_id`: The ID of the agent.
* `session_id`: The connection ID for the session. This is the unique connection identifier for a given session.

#### Response

Returns JSON with details about the session, transcript, and recording status, including:

* Connection ID for the session.
* ID of the agent.
* ISO timestamp when the connection started.
* ISO timestamp when the connection ended (if ended).
* Total connection duration in milliseconds.
* Custom metadata associated with the session.
* Caller phone number (Twilio), if applicable.
* Caller country code (Twilio), if applicable.
* Agent phone number (Twilio), if applicable.
* Agent phone number country code (Twilio), if applicable.
* IP address of the connection.
* Country code derived from the IP address, when available.
* Total seconds of user speech.
* Total seconds of generated speech.
* Processing latency in milliseconds.
* Array of transcript entries. Each entry includes: timestamp, user\_message, assistant\_message, latency\_ms.
* `recording_status`: One of not\_available, in\_progress, completed.
* `recording_url`: If recording\_status is completed, a URL to download the WAV recording for this session connection.

#### Example

```bash theme={null}
curl -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  https://api.layercode.com/v1/agents/ag-123456/sessions/lc_conn_abc123
```

### Download Session Recording

```http theme={null}
GET https://api.layercode.com/v1/agents/{agent_id}/sessions/{session_id}/recording
```

#### Headers

* `Authorization`: Bearer token using your LAYERCODE\_API\_KEY.

#### Path Parameters

* `agent_id`: The ID of the agent.
* `session_id`: The connection ID for the session.

#### Response

Returns a WAV audio file if available.

```bash theme={null}
curl -L -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -o session.wav \
  https://api.layercode.com/v1/agents/ag-123456/sessions/lc_conn_abc123/recording
```

Recordings are generated after a session completes. If a recording is still processing, the details endpoint will return recording\_status: "in\_progress".

Once your frontend receives the client\_session\_key, it can connect to the Layercode WebSocket API to start streaming audio.

## Calls

### Initiate Outbound Call

```http theme={null}
POST https://api.layercode.com/v1/agents/ag-123456/calls/initiate_outbound
```

#### Request Body

* `from_phone_number`: The phone number assigned to your Layercode agent that will make the call. Remember: the from\_phone\_number must be a number already assigned to your Layercode agent in the dashboard.
* `to_phone_number`: The phone number to call (e.g., your mobile number for testing).

#### Response

* `conversation_id`: The unique conversation ID. A session (associated with the returned conversation\_id) will be created shortly after, once Twilio initiates the call.

#### Example Request

```bash theme={null}
curl -X POST https://api.layercode.com/v1/agents/ag-123456/calls/initiate_outbound \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "from_phone_number": "NUMBER_ASSIGNED_TO_YOUR_AGENT", "to_phone_number": "PHONE_NUMBER_TO_CALL" }'
```

#### Example Response

```json theme={null}
{ "conversation_id": "lc_conv_abc123..." }
```

#### Error Responses

* `error`: Error message describing the problem.

**Possible error cases:**

* `400` – Invalid or missing bearer token, missing or invalid request body, invalid from\_phone\_number (i.e. not assigned to the agent specified in the URL).
* `429` – Account session concurrency limit reached.
* `402` – Insufficient balance for the organization.

## Twilio Voice

### TwiML Webhook

Use this endpoint as the Voice webhook in your Twilio phone number configuration. Layercode validates the incoming request, authorizes a session, and returns TwiML that connects the call to your agent's WebSocket stream.

```http theme={null}
POST https://api.layercode.com/v1/agents/twilio/twiml
```

#### Request Parameters

* `X-Twilio-Signature` (header): Signature supplied by Twilio for request verification. Required when you have stored Twilio credentials in Layercode.
* `Direction`: Call direction reported by Twilio (e.g., inbound or outbound-api).
* `From`: Caller phone number.
* `FromCountry`: Caller country code supplied by Twilio.
* `To`: Phone number assigned to your agent.
* `ToCountry`: Destination country code supplied by Twilio.

#### Response

Returns TwiML that streams the call to the Layercode Twilio WebSocket endpoint.

```xml theme={null}
<!-- Illustrative shape only; the actual TwiML (including the Stream URL) is generated per request. -->
<Response>
  <Connect>
    <Stream url="wss://<layercode-stream-endpoint>?client_session_key=..." />
  </Connect>
</Response>
```

The response Streaming URL is generated dynamically for each request. Do not cache or reuse the client session key.

# Webhook SSE API

Source: https://docs.layercode.com/api-reference/webhook-sse-api

Webhook SSE API

## Webhook Request Payload

Layercode sends different webhook event types to your backend. Each request body is JSON.
All requests include: * `type` (string): One of `message`, `session.start`, `session.end`, `session.update`. * `session_id` (string): Connection identifier for this session. Changes each reconnect. * `conversation_id` (string): Stable conversation identifier. Additional fields vary by event type, as described below. *** ### **message** * `text` (string): Transcribed user text. * `session_id` (string): A unique identifier for the current session. * `conversation_id` (string): A unique identifier for the conversation. * `turn_id` (string): Unique ID for this turn. * `from_phone_number` (string, optional): Caller phone number if Twilio is used. * `to_phone_number` (string, optional): Agent phone number if Twilio is used. **Example:** ```json theme={null} { "type": "message", "session_id": "sess_abc123", "conversation_id": "conv_xyz789", "turn_id": "turn_xyz123", "text": "Hello, how are you?", "from_phone_number": "+14155550123", "to_phone_number": "+14155559876" } ``` *** ### **session.start** Sent when a new session begins and your agent should optionally speak first. * `session_id` (string): A unique identifier for the current session. * `conversation_id` (string): A unique identifier for the conversation. * `turn_id` (string): Unique ID for the assistant welcome turn. * `from_phone_number` (string, optional): Caller phone number if Twilio is used. * `to_phone_number` (string, optional): Agent phone number if Twilio is used. **Example:** ```json theme={null} { "type": "session.start", "session_id": "sess_abc123", "conversation_id": "conv_xyz789", "turn_id": "turn_welcome_123", "from_phone_number": "+14155550123", "to_phone_number": "+14155559876" } ``` *** ### **session.update** Sent when asynchronous session data becomes available (e.g., after a recording completes). * `session_id` (string): A unique identifier for the current session. * `conversation_id` (string): A unique identifier for the conversation. * `recording_status` (string): `completed` or `failed`. * `recording_url` (string, optional): API URL to download WAV when `completed`. * `recording_duration` (number, optional): Duration in seconds. * `error_message` (string, optional): Error details when `failed`. * `metadata` (object): Session metadata originally provided during authorization (if any). * `from_phone_number` (string, optional): Caller phone number if Twilio is used. * `to_phone_number` (string, optional): Agent phone number if Twilio is used. **Example:** ```json theme={null} { "type": "session.update", "session_id": "sess_abc123", "conversation_id": "conv_xyz789", "from_phone_number": "+14155550123", "to_phone_number": "+14155559876", "recording_status": "completed", "recording_url": "https://api.layercode.com/v1/agents/ag_123/sessions/sess_abc123/recording", "recording_duration": 42.3, "metadata": { "userId": "u_123" } } ``` *** ### **session.end** Sent when the session finishes. Includes transcript and usage metrics. * `session_id` (string): A unique identifier for the current session. * `conversation_id` (string): A unique identifier for the conversation. * `agent_id` (string): Agent ID. * `started_at` / `ended_at` (string): ISO timestamps. * `duration` (number|null): Total milliseconds (if available). * `transcription_duration_seconds` (number|null) * `tts_duration_seconds` (number|null) * `latency` (number|null) * `ip_address` (string|null) * `country_code` (string|null) * `recording_status` (string): `enabled` or `disabled` (org setting for session recording). 
* `transcript` (array): Items of `{ role: 'user' | 'assistant', text: string, timestamp: number }`. * `from_phone_number` (string, optional): Caller phone number if Twilio is used. * `to_phone_number` (string, optional): Agent phone number if Twilio is used. **Example:** ```json theme={null} { "type": "session.end", "session_id": "sess_abc123", "conversation_id": "conv_xyz789", "agent_id": "ag_123", "from_phone_number": "+14155550123", "to_phone_number": "+14155559876", "started_at": "2025-08-28T10:00:00.000Z", "ended_at": "2025-08-28T10:03:00.000Z", "duration": 180000, "transcription_duration_seconds": 20.1, "tts_duration_seconds": 19.8, "latency": 120, "ip_address": "203.0.113.10", "country_code": "US", "recording_status": "enabled", "transcript": [ { "role": "user", "text": "Hello", "timestamp": 1724848800000 }, { "role": "assistant", "text": "Hi there!", "timestamp": 1724848805000 } ] } ``` # Setting up AGENTS.md and CLAUDE.md Source: https://docs.layercode.com/explanations/agents-md How to set up an AGENTS.md and CLAUDE.md for working with Layercode When working with LLMs in development with Layercode, we recommend creating an [AGENTS.md](https://agents.md/) (for most agents) and/or a [CLAUDE.md](https://www.anthropic.com/engineering/claude-code-best-practices) for Claude Code. The easiest way to add Layercode is to copy our whole docs into the file. You can find [every line of our docs in markdown here](https://docs.layercode.com/llms-full.txt). Or [links to our pages here](https://docs.layercode.com/llms.txt). # Configure your voice agents Source: https://docs.layercode.com/explanations/configuring-voice-agents Key concepts and options for configuring transcription, TTS, and backend in a Layercode agent. Use this page to choose transcription, text-to-speech (TTS), and backend settings for your agent. ## Transcription Transcription converts user speech to text. * Provider and model: match your language and latency needs. * Turn taking: automatic or push to talk. See [Turn taking](/explanations/turn-taking). * Interrupts (automatic mode): let users speak over the agent. ## Text-to-Speech (TTS) TTS converts the agent's text response to audio. * Provider and model: balance speed and quality. * Voice: choose one that fits your brand and language. ## Practical tips * Start with defaults and optimize after you have an end-to-end demo. * Prefer low-latency models for real-time conversations. * If using your own backend, test locally with a tunnel. See [Tunnelling](/how-tos/tunnelling). ## Where to change settings In the dashboard, open your agent and click **Edit** on Transcription, Text-to-Speech, or Backend. Changes apply immediately to new turns. # Connect Your Backend Source: https://docs.layercode.com/explanations/connect-backend How to connect your own agent backend to a Layercode agent. Layercode is designed for maximum flexibility: you can connect any backend that can receive an HTTP request and return a Server-Sent Events (SSE) stream. This allows you to use your own LLM-powered agent, business logic, or orchestration—while Layercode handles all the real-time voice infrastructure. ## How it works To use your own backend, click the "Connect Your Backend" button on your agent, and then set the **Webhook URL** to point to your backend's endpoint. Connect Backend When a user interacts with your voice agent, Layercode will: 1. Transcribe the user's speech to text. 2. Send an HTTP POST request to your backend at the Webhook URL you provide. 3. 
Your backend responds with a Server-Sent Events (SSE) stream containing the agent's reply (text to be spoken, and optional data).
4. Layercode handles converting the text in your response to speech and streaming it back to the user in real time.
5. Returning JSON data is also supported, allowing you to pass state back to your UI.

Layercode Diagram

## Configuring Your Agent

1. In the Layercode dashboard, open your agent and click **Connect Your Backend** (or click the edit button in the Your Backend box if you've already connected your backend previously).
2. Enter your backend's **Webhook URL** in the configuration modal.
3. Optionally, configure which webhook events you want to receive (see below).
4. Save your changes.

## Webhook Events

* **message** (required):\
  Sent when the user finishes speaking. Contains the transcribed message and metadata. Your backend should respond with an SSE stream containing the agent's reply.
* **session.start** (optional):\
  Sent when a new session is started (e.g., when a user connects). Use this to have your agent start the conversation. If disabled, the agent will wait for the user to speak first when a new session is started.

## Webhook Verification

To ensure the security of your backend, it's crucial to verify that incoming requests are indeed from Layercode. This can be done by verifying the `layercode-signature` header, which contains a timestamp and an HMAC-SHA256 signature of the request body.

Here's how you can verify the signature in your backend:

1. Retrieve the `layercode-signature` header from the request. It will be in the format: `t=timestamp,v1=signature`.
2. Get your Layercode webhook secret from the Layercode dashboard (found by going to the appropriate agent and clicking the edit button in the Your Backend box, where you'll find the Webhook Secret).
3. Reconstruct the signed payload by concatenating the timestamp, a period (`.`), and the exact raw webhook request body: `signed_payload = timestamp + "." + request_body`.
4. Compute the HMAC-SHA256 signature of this signed payload using your webhook secret.
5. Compare the computed signature with the `v1` value from the `layercode-signature` header. If they match, the request is valid.
6. (Recommended) Check that the timestamp is recent (for example, within 5 minutes) to prevent replay attacks.

## Example: Webhook Request

When a user finishes speaking, Layercode will send a POST request to your webhook with the following JSON payload body:

```json theme={null}
{
  "type": "message", // The type of webhook event: message or session.start
  "session_id": "uuid", // Session ID is unique per conversation. Use this to know which conversation a webhook belongs to.
  "turn_id": "uuid", // Turn ID is unique per turn of the conversation. This ID must be returned in all SSE events.
  "text": "What's the weather today?" // The user's transcribed message
}
```

See the [Webhook SSE API documentation](/api-reference/webhook-sse-api) for details.

## Example: SSE Response

Your backend should respond with an SSE stream. Each SSE message contains a JSON payload with the following fields: `type`, `content` (when required) and `turn_id`. See the [Webhook SSE API documentation](/api-reference/webhook-sse-api) for details.

# Keeping track of conversation history

Source: https://docs.layercode.com/explanations/conversation-history

How to persist turn-by-turn context when webhook requests can abort

Tracking conversation history seems easy.
But there is one big gotcha: webhook requests can abort. This is common in voice because of interruptions, so we need to adjust our approach.

Let's start naively. A user sends a message, so we add it to an array.

```json theme={null}
[
  { "role": "user", "turn_id": "turn-1", "content": "Hey, how do I make a hot dog?" }
]
```

And then when the assistant replies, we simply append it:

```json theme={null}
[
  { "role": "user", "turn_id": "turn-1", "content": "Hey, how do I make a hot dog?" },
  { "role": "assistant", "turn_id": "turn-2", "content": "You put the frankfurter in the bun and add some mustard." }
]
```

### But what if the user interrupts?

When the user interrupts mid-response, the **webhook request that was generating the assistant’s reply is abruptly terminated**.\
Unless we’ve already written something to memory, the assistant’s partial message could be lost.

In practice, this happens a lot with voice agents — users cut off the model to ask something new before the previous response finishes.\
If we don’t handle this carefully, our in-memory state drifts out of sync with what actually happened in the conversation. And you might not even realize, and think the LLM is just being a silly billy.

***

## So what do I need to do?

When a new user webhook arrives, persist in this order:

1. **Store the user message** right away so the turn is anchored in history.
2. **Insert the assistant placeholder** before you start streaming tokens back.

```ts theme={null}
conversationMessages[conversation_id].push({ role: "user", turn_id, content: userInput });
```

```ts theme={null}
conversationMessages[conversation_id].push({
  role: "assistant",
  turn_id,
  content: "" // placeholder
});
```

If the webhook completes successfully:

* Remove the placeholder and append final messages with the same `turn_id`.

If the webhook is aborted:

* The placeholder remains, capturing the interrupted turn.
* The next `message` webhook includes `interruption_context`, which tells us which `assistant_turn_id` was cut off.
* You can reconcile by marking that entry as interrupted.

### Example Interruption Handling

```ts theme={null}
if (interruption_context?.assistant_turn_id) {
  const prev = conversationMessages[conversation_id];
  const interrupted = prev.find(
    (m) => m.role === "assistant" && m.turn_id === interruption_context.assistant_turn_id
  );
  if (interrupted) {
    interrupted.content += " [interrupted]";
  }
}
```

This ensures that when the next user turn arrives, the model still sees every turn — even those that were cut off.

***

### Why doesn't the assistant finish the turn?

When a user interrupts, Layercode immediately cancels the webhook request that was streaming the assistant response.\
Because the request terminates, your worker never has a chance to finalize the response or append it to history.\
There is currently no back-channel for Layercode to notify your backend gracefully — cancelling the request is the only interruption signal we can provide.

This is why persisting the placeholder before you stream tokens is essential.

### Do I get an `AbortSignal`?

Layercode does not propagate a custom `AbortSignal` into your AI SDK calls.\
Instead, the framework relies on the platform aborting the request (Cloudflare Workers receive the native `ExecutionContext` cancellation). Make sure any long-running model or fetch calls can tolerate the request being torn down mid-stream; the placeholder you stored lets you recover once the next webhook arrives.

### What about multiple interruptions in a row?
Even if a user interrupts several turns back-to-back, Layercode only sends `interruption_context` for the immediately previous assistant turn.\
Persist that context as soon as the new webhook starts (before any expensive work) so it survives if another interruption happens quickly afterward.

The placeholder pattern above keeps your transcript accurate even during rapid-fire interrupts.

***

## Stored Message Shape and `turn_id`

Every stored message (user and assistant) includes a `turn_id` corresponding to the webhook event that created it:

```ts theme={null}
{ role: 'user', turn_id: '<turn_id>', content: '...' }
{ role: 'assistant', turn_id: '<turn_id>', content: '...' }
```

The initial system message does **not** have a `turn_id`.

***

## Persistence Notes

* There is no deduplication or idempotency handling yet in Layercode, so you will need to write logic to filter duplicates yourself.

***

## TL;DR

✅ Always store user messages immediately.\
✅ Add a placeholder assistant message before streaming.\
✅ Replace or mark the placeholder when the turn finishes or is interrupted.\
✅ Never rely on the webhook completing — it might abort anytime.\
✅ Keep `turn_id` and `conversation_id` consistent for reconciliation.

# How connecting to Layercode works

Source: https://docs.layercode.com/explanations/how-connect-works

Visual diagram of how your app connects to Layercode

## Fresh Page Load (New Conversation)

```mermaid theme={null}
sequenceDiagram
  participant UI as Browser UI
  participant SDK as LayercodeClient (JS SDK)
  participant Auth as POST /v1/agents/web/authorize_session
  participant DB as DB (conversations)
  participant WS as GET /v1/agents/web/websocket
  participant Pipeline as Voice Pipeline Worker
  UI->>SDK: instantiate client.connect()
  SDK->>Auth: POST { agent_id, metadata, sdk_version }
  Auth->>DB: validate pipeline/org and insert conversation
  DB-->>Auth: client_session_key + conversation_id
  Auth-->>SDK: { client_session_key, conversation_id, config }
  SDK->>WS: WebSocket upgrade ?client_session_key=...
  WS->>DB: lookup conversation via client_session_key
  WS->>Pipeline: start voicePipeline(session)
  Pipeline-->>SDK: streaming audio + events
  SDK-->>UI: onConnect({ conversationId })
```

* `authorizeSession` creates the conversation record when no `conversation_id` exists and returns a 1-hour `client_session_key`.
* The authorize request must include a valid bearer token (your API key), which is why the browser proxies it through your backend rather than calling the endpoint directly.

***

## Page Load With Stored Conversation

```mermaid theme={null}
sequenceDiagram
  participant UI as Browser UI (resuming)
  participant SDK as LayercodeClient
  participant Auth as POST /v1/agents/web/authorize_session
  participant DB
  participant WS as GET /v1/agents/web/websocket
  participant Pipeline as Voice Pipeline Worker
  UI->>SDK: client.connect()
  SDK->>Auth: POST { agent_id, conversation_id }
  Auth->>DB: fetch conversation and pipeline
  DB-->>Auth: verify ownership, update session key expiry
  Auth-->>SDK: { client_session_key, conversation_id, config }
  SDK->>WS: WebSocket upgrade using new client_session_key
  WS->>DB: validate conversation + pipeline balance
  WS->>Pipeline: resume conversation context
  Pipeline-->>SDK: stream resumes with existing turn state
```

* The SDK automatically reconnects to an existing conversation if a `conversationId` is cached.
* To start fresh, create a new client with `conversationId = null`.
* Re-authorizing rotates the `client_session_key`, so old WebSocket URLs stop working once a resume happens.
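As a minimal sketch of this resume flow in the browser (assuming the SDK's default export and the option names used in these diagrams and the REST API guide - `agentId`, `conversationId`, `authorizeSessionEndpoint`, `onConnect` - check your SDK version for the exact constructor signature):

```ts theme={null}
import LayercodeClient from "@layercode/js-sdk";

// Reuse a cached conversation ID if one exists; pass null to start a fresh conversation.
const storedId = localStorage.getItem("layercode_conversation_id");

const client = new LayercodeClient({
  agentId: "ag-123456", // your agent ID
  conversationId: storedId ?? null,
  authorizeSessionEndpoint: "/api/authorize", // your backend proxy to authorize_session
  onConnect: ({ conversationId }) => {
    // Cache the conversation ID so the next page load resumes this conversation.
    if (conversationId) localStorage.setItem("layercode_conversation_id", conversationId);
  },
});

client.connect();
```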
***

## Network Drop and Manual Reconnect

```mermaid theme={null}
sequenceDiagram
  participant UI as Browser UI
  participant SDK as LayercodeClient
  participant WS as WebSocket Connection
  participant Auth as POST /v1/agents/web/authorize_session
  participant DB
  participant Pipeline as Voice Pipeline Worker
  WS-xSDK: network drop / close event
  SDK->>SDK: _performDisconnectCleanup() (status=disconnected)
  SDK-->>UI: onDisconnect() (show reconnect)
  UI->>SDK: user clicks reconnect
  SDK->>Auth: POST { agent_id, conversation_id }
  Auth->>DB: update client_session_key, ensure balance
  Auth-->>SDK: { client_session_key, conversation_id }
  SDK->>WS: establish new WebSocket ?client_session_key=...
  WS->>Pipeline: restart transport against same conversation
  Pipeline-->>SDK: continue streaming and emit onConnect
```

* Device listeners, VAD, and amplitude monitors are rebuilt on reconnect.
* The cached `conversationId` persists, so the next `authorize` call resumes seamlessly.
* To force a fresh run after a drop, instantiate a new client with `conversationId = null` before reconnecting.

# How Layercode works

Source: https://docs.layercode.com/explanations/how-layercode-works

The fastest way to add production-ready, low-latency voice to your AI agents.

Layercode Diagram

Our cloud platform powers the real-time infrastructure required to deliver responsive, engaging voice interfaces—so you can focus on building exceptional conversational experiences.

## Why Layercode?

* **Low-latency, production-grade voice agents**\
  Deliver natural, real-time conversations to your users, wherever they are.
* **Full control, zero lock-in**\
  Easily configure your agent, swap between leading voice model providers, and plug in your own agent backend with a single webhook.
* **Build voice agents for the web, mobile or phone**\
  Add voice into your web and mobile apps. Coming soon: handle incoming and outgoing calls with your voice agent.
* **Powerful, flexible voice agents**\
  Mix and match audio processing plugins, transcription, and text-to-speech models. Support for 32+ languages and 100+ voices.
* **Global scale and reliability**\
  Our network spans 330+ locations worldwide, ensuring every interaction is smooth, fast and reliable - wherever your users are.
* **Transparent pricing and flexible billing**\
  Only pay for what you use, per minute. No concurrency limits. Cartesia and ElevenLabs text-to-speech now run on your own API keys, while we continue to consolidate remaining managed provider usage into a single bill.

## What can you build?

Layercode is built for developers who want to:

* Add voice to LLM-powered agents and apps
* Build custom, multi-lingual voice assistants
* Support web, mobile and phone (coming soon) voice agents
* Integrate voice into customer support, sales, training, and more
* Use the latest voice AI models - without vendor lock-in

## Ready to get started?

[Create your first real-time voice agent →](../tutorials/getting-started)

# Reducing latency with Layercode

Source: https://docs.layercode.com/explanations/latency

How to reduce latency with your voice AI agents.

Reducing latency - especially time-to-first-token - is important for natural-sounding conversations. There are some things that we will always work hard on reducing (e.g. transporting your audio across the internet). But some latency is based on choices and trade-offs you can make. And there are some things that won't reduce latency directly but may reduce the feeling of latency.
A lot of these are even more important if you are doing tool calls or letting agents run in loops, which can take a long time to complete.

Here are some tips that could help you reduce latency (or perceived latency) with your voice agents:

1. **Pick a low-TTFT model.** We currently recommend Gemini 2.5 Flash-Lite or OpenAI gpt-4o-mini because they deliver the quickest time-to-first-token. Avoid “thinking” or reasoning-extended variants unless you explicitly need them—they trade large amounts of latency for marginal quality gains in spoken conversations.
2. **Prime the user with speech before long work.** Inside a tool call, send a `response.tts` event such as “Let me look that up for you” before you start heavy processing. The SDK will surface it to the client as audio immediately, buying you time without leaving silence. See [the tool calling how-to](/how-tos/tool-calling-js#sending-speech-to-the-user-to-tell-them-a-call-is-happening) for an example.
3. **Keep users informed during long tool calls.** Emit a `response.data` message as soon as the work starts so the UI can surface a loader or status update—see [Sending data to the client](/how-tos/sending-data-to-client) and the API reference for [Data and state updates](/api-reference/frontend-ws-api#data-and-state-updates). You can also play a short “thinking” audio clip in the browser so the user hears that the agent is still busy.
4. **Be deliberate with RAG.** Running retrieval on every turn (especially in loops) adds network hops and can stall a conversation. Fetch external data through tool calls only when it’s needed, and narrate what the agent is doing so the user understands the delay.
5. **Reduce infrastructure round trips.** Store conversations in a fast, nearby database—Redis is a good default—and keep ancillary services in the same region as your Layercode deployment to avoid cross-region latency spikes.

# Speech to text providers

Source: https://docs.layercode.com/explanations/speech-to-text

Transcription engines available in the Layercode pipeline.

Layercode keeps speech recognition modular so you can match the right engine to each pipeline. Today we offer Deepgram's latest streaming stack as the single speech-to-text integration available in production, delivering the fastest and most accurate in-call transcription we support.

## Deepgram Nova-3 (primary streaming)

* **Model**: `nova-3`, Deepgram's flagship speech-to-text model tuned for high-accuracy, low-latency conversational AI.
* **Real-time features**: Smart formatting, interim hypotheses, and MIP opt-out are all enabled to optimize conversational turn taking out of the box. Nova-3's latency profile keeps responses within the sub-second expectations of interactive agents.
* **Audio formats**: We normalize audio to 8 kHz linear PCM or μ-law depending on transport requirements, so Nova-3 receives the clean signal it expects in both browser and telephony scenarios.
* **Connectivity**: Choose the managed Cloudflare route (provider `deepgram`) or connect directly to Deepgram (`deepgram_cloud`). We merge your configuration with our defaults and set the appropriate authorization headers for each mode.

Add your Deepgram API key in **Settings → Providers** to unlock direct Deepgram access. If you do not supply a key, the pipeline uses our Cloudflare-managed path with the same Nova-3 model.
Deepgram Nova-3 combines the accuracy improvements announced in their [Nova-3 launch](https://deepgram.com/learn/introducing-nova-3-speech-to-text-api) with Layercode's interruption handling to keep transcripts actionable in real time. # Text to speech providers Source: https://docs.layercode.com/explanations/text-to-speech How Layercode streams audio with Cartesia, ElevenLabs, and Rime. Layercode supports three real-time text to speech (TTS) integrations. Each runs inside the same low-latency pipeline, but the configuration, pricing, and recommended use cases differ. Rime is the only managed (non-BYOK) option; Cartesia and ElevenLabs require your own credentials. ## Cartesia (bring your own key) * **Model**: `sonic-2`, the model we configure in the Layercode pipeline. * **Voices**: Starts with the "Mia" preset (`1d3ba41a-96e6-44ad-aabb-9817c56caa68`), with support for any Cartesia voice ID. * **Audio formats**: Streams 16 kHz PCM by default and can downshift to 8 kHz μ-law for phone use. * **Timestamps**: Word-level timestamps are enabled automatically, making Cartesia ideal when you need precise interruption handling. Use Cartesia when you already manage a Cartesia account and want detailed timestamps with full access to Cartesia's voice library. Add your Cartesia API key on the **Settings → Providers** page to activate streaming; without a key we fall back to the managed Rime voice. ## ElevenLabs (bring your own key) * **Model**: `eleven_v2_5_flash`, the streaming model Layercode enables by default. * **Voices**: Defaults to the "Alloy" voice but accepts any ElevenLabs voice ID plus optional stability/similarity controls. * **Audio formats**: Streams 16 kHz PCM for the web and 8 kHz μ-law for telephony scenarios. * **Timestamps**: Character-level alignment is requested (`sync_alignment=true`) so you receive live timestamps for captions and interruptions. Choose ElevenLabs when you want to leverage your existing ElevenLabs voices or studio cloning features. Provide your ElevenLabs API key in **Settings → Providers**; pipelines without a key automatically move to the managed Rime voice. ## Rime (managed by Layercode) * **Model**: `mistv2`, the default managed voice inside Layercode. Mist v2 delivers unmatched accuracy, speed, and customization at scale—ideal for high-volume, business-critical conversations. * **Voices**: Ships with "Ana" out of the box, and we frequently use "Courtney" for fallbacks; any Rime speaker ID is supported to match the tone you need. * **Audio formats**: Streams PCM, MP3, or μ-law depending on your transport, so it works for the web and PSTN without extra conversion. * **Timestamps**: Provides streaming timestamps for accurate barge-in and captioning, helping you maintain fast turn taking. Rime is the easiest way to get started: Layercode manages the credentials, so it works immediately even if you have not supplied any third-party keys. Mist v2's precision voices help convert prospects, retain customers, and drive sales with messages that resonate, making it a strong default when you prefer consolidated billing. ## Picking the right provider * Start with **Rime** if you want instant setup with managed billing. * Switch to **Cartesia** when you own a Cartesia account and need high-fidelity voices with detailed timestamps. * Use **ElevenLabs** when you need ElevenLabs' cloned voices or multilingual catalog and can provide your own key. 
You can mix and match providers per pipeline, so experiment with different voices and formats to find the best fit for your experience.

# Tool calling

Source: https://docs.layercode.com/explanations/tool-calling

How to set up tool calling with Layercode. Also known as function calling.

Function calling is one of the first things you will want to do after setting up your agent. Because Layercode lets you work directly with text, you can use existing tools. There are many frameworks which can help you with function calling.

## TypeScript:

* [ai SDK](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling)
* [mastra](https://mastra.ai/en/examples/tools/calling-tools#from-an-agent) - see [example here](https://github.com/jackbridger?tab=repositories)

## Python:

* [LlamaIndex](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/)
* [LangChain](https://python.langchain.com/docs/concepts/tool_calling/)
* [CrewAI](https://docs.crewai.com/en/concepts/tools)

We have written a guide on [tool calling in Next.js with Layercode](/how-tos/tool-calling-js)

# Turn Taking

Source: https://docs.layercode.com/explanations/turn-taking

Choosing the right turn taking strategy for your voice application is key to building a successful voice AI experience.

Layercode supports multiple turn taking modes, so you can choose the best one for your use case. The best Turn Taking Mode to use depends on your voice application's use case and the environment your users are in. You may need to experiment with different modes to find the best fit for your application.

## Automatic Mode

For most use cases, the default "Automatic" turn taking mode (with Can Interrupt enabled) is the best option to begin with. This will let users speak freely to the AI, and interrupt it at any time. But if your users are in a noisy environment you may find that this noise inadvertently interrupts the AI's response mid-sentence.

One solution to this is to disable Can Interrupt. In this case the user's response will only be listened to after the AI has finished speaking. The user will not be able to interrupt the AI mid-sentence, and will always have to wait for the AI to finish. The downside of this approach is that users may become impatient if the AI's responses are long.

## Push to Talk Mode

When building voice AI for the web or mobile, you can enable Push to Talk mode. This mode requires a small config change in your web or app frontend (we include this in all our demo apps). In this mode, the user must hold down a button to speak. When the user holds down the button, their speech is transcribed. When the user releases the button, the AI will respond. This mode is great for noisy environments, or situations where you want the user to have complete control over the conversation.

# How webhooks work with Layercode

Source: https://docs.layercode.com/explanations/webhooks

How to receive events from Layercode

Layercode delivers conversation updates to your backend through HTTPS webhooks. Each time a user joins, speaks, or finishes a session, the voice pipeline posts JSON to the webhook URL configured on your agent. In reply, your backend can stream text replies back with Server-Sent Events (SSE), and Layercode will use a text to speech model to return voice back to your user.

We tell your backend - in text - what the user said. And your backend tells Layercode - in text - what to speak back to the user.
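To make that concrete, here is a rough sketch of the SSE messages a backend might stream back for a single turn. Each message is a JSON payload with a `type`, optional `content`, and the `turn_id` from the webhook request; the full event set is described in the [Webhook SSE API](/api-reference/webhook-sse-api):

```txt theme={null}
data: {"type":"response.tts","content":"Thanks for your message!","turn_id":"turn_xyz123"}

data: {"type":"response.data","content":{"status":"replied"},"turn_id":"turn_xyz123"}
```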
## Receiving requests from Layercode

In order to receive and process messages from your users, you need a backend endpoint that Layercode can communicate with. For example, in Next.js it might look something like this:

```ts theme={null}
export const dynamic = 'force-dynamic';
import { streamResponse } from '@layercode/node-server-sdk';

export const POST = async (request: Request) => {
  const requestBody = (await request.json()) as WebhookRequest;
  // Authorization goes here! (explained below)

  const { text: userText } = requestBody;
  console.log("user said: ", userText);

  return streamResponse(requestBody, async ({ stream }) => {
    // This is where all your LLM stuff can go to generate your response
    const aiResponse = "thank you for your message"; // this would be dynamic in your application
    await stream.ttsTextStream(aiResponse);
    stream.end(); // close the SSE stream once the reply is finished
  });
};
```

*Note: authorization is below*

## Tell Layercode where your endpoint is

Now you have an endpoint to receive messages from Layercode, you need to tell Layercode where to send your events. Go to Layercode's dashboard, and create or use an existing agent. Go to manual setup and type in the API endpoint that Layercode should send requests to.

Setting a webhook URL

If your endpoint is at your root, use the URL of your host. If it's at /voice-agent, use your host/voice-agent. If you're using one of our [Next.js examples](https://github.com/layercodedev/fullstack-nextjs-cloudflare/blob/main/app/api/agent/route.ts), you will see the path to receive the requests from Layercode is /api/agent

### Expose your local endpoint with a tunnel

If you're developing locally, you will need to run a tunnel such as cloudflared or ngrok and paste the tunnel URL into the dashboard (with the path of your endpoint in your app appended - for example *tunnel-url*/api/agent). Our [tunnelling guide](/how-tos/tunnelling) walks through the setup.

## Verify incoming requests

You should make sure that only authorized requests are sent to this endpoint. To do this, we expose a secret that you can find in the same location you used above. Save this secret with the other secrets in your backend and use it to verify each incoming request:

```ts theme={null}
export const dynamic = 'force-dynamic';
import { streamResponse, verifySignature } from '@layercode/node-server-sdk';

export const POST = async (request: Request) => {
  // Verify this webhook request is from Layercode, using the exact raw request body
  const rawRequestBody = await request.text();
  const signature = request.headers.get('layercode-signature') || '';
  const secret = process.env.LAYERCODE_WEBHOOK_SECRET || '';
  const isValid = verifySignature({ payload: rawRequestBody, signature, secret });
  if (!isValid) return new Response('Invalid layercode-signature', { status: 401 });

  const requestBody = JSON.parse(rawRequestBody) as WebhookRequest;
  const { text: userText } = requestBody;
  console.log("user said: ", userText);

  return streamResponse(requestBody, async ({ stream }) => {
    // This is where all your LLM stuff can go to generate your response
    const aiResponse = "thank you for your message"; // this would be dynamic in your application
    await stream.ttsTextStream(aiResponse);
    stream.end(); // close the SSE stream once the reply is finished
  });
};
```

## Customize which events you receive

You can see details on the data that Layercode [sends to this endpoint here](/api-reference/webhook-sse-api). You can also toggle the events you want delivered:

* `message` – (required) Fired after speech-to-text transcription completes for the user’s turn.
* `session.start` – Sent as soon as a session opens so you can greet the user proactively.
* `session.end` – Delivered when a session closes, including timing metrics and the full transcript.
* `session.update` – Sent asynchronously once a session recording finishes processing (requires session recording to be enabled for the org).

## Respond to webhook events

It's great to receive messages from users, but of course you want to reply too. Use the method on Layercode's stream object to reply:

`await stream.ttsTextStream("this is my reply");`

# Deploy Next.js to Cloudflare

Source: https://docs.layercode.com/how-tos/deploy-nextjs-to-cloudflare

Some tips when deploying a Next.js voice agent to Cloudflare

Layercode runs in our cloud, but you will need to deploy your Next.js application to provide your APIs and agent functionality (LLMs and tool calling). Plus, if you are building for web, your Next.js app acts as the client. This guide assumes you already have your Next.js application running locally with Layercode. If not, please follow our [getting started guide](/tutorials/getting-started)

If you are using our Cloudflare getting-started project, you can simply run `npm run deploy`

Otherwise, you should run

```bash theme={null}
npm i @opennextjs/cloudflare
```

If it doesn't exist already, add a deploy script in your `package.json`

```json theme={null}
"deploy": "opennextjs-cloudflare build && opennextjs-cloudflare deploy"
```

Then run

```bash theme={null}
npm run deploy
```

You will be asked to create/connect a Cloudflare account if you don't already have one connected. Note: you will need to use npm to deploy to Cloudflare because it expects a `package-lock.json` file.

You should see an output like this:

```
Total Upload: 5867.42 KiB / gzip: 1177.82 KiB
Worker Startup Time: 25 ms
Your Worker has access to the following bindings:
Binding      Resource
env.ASSETS   Assets
Uploaded jolly-queen-84e7 (16.45 sec)
Deployed jolly-queen-84e7 triggers (4.70 sec)
  https://jolly-queen-84e7.jacksbridger.workers.dev
Current Version ID: 047446f6-055e-46b0-b67a-b45cb14fa8e8
```

Take that URL (e.g. [https://jolly-queen-84e7.jacksbridger.workers.dev](https://jolly-queen-84e7.jacksbridger.workers.dev)) of your backend and save it into the Layercode agent backend settings under webhook URL (append the appropriate path for your API, e.g. [https://jolly-queen-84e7.jacksbridger.workers.dev/api/agent](https://jolly-queen-84e7.jacksbridger.workers.dev/api/agent))

Then your application should run. But please reach out if you run into any issues.

## Setting up automated Cloudflare deployments

You can use [Cloudflare Workers Builds](https://developers.cloudflare.com/workers/ci-cd/builds/) to deploy your application on GitHub commits. You connect your GitHub repository to your Worker by following [these steps](https://developers.cloudflare.com/workers/ci-cd/builds/git-integration/). In the Build settings:

* The "Build command" should be set to `npx opennextjs-cloudflare build`.
* The "Deploy command" should be set to `npx opennextjs-cloudflare deploy`.
* The environment variables you previously set in `.env` **must** be copied and set in the "Build variables and secrets" section. This is so that the `next build` executed by Workers Builds will have access to the environment variables. It needs that access to inline the NEXT\_PUBLIC\_... variables and access non-NEXT\_PUBLIC\_... variables needed for SSG pages. If you don't do this, you'll find the NEXT\_PUBLIC\_LAYERCODE\_AGENT\_ID env variable is missing and your voice agent won't work.

Note: do not change your `package.json` build command. It should stay as `next build`.
# Deploy Next.js to Vercel

Source: https://docs.layercode.com/how-tos/deploy-nextjs-to-vercel

Some tips when deploying a voice agent to Vercel

Layercode runs in our cloud, but you will need to deploy your Next.js application to provide your APIs and agent functionality (LLMs and tool calling). Plus, if you are building for web, your Next.js app acts as the client. This guide assumes you already have your application running locally with Layercode. If not, please follow our [getting started guide](/tutorials/getting-started)

To deploy to Vercel:

1. Push your changes to a remote repo (i.e. GitHub/GitLab).
2. Sign up at Vercel, then click Add New Project
3. Import your Git Repository
4. Paste in your environment variables from `.env`
5. Deploy
6. Take that URL (e.g. [https://fullstack-nextjs-vercel-five.vercel.app/](https://fullstack-nextjs-vercel-five.vercel.app/)) of your backend and save it into the Layercode agent backend settings under webhook URL (append the appropriate path for your API, e.g. [https://fullstack-nextjs-vercel-five.vercel.app/api/agent](https://fullstack-nextjs-vercel-five.vercel.app/api/agent))

### Troubleshooting authentication issues

When deploying to Vercel, you MUST disable Vercel Authentication to allow Layercode webhooks to be received. By default on pro plans, Vercel blocks external requests to your application's /api routes. This means that Layercode webhooks will not be received by your application, and your voice agent will not work.

Disable Vercel Authentication by going to your project settings in the Vercel dashboard, then go to "Deployment Protection" in the left sidebar menu, then turn off "Vercel Authentication" and save. You do not need to redeploy. You can check your Webhook Logs in the Layercode dashboard to ensure that webhooks are being received successfully. If you receive a 405 error response to webhooks, this indicates that Vercel Authentication is still enabled. Note: if you're on a free tier, you may not need to make this change.

# Deploying to production

Source: https://docs.layercode.com/how-tos/deploying

Point Layercode to your production backend and manage environments

Use this guide when moving from local development (tunnel Webhook URL) to a stable production deployment.

## Set your production Webhook URL

In the Layercode dashboard:

1. Open the agent you want to be your production agent and click **Connect Your Backend**
2. Set your Webhook URL to your production endpoint, e.g. `https://your-domain.com/api/agent`
3. Save changes

Use separate Layercode agents for production and for development or staging. Point each to its own backend URL. Keep your production Webhook URL stable and use staging agents for preview builds.

## Verify webhook signature in production

Keep signature verification enabled in your `/api/agent` route. This protects your app from spoofed requests.

## Manage environments

Store your agent IDs in environment variables and swap values per environment. For example:

```bash theme={null}
# .env
NEXT_PUBLIC_LAYERCODE_AGENT_ID=prod_agent_id
```

Use a different value in development or staging so each environment connects to the correct agent.
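For instance, your session authorization endpoint can read the agent ID from the environment rather than hard-coding it. A minimal sketch (it assumes the authorize-route shape shown in the SDK reference, and simply passes Layercode's response through):

```ts theme={null}
// app/api/authorize/route.ts (sketch)
export const POST = async (request: Request) => {
  // Each environment (.env, staging, production) supplies its own agent ID
  const agentId = process.env.NEXT_PUBLIC_LAYERCODE_AGENT_ID;

  const response = await fetch('https://api.layercode.com/v1/agents/web/authorize_session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
    },
    body: JSON.stringify({ agent_id: agentId, conversation_id: null }),
  });

  // Forward Layercode's response (client_session_key, conversation_id) to the frontend
  return new Response(await response.text(), { status: response.status });
};
```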
# Connect to MCP servers with AI SDK

Source: https://docs.layercode.com/how-tos/mcp-ai-sdk

How to get your voice agents to use Model Context Protocol (MCP) tools with AI SDK and Layercode

It can be useful for your voice agents to use [Model Context Protocol (MCP)](https://modelcontextprotocol.io) to fetch live data or perform external actions — for example, retrieving docs, querying databases, or running custom APIs. This guide shows you how to connect your **AI SDK** app to an **MCP server** and expose those tools to your **Layercode voice agent**.

***

## Prerequisites

This guide assumes you already have **tool calling** set up and working with Layercode. If not, start here first:
👉 [Tool calling in Next.js with Layercode](https://docs.layercode.com/how-tos/tool-calling-js)

Once that's working, you can extend your agent with **MCP-based tools**.

***

## Example Setup

> **Note:** The MCP URL `https://docs.layercode.com/mcp` below is just an example endpoint that connects to the **Layercode Docs MCP server**.
> Replace this with your **own MCP server URL** — for example, one that connects to your company's data, APIs, or private knowledge.

```ts theme={null}
import { openai } from '@ai-sdk/openai';
import { streamText, stepCountIs, experimental_createMCPClient, tool } from 'ai';
import { streamResponse } from '@layercode/node-server-sdk';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';
import z from 'zod';

export const POST = async (request: Request) => {
  const requestBody = await request.json();
  const { conversation_id, text, turn_id } = requestBody;

  return streamResponse(requestBody, async ({ stream }) => {
    // ✅ Create a fresh MCP transport per request
    const transport = new StreamableHTTPClientTransport(new URL('https://docs.layercode.com/mcp'));
    const docsMCP = await experimental_createMCPClient({ transport });

    try {
      const docsTools = await docsMCP.tools();

      const weather = tool({
        description: 'Get the weather in a location',
        inputSchema: z.object({ location: z.string().describe('The location to get the weather for') }),
        execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10 }),
      });

      const { textStream } = streamText({
        model: openai('gpt-4o-mini'),
        system: 'You are a helpful assistant.',
        messages: [{ role: 'user', content: text }],
        tools: { weather, ...docsTools },
        toolChoice: 'auto',
        stopWhen: stepCountIs(10),
        onFinish: async ({ response }) => {
          console.log('MCP Response Complete', response);
          stream.end();
        },
      });

      await stream.ttsTextStream(textStream);
    } finally {
      // ✅ Clean up the MCP connection
      await docsMCP.close();
    }
  });
};
```

# Outbound calls with Twilio

Source: https://docs.layercode.com/how-tos/outbound-calls

Using your Layercode Agent to make outbound phone calls

You will need:

* A Layercode Agent with an assigned Twilio phone number (see [Inbound calls with Twilio](/how-tos/setting-up-twilio))

This guide walks you through triggering an outbound phone call from your Layercode Agent. To trigger an outbound call, use the [`https://api.layercode.com/v1/agents/AGENT_ID/calls/initiate_outbound` endpoint](/api-reference/rest-api#initiate-outbound-call). You can call this endpoint from your backend whenever you want to initiate a call. You must have already set up your Layercode Agent to work with Twilio. If you haven't done that yet, see [Inbound calls with Twilio](/how-tos/setting-up-twilio).
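If you're triggering calls from a TypeScript backend, the request might look like this (a minimal sketch; `AGENT_ID` and both phone numbers are placeholders, and the curl equivalent follows below):

```ts theme={null}
const response = await fetch('https://api.layercode.com/v1/agents/AGENT_ID/calls/initiate_outbound', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
  },
  body: JSON.stringify({
    from_phone_number: 'NUMBER_ASSIGNED_TO_YOUR_AGENT',
    to_phone_number: 'PHONE_NUMBER_TO_CALL',
  }),
});
console.log(await response.json());
```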
Go to the REST API docs for **[more details about calling initiate\_outbound](/api-reference/rest-api#initiate-outbound-call)**.

### Example Request

```bash theme={null}
curl -X POST https://api.layercode.com/v1/agents/ag-123456/calls/initiate_outbound \
  -H "Authorization: Bearer $LAYERCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from_phone_number": "NUMBER_ASSIGNED_TO_YOUR_AGENT",
    "to_phone_number": "PHONE_NUMBER_TO_CALL"
  }'
```

# How to write prompts for voice agents

Source: https://docs.layercode.com/how-tos/prompting

Some quick examples and tips for writing prompts for voice AI.

Using the right system prompt is especially important when building voice AI agents. LLMs are primarily trained on written text, so they tend to produce output that is more formal and structured than natural speech. By carefully crafting your prompt, you can guide the model to generate responses that sound more conversational and human-like.

# Base System Prompt for Voice AI

```text Minimal base prompt for Voice AI theme={null}
You are a helpful conversational voice AI assistant. You are having a spoken conversation.
Your responses will be read aloud by a text-to-speech system.
You should respond to the user's message in a conversational manner that matches spoken word.
Punctuation should still always be included. Never output markdown, emojis or special characters.
Use contractions naturally.
```

# Pronunciation of numbers, dates & times

Pronunciation of numbers, dates, times, and special characters is also crucial for voice applications. TTS (text-to-speech) providers handle pronunciation in different ways. A good base prompt that guides the LLM to use words to spell out numbers, dates, addresses etc. will work for common cases.

```text Numbers & data rules theme={null}
Convert the output text into a format suitable for text-to-speech. Ensure that numbers, symbols, and abbreviations are expanded for clarity when read aloud. Expand all abbreviations to their full spoken forms.
Example input and output:
"$42.50" → "forty-two dollars and fifty cents"
"£1,001.32" → "one thousand and one pounds and thirty-two pence"
"1234" → "one thousand two hundred thirty-four"
"3.14" → "three point one four"
"555-555-5555" → "five five five, five five five, five five five five"
"2nd" → "second"
"XIV" → "fourteen" - unless it's a title, then it's "the fourteenth"
"3.5" → "three point five"
"⅔" → "two-thirds"
"Dr." → "Doctor"
"Ave." → "Avenue"
"St." → "Street" (but saints like "St. Patrick" should remain)
"Ctrl + Z" → "control z"
"100km" → "one hundred kilometers"
"100%" → "one hundred percent"
"elevenlabs.io/docs" → "eleven labs dot io slash docs"
"2024-01-01" → "January first, two-thousand twenty-four"
"123 Main St, Anytown, USA" → "one two three Main Street, Anytown, United States of America"
"14:30" → "two thirty PM"
"01/02/2023" → "January second, two-thousand twenty-three" or "the first of February, two-thousand twenty-three", depending on locale of the user
```

# Enable push-to-talk in React/Next.js

Source: https://docs.layercode.com/how-tos/push-to-talk

Configure push-to-talk turn taking with the Layercode React SDK.

By default, Layercode agents use automatic turn taking. If you prefer explicit control—press and hold to speak—enable push-to-talk in your agent and wire up the callbacks in your UI.

## 1) Enable push-to-talk in the dashboard

In your agent panel on [https://dash.layercode.com/](https://dash.layercode.com/) → Transcriber → Settings → set Turn Taking to Push to Talk → Save your changes.
## 2) Use the React SDK callbacks

When using push-to-talk, call `triggerUserTurnStarted()` when the user begins speaking (pressing the button), and `triggerUserTurnFinished()` when they stop (releasing the button).

```tsx app/ui/VoiceAgentPushToTalk.tsx theme={null}
'use client';
import { useLayercodeAgent } from '@layercode/react-sdk';

export default function VoiceAgentPushToTalk() {
  const { status, triggerUserTurnStarted, triggerUserTurnFinished } = useLayercodeAgent({
    agentId: process.env.NEXT_PUBLIC_LAYERCODE_AGENT_ID!,
    authorizeSessionEndpoint: '/api/authorize',
  });

  return (
    <button
      onMouseDown={triggerUserTurnStarted}
      onMouseUp={triggerUserTurnFinished}
      disabled={status !== 'connected'}
    >
      Hold to speak
    </button>
  );
}
```

Turn taking is explained conceptually in our [Turn taking guide](/explanations/turn-taking).

## What gets sent when you press the button

The React SDK continuously captures microphone audio into a short rolling buffer even while the button is idle. When you call `triggerUserTurnStarted()`, we immediately flush roughly one second of pre-roll audio along with anything you speak while the button stays down. This keeps the start of the utterance intact, so agents hear the full word instead of a clipped syllable. You can fine-tune the pre-roll length with the `vad.buffer_frames` agent setting. Each frame represents about 100 ms of audio, so lowering the value shortens the buffer and raising it adds more context before the press.

# Send text messages from the client

Source: https://docs.layercode.com/how-tos/send-text-messages

Capture text input in your UI and hand it to a Layercode agent without streaming audio.

Layercode agents normally consume live microphone audio, but some experiences need a text fallback—think chat bubbles, accessibility flows, or quick corrections while the mic is muted. The WebSocket API and SDKs expose `sendClientResponseText` for exactly that: send the full utterance as text, close the user turn, and let the agent reply immediately. This guide shows how to wire text messages in both Vanilla JS and React.

## 1) Vanilla JS example

The `LayercodeClient` instance exposes `sendClientResponseText`. Add a simple form that forwards the entered text and clears the field when submitted.

```html send-text.html theme={null}
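<!-- A minimal sketch (the original example markup was missing here, so this is an
     assumed reconstruction): a text input that forwards the message to the agent
     via sendClientResponseText. Element IDs and agent ID are placeholders. -->
<form id="text-form">
  <input id="text-input" type="text" name="message" placeholder="Type a message" />
  <button type="submit">Send</button>
</form>
<script type="module">
  import LayercodeClient from "https://cdn.jsdelivr.net/npm/@layercode/js-sdk@latest/dist/layercode-js-sdk.esm.js";

  const client = new LayercodeClient({
    agentId: "your-agent-id",
    authorizeSessionEndpoint: "/api/authorize",
  });
  client.connect();

  document.getElementById("text-form").addEventListener("submit", (event) => {
    event.preventDefault();
    const input = document.getElementById("text-input");
    const message = input.value.trim();
    if (!message) return;
    client.sendClientResponseText(message); // closes the user turn and sends the text
    input.value = "";
  });
</script>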
```

What happens when you call `sendClientResponseText`:

* The current user turn is closed, even if no audio was streamed.
* A `user.transcript` event is emitted so your UI stays in sync.
* The agent receives the text message through the regular webhook path and responds immediately.

## 2) React example

The React SDK exposes the same capability via the `useLayercodeAgent` hook. Grab the `sendClientResponseText` method from the hook and call it from your form handler.

```tsx app/components/TextReplyForm.tsx theme={null}
'use client';
import { FormEvent } from 'react';
import { useLayercodeAgent } from '@layercode/react-sdk';

export function TextReplyForm() {
  const { status, sendClientResponseText } = useLayercodeAgent({
    agentId: process.env.NEXT_PUBLIC_LAYERCODE_AGENT_ID!,
    authorizeSessionEndpoint: '/api/authorize',
  });

  const handleSubmit = (event: FormEvent<HTMLFormElement>) => {
    event.preventDefault();
    const form = event.currentTarget;
    const data = new FormData(form);
    const message = (data.get('message') as string).trim();
    if (!message) return;
    sendClientResponseText(message);
    form.reset();
  };

  return (
    <form onSubmit={handleSubmit}>
      <input name="message" type="text" placeholder="Type a message" disabled={status !== 'connected'} />
      <button type="submit" disabled={status !== 'connected'}>
        Send
      </button>
    </form>
  );
}
```

Disable the form while the client is still connecting so you do not queue messages before a session exists.

# Sending data to your client from Layercode stream

Source: https://docs.layercode.com/how-tos/sending-data-to-client

How to send data to your client via the Layercode stream

Sometimes you will want your Layercode stream to include additional data. For example, you might want to tell the user that the LLM is thinking or looking something up. To do this, you can use the `stream.data` method. For example:

```ts theme={null}
stream.data({ status: 'thinking' })
```

And on the client side, you can receive the data you send:

```tsx theme={null}
const { } = useLayercodeAgent({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  onDataMessage: (data) => console.log("Received data:", data), // {status: 'thinking'}
});
```

# Inbound calls with Twilio

Source: https://docs.layercode.com/how-tos/setting-up-twilio

Setting up a voice agent to receive phone calls for you

You will need:

* A Twilio account
* A Twilio phone number (can be a trial number)
* Your Twilio Account SID and Auth Token

This guide walks you through configuring Layercode to answer calls to your Twilio phone number. If you'd like to trigger outbound calls from your Layercode Agent, see [Outbound calls with Twilio](/how-tos/outbound-calls).

1. Go to the Layercode dashboard at [https://dash.layercode.com](https://dash.layercode.com) and select your agent.
2. Open the client settings, enable Twilio phone calls, and then save changes.
3. Go to the Layercode settings at [https://dash.layercode.com/settings](https://dash.layercode.com/settings).
4. Add your Twilio Account SID and Auth Token, then save. Twilio recently changed where the Auth Token and Account SID are displayed. In the Twilio Console, use the search bar to find “Account SID” and “Auth Token”.
5. Return to your agent's client settings. You should now be able to select a Twilio phone number. If you don't see your number, refresh the page. Ensure the number is in the same Twilio account as the credentials you added. You can assign multiple Twilio phone numbers to a single agent. For each call, Layercode stores the from/to phone numbers (and country codes) on the session. See the [REST API](/api-reference/rest-api#sessions) for retrieving these details along with transcripts and recordings.
6. Test by calling the number. For a quick check, set a short welcome message in Layercode (for example, "Hello from Layercode").
7. To run Twilio in production, you will need a backend where you can run your LLM flow. You should review one of our backend tutorials, for example our [Next.js quick start](/tutorials/getting-started). And you can consult the [reference on webhooks](/api-reference/webhook-sse-api#webhook-request-payload) to see how you can receive the `from_phone_number` and `to_phone_number`.

# Tool calling in Next.js with Layercode

Source: https://docs.layercode.com/how-tos/tool-calling-js

How to set up tool calling in Next.js with Layercode and ai sdk.

Here's how to set up tool calling in Next.js. Make sure you have `ai` and `zod` installed.

### Install ai sdk and zod

```bash npm theme={null}
npm install ai zod
```

```bash pnpm theme={null}
pnpm add ai zod
```

```bash yarn theme={null}
yarn add ai zod
```

```bash bun theme={null}
bun add ai zod
```

In your backend, where your agent is running, import `tool` and `stepCountIs` from `ai`, and import `zod`.
Note: you probably already imported `streamText` and `ModelMessage`

```ts theme={null}
import { streamText, ModelMessage, tool, stepCountIs } from 'ai';
import z from 'zod';
```

Inside the callback of your Layercode `streamResponse`, in the case of a message received, initialize a tool. For instance, `weather`:

```ts theme={null}
const weather = tool({
  description: 'Get the weather in a location',
  inputSchema: z.object({ location: z.string().describe('The location to get the weather for') }),
  execute: async ({ location }) => ({
    location,
    temperature: 72 + Math.floor(Math.random() * 21),
  }),
});
```

Then set

```ts theme={null}
tools: { weather }
```

as a property inside `streamText`. You should also set these properties:

```ts theme={null}
toolChoice: 'auto',
stopWhen: stepCountIs(10),
```

You can find more info in the [ai sdk docs](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling).

Once you have this, make sure your prompt mentions that the tool is available. For example, add "you can use the weather tool to find the weather for a given location." Now it should let you query the weather, and you'll see a different temperature (between 72 and 92) each time because the function has some randomness.

## Next steps: telling the user that tool calling is happening

One thing many developers wish to do is tell the user that a tool call is happening so they don't expect an immediate response. To do this, your tools can notify the client that there is a tool call in progress. This guide shows you [how you can do that](/how-tos/sending-data-to-client).

## Sending speech to the user while a tool call is happening

If you anticipate a long tool call, you may want to send a spoken message, such as "just a moment, let me grab that for you." With ai sdk, you can do that by calling Layercode's `stream.tts` at the start of your `execute` function. Note that the tool must be defined inside your Layercode `streamResponse` callback function so that it has access to `stream`.

```ts theme={null}
const weather = tool({
  description: 'Get the weather in a location',
  inputSchema: z.object({ location: z.string().describe('The location to get the weather for') }),
  execute: async ({ location }) => {
    stream.tts('Just a moment, let me grab that for you.');
    // do something to get the weather
    return { location, temperature: 72 + Math.floor(Math.random() * 21) - 10 };
  },
});
```

# Troubleshooting Next.js

Source: https://docs.layercode.com/how-tos/troubleshooting-nextjs

Some relevant tips and gotchas when building with Next.js and Layercode

### Use dynamic imports for Layercode hooks

For instance:

```tsx theme={null}
'use client';
import dynamic from 'next/dynamic';

// Dynamically import the VoiceAgent component with SSR disabled
const VoiceAgent = dynamic(() => import('./ui/VoiceAgent'), { ssr: false });

export default function Home() {
  return <VoiceAgent />;
}
```

You can see [an example here](https://github.com/layercodedev/fullstack-nextjs-cloudflare/blob/faa51f42b21be71cf488961d0df2f9a3a8e88ed8/app/page.tsx#L4)

# Create a Cloudflare tunnel for webhooks

Source: https://docs.layercode.com/how-tos/tunnelling

Expose your local backend to Layercode using Cloudflare Tunnel.

Layercode needs to send a webhook to your backend to generate agent responses. If you're running your backend locally, you'll need to expose it to the internet using a tunnel service.
## Setting up a tunnel with cloudflared

We recommend using [cloudflared](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/do-more-with-tunnels/trycloudflare/), which is free for development.

* **macOS:** `brew install cloudflared`
* **Windows:** `winget install --id Cloudflare.cloudflared`
* [Other platforms](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/downloads/)

Run the following command to expose your local server:

```bash theme={null}
cloudflared tunnel --url http://localhost:YOUR_PORT
```

After starting, cloudflared will print a public URL in your terminal, e.g.:

```
https://my-tunnel-name.trycloudflare.com
```

Add the path of your backend's webhook endpoint to the URL, e.g.:

```
https://my-tunnel-name.trycloudflare.com/api/agent
```

`/api/agent` is just an example. Your actual endpoint may be different depending on your backend configuration.

1. Go to the [Layercode dashboard](https://dash.layercode.com).
2. Click on your agent.
3. Click the Edit button in the 'Your Backend' box.
4. Enter your Webhook URL (from the previous step) and ensure your `LAYERCODE_WEBHOOK_SECRET` matches your environment variable.

Open the agent Playground tab and start speaking to your voice agent! If you're having trouble, make sure your backend server is running and listening on the specified port (e.g., 3000). You can also visit the Webhook Logs tab in the agent to see the webhook requests being sent and any errors returned.

Every time you restart the cloudflared tunnel, the assigned public URL will change. Be sure to update the webhook URL in the Layercode dashboard each time you restart the tunnel.

## Alternative Tunneling Solutions

Besides cloudflared, you can also use other tunneling solutions like [ngrok](https://ngrok.com/) to expose your local backend.

## If using Vite:

By default, Vite blocks requests from other hosts, so you will need to add your Cloudflared (or ngrok, etc.) address to `vite.config.ts` in `server.allowedHosts`. For example:

```ts theme={null}
allowedHosts: ["suggesting-sri-pair-hugh.trycloudflare.com"]
```

# Node.js Backend SDK

Source: https://docs.layercode.com/sdk-reference/node-js-sdk

API reference for the Layercode Node.js Backend SDK.

[layercode-node-server-sdk](https://github.com/layercodedev/layercode-node-server-sdk).

## Introduction

The Layercode Node.js Backend SDK provides a simple way to handle the Layercode webhook in your backend. In particular, it makes it easy to return SSE events in the Layercode webhook response format. It supports all popular JavaScript runtime environments, including Node.js, Bun, and Cloudflare Workers.

## Installation

```bash theme={null}
npm install @layercode/node-server-sdk
```

## Usage

```typescript theme={null}
import { streamResponse } from "@layercode/node-server-sdk";

//... inside your webhook request handler ...
return streamResponse(request, async ({ stream }) => {
  stream.tts("Hi, how can I help you today?"); // This text will be sent to Layercode, converted to speech and spoken to the user
  // Call stream.tts() as many times as you need to send multiple pieces of speech to the user
  stream.end(); // This closes the stream and must be called at the end of your response
});
// ...
```

## Reference

### streamResponse

The `streamResponse` function is the main entry point for the SDK. It takes the request body (from the Layercode webhook request) and a handler function as arguments.
The handler function receives a `stream` object that can be used to send SSE events to the client.

```typescript theme={null}
function streamResponse(requestBody: Record<string, any>, handler: StreamResponseHandler): Response;
```

#### Parameters

* `requestBody`: The request body from the client. See [Webhook Request Payload](/api-reference/webhook-sse-api#webhook-request-payload).
* `handler`: An async function that receives a `stream` object.

#### Stream Methods

* `stream.tts(content: string)`: Sends a text to be spoken to the user (tts stands for text-to-speech).
* `stream.data(content: any)`: Sends any arbitrary data to the frontend client. Use this for updating your frontend UI.
* `stream.end()`: Closes the stream. Must be called at the end of your response.

# Python Backend SDK

Source: https://docs.layercode.com/sdk-reference/python-sdk

API reference for the Layercode Python Backend SDK.

We're working on the Python SDK. But the Layercode webhook is simple enough that you can implement it in just a few lines of code. See the [FastAPI Backend Guide](/backend-guides/fastapi) for a full walkthrough.

# React Frontend SDK

Source: https://docs.layercode.com/sdk-reference/react-sdk

Connect your React application to Layercode agents and build web and mobile voice AI applications.

[layercode-react-sdk](https://github.com/layercodedev/layercode-react-sdk).

## useLayercodeAgent Hook

The `useLayercodeAgent` hook provides a simple way to connect your React app to a Layercode agent, handling audio streaming, playback, and real-time communication.

```typescript useLayercodeAgent Hook theme={null}
import { useLayercodeAgent } from "@layercode/react-sdk";

// Connect to a Layercode agent
const {
  // Methods
  triggerUserTurnStarted,
  triggerUserTurnFinished,
  sendClientResponseText,
  // State
  status,
  userAudioAmplitude,
  agentAudioAmplitude,
} = useLayercodeAgent({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  conversationId: "optional-conversation-id", // optional
  metadata: { userId: "user-123" }, // optional
  onConnect: ({ conversationId }) => console.log("Connected to agent", conversationId),
  onDisconnect: () => console.log("Disconnected from agent"),
  onError: (error) => console.error("Agent error:", error),
  onDataMessage: (data) => console.log("Received data:", data),
});
```

## Hook Options

* `agentId` – The ID of your Layercode agent.
* `authorizeSessionEndpoint` – The endpoint to authorize the session (should return a `client_session_key` and `session_id`). Note: From Mon Sep 1, 12:00 UTC, the response will include `conversation_id` instead of `session_id` (see REST API page).
* `conversationId` – The conversation ID to resume a previous conversation (optional).
* `metadata` – Any metadata included here will be passed along to your backend with all webhooks.
* `onConnect` – Callback when the connection is established. Receives an object: `{ conversationId: string | null }`.
* `onDisconnect` – Callback when the connection is closed.
* `onError` – Callback when an error occurs. Receives an `Error` object.
* `onDataMessage` – Callback for custom data messages from the server (see `response.data` events from your backend).

## Return Values

The `useLayercodeAgent` hook returns an object with the following properties:

### State

* `status` – The connection status. One of `"initializing"`, `"disconnected"`, `"connecting"`, `"connected"`, or `"error"`.
* `userAudioAmplitude` – Real-time amplitude of the user's microphone input (0-1). Useful for animating UI when the user is speaking.
* `agentAudioAmplitude` – Real-time amplitude of the agent's audio output (0-1). Useful for animating UI when the agent is speaking.

### Turn-taking (Push-to-Talk)

Layercode supports both automatic and push-to-talk turn-taking.
For push-to-talk, use these methods to signal when the user starts and stops speaking:

**triggerUserTurnStarted(): void**
Signals that the user has started speaking (for [push-to-talk mode](/explanations/turn-taking#push-to-talk-mode)). Interrupts any agent audio playback.

**triggerUserTurnFinished(): void**
Signals that the user has finished speaking (for [push-to-talk mode](/explanations/turn-taking#push-to-talk-mode)).

### Text messages

Use this method when the user submits a chat-style message instead of speaking.

**sendClientResponseText(text: string): void**
Ends the active user turn and forwards `text` to the agent. The `user.transcript` event is emitted before the agent responds, keeping UI components in sync.

## Notes & Best Practices

* The hook manages microphone access, audio streaming, and playback automatically.
* The `metadata` option allows you to set custom data which is then passed to your backend webhook (useful for user/session tracking).
* The `conversationId` can be used to resume a previous conversation, or omitted to start a new one.

### Authorizing Sessions

To connect a client (browser) to your Layercode voice agent, you must first authorize the session. The SDK will automatically send a POST request to the path (or url if your backend is on a different domain) passed in the `authorizeSessionEndpoint` option. In this endpoint, you will need to call the Layercode REST API to generate a `client_session_key` and `conversation_id` (if it's a new conversation). If your backend is on a different domain, set `authorizeSessionEndpoint` to the full URL (e.g., `https://your-backend.com/api/authorize`).

**Why is this required?** Your Layercode API key should never be exposed to the frontend. Instead, your backend acts as a secure proxy: it receives the frontend's request, then calls the Layercode authorization API using your secret API key, and finally returns the `client_session_key` to the frontend. This also allows you to authenticate your user, and set any additional metadata that you want passed to your backend webhook.

**How it works:**

1. **Frontend:** The SDK automatically sends a POST request to your `authorizeSessionEndpoint` with a request body.
2. **Your Backend:** Your backend receives this request, then makes a POST request to the Layercode REST API `/v1/agents/web/authorize_session` endpoint, including your `LAYERCODE_API_KEY` as a Bearer token in the headers.
3. **Layercode:** Layercode responds with a `client_session_key` (and a `conversation_id`), which your backend returns to the frontend.
4. **Frontend:** The SDK uses the `client_session_key` to establish a secure WebSocket connection to Layercode.
**Example backend authorization endpoint code:**

```ts Next.js app/api/authorize/route.ts [expandable] theme={null}
export const dynamic = "force-dynamic";
import { NextResponse } from "next/server";

export const POST = async (request: Request) => {
  // Here you could do any user authorization checks you need for your app
  const endpoint = "https://api.layercode.com/v1/agents/web/authorize_session";
  const apiKey = process.env.LAYERCODE_API_KEY;
  if (!apiKey) {
    throw new Error("LAYERCODE_API_KEY is not set.");
  }
  const requestBody = await request.json();
  if (!requestBody || !requestBody.agent_id) {
    throw new Error("Missing agent_id in request body.");
  }
  try {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(requestBody),
    });
    if (!response.ok) {
      const text = await response.text();
      throw new Error(text || response.statusText);
    }
    return NextResponse.json(await response.json());
  } catch (error: any) {
    console.log("Layercode authorize session response error:", error.message);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
};
```

```ts Hono theme={null}
import { Context } from 'hono';
import { env } from 'cloudflare:workers';

export const onRequestPost = async (c: Context) => {
  try {
    const response = await fetch("https://api.layercode.com/v1/agents/web/authorize_session", {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${env.LAYERCODE_API_KEY}`,
      },
      body: JSON.stringify({ agent_id: "your-agent-id", conversation_id: null }),
    });
    if (!response.ok) {
      console.log('response not ok', response.statusText);
      return c.json({ error: response.statusText });
    }
    const data: { client_session_key: string; conversation_id: string; config?: Record<string, any> } = await response.json();
    return c.json(data);
  } catch (error) {
    return c.json({ error: error });
  }
};
```

```ts ExpressJS theme={null}
import type { RequestHandler } from 'express';

export const onRequestPost: RequestHandler = async (req, res) => {
  try {
    const response = await fetch("https://api.layercode.com/v1/agents/web/authorize_session", {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
      },
      body: JSON.stringify({ agent_id: "your-agent-id", conversation_id: null }),
    });
    if (!response.ok) {
      console.log('response not ok', response.statusText);
      return res.status(500).json({ error: response.statusText });
    }
    const data: { client_session_key: string; conversation_id: string; config?: Record<string, any> } = await response.json();
    res.json(data);
  } catch (error) {
    res.status(500).json({ error: (error as Error).message });
  }
};
```

```python Python theme={null}
import os
import httpx
from fastapi.responses import JSONResponse

@app.post("/authorize")
async def authorize_endpoint(request: Request):
    api_key = os.getenv("LAYERCODE_API_KEY")
    if not api_key:
        return JSONResponse({"error": "LAYERCODE_API_KEY is not set."}, status_code=500)
    try:
        body = await request.json()
    except Exception:
        return JSONResponse({"error": "Invalid JSON body."}, status_code=400)
    if not body or not body.get("agent_id"):
        return JSONResponse({"error": "Missing agent_id in request body."}, status_code=400)
    endpoint = "https://api.layercode.com/v1/agents/web/authorize_session"
    try:
        async with httpx.AsyncClient() as client:
            response = await client.post(
                endpoint,
                headers={
                    "Content-Type": "application/json",
                    "Authorization": f"Bearer {api_key}",
                },
                json=body,
            )
        if response.status_code != 200:
            return JSONResponse({"error": response.text}, status_code=500)
        return JSONResponse(response.json())
    except Exception as error:
        print("Layercode authorize session response error:", str(error))
        return JSONResponse({"error": str(error)}, status_code=500)
```

# Vanilla JS Frontend SDK

Source: https://docs.layercode.com/sdk-reference/vanilla-js-sdk

API reference for the Layercode Vanilla JS Frontend SDK.

[layercode-js-sdk](https://github.com/layercodedev/layercode-js-sdk).

## LayercodeClient

The `LayercodeClient` is the core client for all JavaScript frontend SDKs, providing audio recording, playback, and real-time communication with the Layercode agent.

```javascript theme={null}
import LayercodeClient from "https://cdn.jsdelivr.net/npm/@layercode/js-sdk@latest/dist/layercode-js-sdk.esm.js";

window.layercode = new LayercodeClient({
  agentId: "your-agent-id",
  conversationId: "your-conversation-id", // optional
  authorizeSessionEndpoint: "/api/authorize",
  metadata: { userId: "123" }, // optional
  onConnect: ({ conversationId }) => console.log("connected", conversationId),
  onDisconnect: () => console.log("disconnected"),
  onError: (err) => console.error("error", err),
  onDataMessage: (msg) => console.log("data message", msg),
  onUserAmplitudeChange: (amp) => console.log("user amplitude", amp),
  onAgentAmplitudeChange: (amp) => console.log("agent amplitude", amp),
  onStatusChange: (status) => console.log("status", status),
});

window.layercode.connect();
```

### Constructor Options

Options for the LayercodeClient:

* `agentId` – The ID of your Layercode agent.
* `conversationId` – The conversation ID to resume a previous conversation (optional).
* `authorizeSessionEndpoint` – The endpoint to authorize the session (should return a `client_session_key` and `session_id`). Note: From Mon Sep 1, 12:00 UTC, the response will include `conversation_id` instead of `session_id` (see REST API page).
* `metadata` – Optional metadata to send with the session authorization request.
* `onConnect` – Callback when the client connects. Receives an object: `{ conversationId: string | null }`.
* `onDisconnect` – Callback when the client disconnects.
* `onError` – Callback when an error occurs. Receives an `Error` object.
* `onDataMessage` – Callback for custom data messages from the server.
* `onUserAmplitudeChange` – Callback for changes in the user's microphone amplitude (number, 0-1).
* `onAgentAmplitudeChange` – Callback for changes in the agent's audio amplitude (number, 0-1).
* `onStatusChange` – Callback when the client's status changes. Receives a string: `"disconnected" | "connecting" | "connected" | "error"`.

### Methods

**connect(): Promise\<void>**
Connects to the Layercode agent, authorizes the session, and starts audio capture and playback.

**disconnect(): Promise\<void>**
Disconnects from the Layercode agent, stops audio capture and playback, and closes the WebSocket.

### Turn-taking (Push-to-Talk)

Layercode supports both automatic and push-to-talk turn-taking. For push-to-talk, use these methods to signal when the user starts and stops speaking:

**triggerUserTurnStarted(): Promise\<void>**
Signals that the user has started speaking (for [push-to-talk mode](/explanations/turn-taking#push-to-talk-mode)). Interrupts any agent audio playback.

**triggerUserTurnFinished(): Promise\<void>**
Signals that the user has finished speaking (for [push-to-talk mode](/explanations/turn-taking#push-to-talk-mode)).

### Text messages

Use this method when the user submits a chat-style message instead of speaking.

**sendClientResponseText(text: string): void**
Ends the active user turn and forwards `text` to the agent. The `user.transcript` event is emitted before the agent responds, keeping UI components in sync.
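As a rough sketch, wiring these methods to UI controls might look like this (assuming a `client` created as in the constructor example above; the element IDs are placeholders):

```javascript theme={null}
// Hold-to-speak button (push-to-talk mode must be enabled on the agent)
const talkButton = document.getElementById("talk");
talkButton.addEventListener("mousedown", () => client.triggerUserTurnStarted());
talkButton.addEventListener("mouseup", () => client.triggerUserTurnFinished());

// Chat-style text input
const input = document.getElementById("message");
document.getElementById("send").addEventListener("click", () => {
  const message = input.value.trim();
  if (!message) return;
  client.sendClientResponseText(message); // ends the user turn and sends the text
  input.value = "";
});
```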
## Events & Callbacks

* **onConnect**: Called when the connection is established. Receives `{ conversationId }`.
* **onDisconnect**: Called when the connection is closed.
* **onError**: Called on any error (authorization, WebSocket, audio, etc).
* **onDataMessage**: Called when a custom data message is received from the server (see `response.data` events from your backend).
* **onUserAmplitudeChange**: Called with the user's microphone amplitude (0-1).
* **onAgentAmplitudeChange**: Called with the agent's audio amplitude (0-1).
* **onStatusChange**: Called when the status changes (`"disconnected"`, `"connecting"`, `"connected"`, `"error"`).
* Device-change callback: Called when the active input/output device changes in the browser. Useful for handling device disconnects and switches.
* VAD callback: Called when VAD detects user speech start/stop. Receives a boolean.

## Notes & Best Practices

* The SDK manages microphone access, audio streaming, and playback automatically.
* The `metadata` option allows you to set custom data which is then passed to your backend webhook (useful for user/session tracking).
* The `conversationId` can be used to resume a previous conversation, or omitted to start a new one.

### Authorizing Sessions

To connect a client (browser) to your Layercode voice agent, you must first authorize the session. The SDK will automatically send a POST request to the path (or url if your backend is on a different domain) passed in the `authorizeSessionEndpoint` option. In this endpoint, you will need to call the Layercode REST API to generate a `client_session_key` and `conversation_id` (if it's a new conversation). If your backend is on a different domain, set `authorizeSessionEndpoint` to the full URL (e.g., `https://your-backend.com/api/authorize`).

**Why is this required?** Your Layercode API key should never be exposed to the frontend. Instead, your backend acts as a secure proxy: it receives the frontend's request, then calls the Layercode authorization API using your secret API key, and finally returns the `client_session_key` to the frontend. This also allows you to authenticate your user, and set any additional metadata that you want passed to your backend webhook.

**How it works:**

1. **Frontend:** The SDK automatically sends a POST request to your `authorizeSessionEndpoint` with a request body.
2. **Your Backend:** Your backend receives this request, then makes a POST request to the Layercode REST API `/v1/agents/web/authorize_session` endpoint, including your `LAYERCODE_API_KEY` as a Bearer token in the headers.
3. **Layercode:** Layercode responds with a `client_session_key` (and a `conversation_id`), which your backend returns to the frontend.
4. **Frontend:** The SDK uses the `client_session_key` to establish a secure WebSocket connection to Layercode.
**Example backend authorization endpoint code:**

```ts Next.js app/api/authorize/route.ts [expandable] theme={null}
export const dynamic = "force-dynamic";
import { NextResponse } from "next/server";

export const POST = async (request: Request) => {
  // Here you could do any user authorization checks you need for your app
  const endpoint = "https://api.layercode.com/v1/agents/web/authorize_session";
  const apiKey = process.env.LAYERCODE_API_KEY;
  if (!apiKey) {
    throw new Error("LAYERCODE_API_KEY is not set.");
  }
  const requestBody = await request.json();
  if (!requestBody || !requestBody.agent_id) {
    throw new Error("Missing agent_id in request body.");
  }
  try {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(requestBody),
    });
    if (!response.ok) {
      const text = await response.text();
      throw new Error(text || response.statusText);
    }
    return NextResponse.json(await response.json());
  } catch (error: any) {
    console.log("Layercode authorize session response error:", error.message);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
};
```

```ts Hono theme={null}
import { Context } from 'hono';
import { env } from 'cloudflare:workers';

export const onRequestPost = async (c: Context) => {
  try {
    const response = await fetch("https://api.layercode.com/v1/agents/web/authorize_session", {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${env.LAYERCODE_API_KEY}`,
      },
      body: JSON.stringify({ agent_id: "your-agent-id", conversation_id: null }),
    });
    if (!response.ok) {
      console.log('response not ok', response.statusText);
      return c.json({ error: response.statusText });
    }
    const data: { client_session_key: string; conversation_id: string; config?: Record<string, any> } = await response.json();
    return c.json(data);
  } catch (error) {
    return c.json({ error: error });
  }
};
```

```ts ExpressJS theme={null}
import type { RequestHandler } from 'express';

export const onRequestPost: RequestHandler = async (req, res) => {
  try {
    const response = await fetch("https://api.layercode.com/v1/agents/web/authorize_session", {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
      },
      body: JSON.stringify({ agent_id: "your-agent-id", conversation_id: null }),
    });
    if (!response.ok) {
      console.log('response not ok', response.statusText);
      return res.status(500).json({ error: response.statusText });
    }
    const data: { client_session_key: string; conversation_id: string; config?: Record<string, any> } = await response.json();
    res.json(data);
  } catch (error) {
    res.status(500).json({ error: (error as Error).message });
  }
};
```

```python Python theme={null}
import os
import httpx
from fastapi.responses import JSONResponse

@app.post("/authorize")
async def authorize_endpoint(request: Request):
    api_key = os.getenv("LAYERCODE_API_KEY")
    if not api_key:
        return JSONResponse({"error": "LAYERCODE_API_KEY is not set."}, status_code=500)
    try:
        body = await request.json()
    except Exception:
        return JSONResponse({"error": "Invalid JSON body."}, status_code=400)
    if not body or not body.get("agent_id"):
        return JSONResponse({"error": "Missing agent_id in request body."}, status_code=400)
    endpoint = "https://api.layercode.com/v1/agents/web/authorize_session"
    try:
        async with httpx.AsyncClient() as client:
            response = await client.post(
                endpoint,
                headers={
                    "Content-Type": "application/json",
                    "Authorization": f"Bearer {api_key}",
                },
                json=body,
            )
        if response.status_code != 200:
            return JSONResponse({"error": response.text}, status_code=500)
        return JSONResponse(response.json())
    except Exception as error:
        print("Layercode authorize session response error:", str(error))
        return JSONResponse({"error": str(error)}, status_code=500)
```

# Quick start

Source: https://docs.layercode.com/tutorials/getting-started

Create your first AI voice agent in minutes.

Let's build a **production-ready voice agent** that can **respond to users** and trigger **function calls**. Here's a [preview](https://live-website-demo.layercode.workers.dev/) of what you will build:

* Real-time **speech-to-text**, **text-to-speech**, **turn-taking**, and **low-latency audio delivery** powered by Layercode's edge platform.
* A sample **agent backend** that monitors conversation transcripts and responds based on your prompt and the **tool calls** you define. Deployable anywhere.

```bash theme={null}
npx @layercode/cli init
```

### Try it out

Once everything boots, you can start a conversation locally (typically at [http://localhost:3000/](http://localhost:3000/)).

## Video walkthrough

Prefer a walkthrough? Watch the quick demo below.