LayercodeClient
The `LayercodeClient` is the core client for all JavaScript frontend SDKs, providing audio recording, playback, and real-time communication with the Layercode agent.
Usage Example
Call `connect()` to start the session once the user is ready, and invoke `disconnect()` when you want to tear the connection down.
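For example, a minimal setup sketch (the import path and the `agentId` option name are assumptions inferred from the `agent_id` request payload field below; confirm both against your installed SDK):

```typescript
import LayercodeClient from "@layercode/js-sdk"; // import path is an assumption

const client = new LayercodeClient({
  agentId: "your-agent-id", // option name assumed from the agent_id payload field
  authorizeSessionEndpoint: "/api/authorize",
});

const startButton = document.querySelector<HTMLButtonElement>("#start")!;
const stopButton = document.querySelector<HTMLButtonElement>("#stop")!;

// Start the session from a user gesture (see Troubleshooting below)
startButton.addEventListener("click", () => void client.connect());

// Tear the connection down when finished
stopButton.addEventListener("click", () => void client.disconnect());
```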
Constructor Options
Options for the `LayercodeClient`:

- `agentId` – The ID of your Layercode agent.
- `conversationId` (optional) – The conversation ID to resume a previous conversation.
- `authorizeSessionEndpoint` – The endpoint that authorizes the session. It must return a JSON object containing a `client_session_key` (and a `conversation_id`).
- `metadata` (optional) – Metadata to send with the session authorization request.
- `audioInput` – Whether microphone capture should start immediately. Defaults to `true`. Set to `false` to initialize the client in text-only mode and defer the permission prompt.
- `audioOutput` – Whether agent audio should play through the browser immediately. Defaults to `true`. Set to `false` to keep the connection active while holding back speaker playback until you opt in.
- Amplitude monitoring – Whether microphone and speaker amplitude monitoring should run. Defaults to `true`. Disable this when `audioInput` starts as `false` to avoid unnecessary audio processing.
- Interruption resume delay – Milliseconds before resuming agent audio after a temporary pause due to a false interruption. Defaults to `500`.

Callback options:

- Audio input changed – Triggered when the audio input state changes. Receives a boolean.
- Audio output changed (`audioOutputChanged`) – Triggered when the audio output state changes. Receives a boolean indicating whether the agent audio is audible.
- Connect (`onConnect`) – Triggered when the client connects. Receives `{ conversationId, config }`; use `config` to inspect the effective agent configuration returned from `authorizeSessionEndpoint`.
- Disconnect – Triggered when the client disconnects.
- Error – Triggered when an error occurs. Receives an `Error` object.
- Message – Triggered for all non-audio messages from the server (excludes `response.audio`).
- Data message – Triggered for custom data messages from the server (typically `response.data` events).
- User amplitude – Triggered on changes in the user's microphone amplitude (number, 0–1).
- Agent amplitude – Triggered on changes in the agent's audio amplitude (number, 0–1).
- Status change – Triggered when the client's status changes. Receives a string: `"disconnected" | "connecting" | "connected" | "error"`.
- User speaking – Triggered when the SDK detects user speech start/stop. Receives a boolean.
- Agent speaking – Triggered when the agent starts or stops speaking. Receives a boolean.
- Mute change – Triggered when the client is muted or unmuted. Receives a boolean representing the new muted state.
- Input device change – Triggered when the active input device changes in the browser. Receives the active `deviceId`.
- Input devices changed – Triggered when the available input devices change (hot-plug). Receives the updated device list.
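Continuing the earlier sketch, here is how a few of these options might be wired together. `onConnect` is confirmed above; the other callback names (`onError`, `onDataMessage`) follow Layercode's React SDK conventions and are assumptions to verify against your SDK version:

```typescript
// Sketch: constructor options with callbacks. Callback names other than
// onConnect are assumptions; check them against your SDK version.
const client = new LayercodeClient({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  metadata: { userId: "user-123" }, // forwarded to your backend webhook
  onConnect: ({ conversationId, config }) => {
    console.log("connected to conversation", conversationId, config);
  },
  onError: (err) => console.error("agent error", err),
  onDataMessage: (data) => console.log("response.data event", data),
});
```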
Methods
- `connect(): Promise<void>` – Connects to the Layercode agent, authorizes the session, and starts audio capture and playback.
- `disconnect(): Promise<void>` – Disconnects from the Layercode agent, stops audio capture and playback, and closes the WebSocket.
- `setAudioInput(state: boolean): Promise<void>` – Enables or disables microphone capture without tearing down the WebSocket. Pass `true` when the user opts into voice mode; pass `false` to drop back to text-only mode.
- `setAudioOutput(state: boolean): Promise<void>` – Enables or disables local agent playback without interrupting the active session.
Tip: Call `setAudioOutput(false)` to silence agent audio while another clip plays in your UI, then restore playback with `setAudioOutput(true)` when you're ready. Pair this with the `audioOutputChanged` callback to keep toggles in sync.
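A sketch of that pattern (`playUiClip` is a hypothetical helper that plays your own audio):

```typescript
declare function playUiClip(): Promise<void>; // hypothetical helper

// Sketch: hold back agent playback while a local UI clip plays.
async function playClipThenResume(client: LayercodeClient) {
  await client.setAudioOutput(false); // silence agent audio
  await playUiClip();                 // play your own clip
  await client.setAudioOutput(true);  // restore agent playback
}
```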
- `setInputDevice(deviceId: string): Promise<void>` – Switches the microphone input. Pass `'default'` (or an empty string) to use the system default device.
- Device listing – A companion method returns the available input devices, marking the default one.
- `getStream(): MediaStream | null` – Returns the active microphone `MediaStream`, or `null` if not initialized.
- `mute(): void` – Stops sending mic audio to the server without tearing down the stream/connection.
- `unmute(): void` – Resumes sending mic audio to the server.
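For example, a mute toggle might look like this (the muted-state property name isn't shown in this reference, so the state is tracked locally; `client` is assumed from the earlier sketches):

```typescript
// Sketch: toggle mic transmission without stopping capture.
let muted = false;
const muteButton = document.querySelector<HTMLButtonElement>("#mute")!;

muteButton.addEventListener("click", () => {
  muted ? client.unmute() : client.mute();
  muted = !muted;
  muteButton.textContent = muted ? "Unmute" : "Mute";
});
```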
Turn-taking (Push-to-Talk)
Layercode supports both automatic and push-to-talk turn-taking. For push-to-talk, use these methods to signal when the user starts and stops speaking:

- `triggerUserTurnStarted(): Promise<void>` – Signals that the user has started speaking (for push-to-talk mode). Interrupts any agent audio playback.
- `triggerUserTurnFinished(): Promise<void>` – Signals that the user has finished speaking (for push-to-talk mode).
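A sketch of a hold-to-talk button built on these two methods:

```typescript
// Sketch: hold to speak, release to finish the turn.
const pttButton = document.querySelector<HTMLButtonElement>("#ptt")!;

pttButton.addEventListener("pointerdown", () => {
  void client.triggerUserTurnStarted(); // also interrupts agent playback
});
pttButton.addEventListener("pointerup", () => {
  void client.triggerUserTurnFinished();
});
```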
Text messages
- `sendClientResponseText(text: string): Promise<void>` – Ends the active user turn, interrupts agent playback, and forwards `text` to the agent. The server will emit `user.transcript` before the agent responds, keeping UI components in sync.
- `sendClientResponseData(payload: Record<string, any>): void` – Sends a JSON-serializable `payload` to your agent backend without affecting the current turn. The data surfaces as a `data` webhook event. See the docs page "Send JSON data from the client".
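For example:

```typescript
// Sketch: send typed text as the user's turn (inside an async context).
await client.sendClientResponseText("What's the weather tomorrow?");

// Sketch: send side-channel JSON that surfaces as a `data` webhook event.
client.sendClientResponseData({ screen: "checkout", step: 2 });
```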
Events & Callbacks
All events are surfaced through the constructor callback options listed above.
Properties
- Connection status: `"disconnected" | "connecting" | "connected" | "error"`.
- Whether microphone capture is currently enabled.
- Whether agent audio playback is currently enabled for this client.
- Whether the voice activity detector currently hears the user speaking.
- Whether the agent is presently speaking (based on audio playback).
- User mic amplitude (0–1). Non-zero only when amplitude monitoring is enabled.
- Agent playback amplitude (0–1). Non-zero only when amplitude monitoring is enabled.
- Whether the microphone track is muted.
- The conversation ID the client is currently attached to, if any.
Agent Config (from `authorizeSessionEndpoint`)
On `onConnect`, you receive `{ conversationId, config }`. Relevant fields:

- When `transcription.trigger === 'automatic'` and `vad.enabled !== false`, the SDK initializes MicVAD and gates mic audio accordingly.
- `gate_audio` controls whether audio is sent only while speaking; `buffer_frames` controls the "pre-speech" buffer flushed when speech starts.
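A sketch of inspecting these fields in `onConnect` (only the field paths named above are used; the rest of the config shape isn't assumed):

```typescript
// Sketch: read the agent config delivered on connect.
const client = new LayercodeClient({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  onConnect: ({ conversationId, config }) => {
    const automatic = config?.transcription?.trigger === "automatic";
    const vadEnabled = config?.vad?.enabled !== false;
    if (automatic && vadEnabled) {
      console.log("MicVAD will gate mic audio for", conversationId);
    }
  },
});
```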
Notes & Best Practices
- The SDK manages microphone access, audio streaming, and playback automatically.
- Use `audioInput: false` plus `setAudioInput(true)` to defer the browser permission prompt until the user explicitly switches to voice, as in the sketch below. Disable amplitude monitoring at the same time to avoid unnecessary processing.
- The `metadata` option allows you to set custom data which is then passed to your backend webhook (useful for user/session tracking).
- The `conversationId` can be used to resume a previous conversation, or omitted to start a new one.
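A sketch of the deferred-permission flow (the amplitude-monitoring option name isn't shown in this reference, so it is omitted here):

```typescript
// Sketch: start text-only, then opt into voice on an explicit user action.
const client = new LayercodeClient({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  audioInput: false, // no mic permission prompt yet
});
await client.connect();

document.querySelector("#enable-voice")!.addEventListener("click", () => {
  void client.setAudioInput(true); // prompts for mic access now
});
```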
Small device example
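A minimal sketch, assuming a `<select>` element populated with input device IDs:

```typescript
// Sketch: switch microphones from a <select> of input device IDs.
const micSelect = document.querySelector<HTMLSelectElement>("#mic-select")!;

micSelect.addEventListener("change", () => {
  // 'default' (or an empty string) returns to the system default device
  void client.setInputDevice(micSelect.value || "default");
});
```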
Authorizing Sessions
To connect a client (browser) to your Layercode voice agent, you must first authorize the session. The SDK automatically sends a POST request to the path (or URL, if your backend is on a different domain) passed in the `authorizeSessionEndpoint` option. In this endpoint, you call the Layercode REST API to generate a `client_session_key` and a `conversation_id` (if it's a new conversation).
If your backend is on a different domain, set `authorizeSessionEndpoint` to the full URL (e.g., `https://your-backend.com/api/authorize`). Your backend then returns the `client_session_key` to the frontend.
This also allows you to authenticate your user and set any additional metadata that you want passed to your backend webhook.
How it works:

1. Frontend: The SDK automatically sends a POST request to your `authorizeSessionEndpoint` with a request body (detailed under "Request payload" below).
2. Your backend: Receives this request, then makes a POST request to the Layercode REST API `/v1/agents/web/authorize_session` endpoint, including your `LAYERCODE_API_KEY` as a Bearer token in the headers.
3. Layercode: Responds with a `client_session_key` (and a `conversation_id`), which your backend returns to the frontend.
4. Frontend: The SDK uses the `client_session_key` to establish a secure WebSocket connection to Layercode.
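A sketch of such a backend endpoint, written as a fetch-based route handler; the API host `api.layercode.com` is an assumption, so confirm it against the REST API reference:

```typescript
// Sketch: backend authorize endpoint. Forwards the SDK's request body to
// Layercode and relays the response to the frontend.
export async function POST(request: Request) {
  const body = await request.json(); // { agent_id, metadata, sdk_version, ... }

  // Host is an assumption; the path comes from the docs above.
  const res = await fetch(
    "https://api.layercode.com/v1/agents/web/authorize_session",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    }
  );

  // { client_session_key, conversation_id }
  return Response.json(await res.json());
}
```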
Custom Authorization
Use the optional `authorizeSessionRequest` function when you need to control how authorization credentials are exchanged with your backend (for example, to add custom headers or reuse an existing HTTP client).
Custom Authorization example
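A sketch, assuming the callback receives the JSON request payload and resolves with the parsed authorization response; verify the exact signature against your SDK version:

```typescript
const sessionToken = "your-app-session-token"; // your app's own auth

// Sketch: custom authorization exchange. The callback signature is an
// assumption; check it against your SDK version.
const client = new LayercodeClient({
  agentId: "your-agent-id",
  authorizeSessionEndpoint: "/api/authorize",
  authorizeSessionRequest: async (payload: Record<string, unknown>) => {
    const res = await fetch("/api/authorize", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${sessionToken}`, // custom header
      },
      body: JSON.stringify(payload),
    });
    return res.json(); // must include client_session_key
  },
});
```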
If you don't provide `authorizeSessionRequest`, the client falls back to a standard fetch call that POSTs the JSON body to `authorizeSessionEndpoint`.
Request payload

- `agent_id` – ID of the agent to connect.
- `metadata` – metadata supplied when instantiating the client.
- `sdk_version` – version string of the JavaScript SDK.
- `conversation_id` – present only when reconnecting to an existing conversation.
Troubleshooting
AudioWorklet InvalidStateError on first connect
Browsers freeze a freshly created `AudioContext` until the user interacts with the page. If your app calls `connect()` during component initialization, `audioWorklet.addModule()` pauses, the worklet never registers, and the player throws `InvalidStateError: AudioWorklet does not have a valid AudioWorkletGlobalScope` as soon as audio starts streaming.
To avoid the race:

- Gate the first `connect()` behind a user gesture (click, tap, key press), for example a "Start voice agent" button, as in the sketch below.
- Await `connect()` inside that handler; keep any teardown logic (like `disconnect()`) in lifecycle cleanup.
- Expect this to show up most during rapid local reloads or with React Strict Mode double-mounting. In real usage, the user normally clicks before the agent activates, so reconnects after the first gesture are safe.
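A sketch of the gesture-gated connect:

```typescript
// Sketch: connect only inside a user gesture so the AudioContext
// starts in a running state.
const startBtn = document.querySelector<HTMLButtonElement>("#start-voice")!;

startBtn.addEventListener("click", async () => {
  startBtn.disabled = true;
  try {
    await client.connect();
  } catch (err) {
    console.error("connect failed", err);
    startBtn.disabled = false; // let the user retry
  }
});
```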