Next.js
Build web voice agent experiences in Next.js with the Layercode React SDK.
Layercode makes it easy to build web-based voice agent applications in Next.js. In this guide we’ll walk you through a full-stack Next.js example voice agent that lets users speak to a voice AI in their browser.
Example code: layercodedev/example-fullstack-nextjs
This frontend example is part of a full-stack example that also includes a Next.js backend. We recommend reading the Next.js backend guide to get the most out of this example.
Setup
To get started, you’ll need a Layercode account and a voice pipeline you’ve created. If you haven’t done so yet, follow our Getting Started Guide.
Then follow the setup instructions in the repo README file.
How it works
Connect to a Layercode voice pipeline
The example uses the React SDK's useLayercodePipeline hook, which handles all the complexity required for real-time, low-latency, two-way voice agent interactions.
Here’s a simplified example of how to use the React SDK in a Next.js application:
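A minimal sketch of the options passed to the hook. The option name `pipelineId`, the `/api/authorize` route, the `NEXT_PUBLIC_LAYERCODE_PIPELINE_ID` environment variable, and the `buildVoiceAgentOptions` helper are assumptions made for illustration; check the React SDK reference for the exact hook signature.

```typescript
// Shape of the options described in this guide. The field names are
// assumptions; check them against the React SDK reference.
interface VoiceAgentOptions {
  pipelineId: string;
  authorizeSessionEndpoint: string;
}

// Hypothetical helper: fail fast when the pipeline ID is missing rather than
// letting the connection fail later with a less obvious error.
function buildVoiceAgentOptions(pipelineId: string | undefined): VoiceAgentOptions {
  if (!pipelineId) {
    throw new Error("Missing Layercode pipeline ID");
  }
  return {
    pipelineId,
    // Same-origin path; use a full URL if the backend is on another domain.
    authorizeSessionEndpoint: "/api/authorize",
  };
}

// Assumed usage inside a client component ("use client"):
//   const { status, agentAudioAmplitude } = useLayercodePipeline(
//     buildVoiceAgentOptions(process.env.NEXT_PUBLIC_LAYERCODE_PIPELINE_ID),
//   );
```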
The useLayercodePipeline hook accepts the following parameters:
- Your pipeline ID, found in the Layercode Dashboard
- The endpoint to authorize the client session (see Authorize Client Session)
- An optional callback function for handling data messages (not shown in example above)
On mount, the useLayercodePipeline hook will:
- Make a request to your authorize session endpoint to create a new session and return the client session key. Here you can also do any user authorization checks you need for your app.
- Establish a WebSocket connection to Layercode (using the client session key)
- Capture microphone audio from the user and stream it to the Layercode voice pipeline for transcription
- (At this stage, Layercode will call the Hosted Backend or Your Backend webhook to generate a response, and then convert the response from text to speech)
- Play back audio of the voice agent’s response to the user in their browser, as it’s generated
The useLayercodePipeline hook returns an object with the following properties:
- status: The connection status of the voice agent. You can show this to the user to indicate the connection status.
- agentAudioAmplitude: The amplitude of the audio from the voice agent. You can use this to drive an animation when the voice agent is speaking.
By default, your voice pipeline handles turn taking in automatic mode, but you can configure it to use push-to-talk mode. If you are using push-to-talk mode, see the push-to-talk instructions in the repo README and read about how the VoiceAgentPushToTalk component works below.
Authorizing Sessions
To connect a client (browser) to your Layercode voice pipeline, you must first authorize the session. The SDK will automatically send a POST request to the path (or URL, if your backend is on a different domain) passed in the authorizeSessionEndpoint option. In this endpoint, you will need to call the Layercode REST API to generate a client_session_key and session_id (if it’s a new session).
If your backend is on a different domain, set authorizeSessionEndpoint to the full URL (e.g., https://your-backend.com/api/authorize).
Why is this required?
Your Layercode API key should never be exposed to the frontend. Instead, your backend acts as a secure proxy: it receives the frontend’s request, then calls the Layercode authorization API using your secret API key, and finally returns the client_session_key to the frontend.
This also allows you to authenticate your user, and set any additional metadata that you want passed to your backend webhook.
How it works:
- Frontend: The SDK automatically sends a POST request to your authorizeSessionEndpoint with a request body.
- Your Backend: Your backend receives this request, then makes a POST request to the Layercode REST API /v1/pipelines/authorize_session endpoint, including your LAYERCODE_API_KEY as a Bearer token in the headers.
- Layercode: Layercode responds with a client_session_key (and a session_id), which your backend returns to the frontend.
- Frontend: The SDK uses the client_session_key to establish a secure WebSocket connection to Layercode.
Example backend authorization endpoint code:
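A minimal sketch of the core of such an endpoint. The base URL `https://api.layercode.com` is an assumption (only the `/v1/pipelines/authorize_session` path and the Bearer header come from this guide), and `FetchLike`, `buildAuthorizeRequest`, and `authorizeSession` are hypothetical names. The frontend's request body is forwarded unchanged because its fields are not specified here.

```typescript
// Minimal fetch-compatible type so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumed base URL; the path is the one named in this guide.
const AUTHORIZE_URL = "https://api.layercode.com/v1/pipelines/authorize_session";

// Build the server-to-server request. The API key never leaves the backend.
function buildAuthorizeRequest(apiKey: string, body: unknown) {
  if (!apiKey) throw new Error("LAYERCODE_API_KEY is not set");
  return {
    url: AUTHORIZE_URL,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Forward the frontend's request body to Layercode and return the parsed
// JSON response (client_session_key and session_id) for the frontend.
async function authorizeSession(
  apiKey: string,
  body: unknown,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const { url, init } = buildAuthorizeRequest(apiKey, body);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Layercode authorization failed: ${res.status}`);
  return res.json();
}
```

In a Next.js route handler, this could be called with the parsed request body, the value of the LAYERCODE_API_KEY environment variable, and the global `fetch`, with the result returned to the frontend as JSON. Any checks on who the user is belong before the call.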
Components
AudioVisualization
The AudioVisualization component is used to visualize the audio from the voice agent. It uses the agentAudioAmplitude value returned from the useLayercodePipeline hook to drive the height of the audio bars with a simple animation.
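As an illustration of how an amplitude value can drive bar heights, here is a minimal sketch. It assumes the amplitude is a number between 0 and 1 (other values are clamped); `barHeights`, the bar count, and the pixel sizes are hypothetical and not taken from the example repo.

```typescript
// Map an amplitude in [0, 1] to pixel heights for a row of bars.
// Centre bars are tallest, so the row forms a simple bump shape.
function barHeights(
  amplitude: number,
  barCount = 5,
  minHeight = 4,
  maxHeight = 40,
): number[] {
  // Clamp so unexpected or non-finite values never produce invalid heights.
  const level = Number.isFinite(amplitude) ? Math.min(1, Math.max(0, amplitude)) : 0;
  const centre = (barCount - 1) / 2;
  return Array.from({ length: barCount }, (_, i) => {
    // Weight falls linearly from 1 at the centre to 0.5 at the edges.
    const weight = centre === 0 ? 1 : 1 - 0.5 * (Math.abs(i - centre) / centre);
    return Math.round(minHeight + (maxHeight - minHeight) * level * weight);
  });
}

// barHeights(0) -> [4, 4, 4, 4, 4]
// barHeights(1) -> [22, 31, 40, 31, 22]
```

Each height can then be applied as an inline style on a bar element, with a short CSS transition to smooth the changes between renders.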
ConnectionStatusIndicator
The ConnectionStatusIndicator component is used to display the connection status of the voice agent. It uses the status value returned from the useLayercodePipeline hook to display the connection status.
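A minimal sketch of mapping a status value to a label and colour. It assumes the status is a string; the specific values handled here (`connected`, `connecting`, `error`) are assumptions rather than a confirmed list, and anything else falls back to a neutral state. `describeStatus` is a hypothetical helper.

```typescript
interface StatusDisplay {
  label: string;
  color: "green" | "amber" | "red" | "gray";
}

// Translate a connection status string into something to show the user.
function describeStatus(status: string): StatusDisplay {
  switch (status) {
    case "connected":
      return { label: "Connected", color: "green" };
    case "connecting":
      return { label: "Connecting...", color: "amber" };
    case "error":
      return { label: "Connection error", color: "red" };
    default:
      // Unknown or not-yet-connected states get a neutral indicator.
      return { label: "Disconnected", color: "gray" };
  }
}
```

Keeping the mapping in a pure function means the component itself only has to render a coloured dot and the label.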
VoiceAgentPushToTalk (optional)
Because the useLayercodePipeline hook handles all of the audio streaming and playback, in most cases the microphone button is simply a visual aid and doesn’t implement any logic. A simple microphone icon inside a circle will suffice in most cases.
Layercode does support ‘push-to-talk’ turn taking, as an alternative to automatic turn taking (read more about turn taking). When using ‘push-to-talk’ turn taking, holding down and releasing the MicrophoneButton must send a WebSocket message to tell Layercode the user has started and finished talking. In this example, we provide an alternative VoiceAgentPushToTalk component that, along with the MicrophoneButtonPushToTalk component, handles this logic.
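The press-and-release logic can be sketched as follows. `createPushToTalk` and its `startTurn` and `endTurn` callbacks are hypothetical stand-ins for whatever the SDK exposes to send the start and finish messages; the sketch only shows the guard that keeps the two messages paired.

```typescript
interface PushToTalk {
  press(): void;
  release(): void;
  isTalking(): boolean;
}

// Wrap the two "turn" callbacks so each press sends exactly one start
// message and each release sends exactly one matching end message.
function createPushToTalk(startTurn: () => void, endTurn: () => void): PushToTalk {
  let talking = false;
  return {
    press() {
      if (talking) return; // ignore key repeat and duplicate pointer events
      talking = true;
      startTurn();
    },
    release() {
      if (!talking) return; // a release without a press sends nothing
      talking = false;
      endTurn();
    },
    isTalking: () => talking,
  };
}
```

On a button, `press` would typically be wired to the pointer-down event and `release` to both pointer-up and pointer-leave, so the turn still ends if the pointer slides off the button while held.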
To use this mode, you’ll need to edit app/page.tsx to use the VoiceAgentPushToTalk component instead of the VoiceAgent component. Then in your Layercode Dashboard, you’ll need to click Edit in the Transcription section of your voice pipeline and set the Turn Taking to Push to Talk.