Layercode keeps speech recognition modular so you can match the right engine to each pipeline. Today, Deepgram’s streaming stack is the single speech-to-text integration available in production, delivering the fastest and most accurate in-call transcription we support.

Deepgram Nova-3 (primary streaming)

  • Model: nova-3, Deepgram’s flagship speech-to-text model tuned for high-accuracy, low-latency conversational AI.
  • Real-time features: Smart formatting, interim hypotheses, and MIP opt-out are all enabled to optimize conversational turn-taking out of the box. Nova-3’s latency profile keeps responses within the sub-second expectations of interactive agents.
  • Audio formats: We normalize audio to 8 kHz linear PCM or μ-law depending on transport requirements, so Nova-3 receives the clean signal it expects in both browser and telephony scenarios.
  • Multilingual: For non-English conversations, select Deepgram’s nova-3 multilingual model, which currently covers English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch while preserving interim and final transcripts in the source language. Let us know if you need other languages supported.
  • Flux coverage: Deepgram’s flux model currently supports English only for speech-to-text, so keep Nova-3 multilingual selected for other languages.
  • Connectivity: Choose the managed Cloudflare route (provider deepgram) or connect directly to Deepgram (deepgram_cloud). We merge your configuration with our defaults and set the appropriate authorization headers for each mode.
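When connecting directly to Deepgram, the options above map onto query parameters of Deepgram’s streaming `/v1/listen` WebSocket endpoint. The sketch below shows how such a URL could be assembled; the parameter names follow Deepgram’s public API, but the defaults here are illustrative assumptions, not Layercode’s exact internal configuration.

```python
from urllib.parse import urlencode

def deepgram_listen_url(
    model: str = "nova-3",
    language: str = "multi",      # "multi" selects Nova-3 multilingual
    encoding: str = "linear16",   # or "mulaw" for telephony transports
    sample_rate: int = 8000,
) -> str:
    """Build a Deepgram streaming URL (illustrative sketch).

    Mirrors the features described above: smart formatting, interim
    results, and MIP opt-out are enabled.
    """
    params = {
        "model": model,
        "language": language,
        "encoding": encoding,
        "sample_rate": sample_rate,
        "smart_format": "true",
        "interim_results": "true",
        "mip_opt_out": "true",
    }
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)
```

In direct mode the request would also carry an `Authorization: Token <your-api-key>` header; in the managed Cloudflare mode Layercode sets the credentials for you.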
Add your Deepgram API key in Settings → Providers to unlock direct Deepgram access. If you do not supply a key, the pipeline uses our Cloudflare-managed path with the same Nova-3 model. Deepgram Nova-3 combines the accuracy improvements from its launch release with Layercode’s interruption handling to keep transcripts actionable in real time.
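The μ-law normalization mentioned in the audio-formats bullet is standard G.711 companding: each 16-bit linear PCM sample is compressed to one 8-bit byte before it is streamed over telephony transports. A minimal sketch of the classic encoder (not Layercode’s actual implementation):

```python
BIAS = 0x84   # 132, added before segment search per G.711
CLIP = 32635  # largest magnitude representable after biasing

def linear_to_ulaw(sample: int) -> int:
    """Compress one 16-bit signed PCM sample to an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    if sample < 0:
        sample = -sample
    if sample > CLIP:
        sample = CLIP
    sample += BIAS

    # Find the segment (exponent): position of the highest set bit.
    exponent = 7
    mask = 0x4000
    while (sample & mask) == 0 and exponent > 0:
        exponent -= 1
        mask >>= 1

    mantissa = (sample >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF
```

Silence (sample 0) encodes to `0xFF` and full-scale positive input to `0x80`, the conventional G.711 values, which is a quick sanity check for any encoder.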