Base System Prompt for Voice AI
Minimal base prompt for Voice AI
Pronunciation of numbers, dates & times
Pronunciation of numbers, dates, times, and special characters is also crucial for voice applications. TTS (text-to-speech) providers handle pronunciation in different ways. A good base prompt that guides the LLM to spell out numbers, dates, addresses, etc. in words will work for common cases.
Numbers & data rules
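Prompt rules will not catch every case, so some pipelines also normalize text before it reaches the TTS provider. The following is a minimal sketch of that idea, not part of any provider's API: `number_to_words` and `spell_out_digits` are hypothetical helper names, the supported range (0 to 999,999) is an arbitrary choice, and digits that belong to decimals, times, dates, or longer runs such as phone numbers are deliberately left untouched because they need their own rules.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999,999 in English words."""
    if not 0 <= n <= 999_999:
        raise ValueError("supported range is 0 to 999,999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


# Match a standalone run of 1 to 6 digits. The lookarounds skip digits that
# touch letters or sit next to '.', ',', ':', '/' or '-' followed by a digit
# (decimals, times, dates, ranges), which this sketch does not handle.
STANDALONE_INT = re.compile(r"(?<![\w.,:/-])\d{1,6}(?![\w]|[.,:/-]\d)")


def spell_out_digits(text: str) -> str:
    """Replace standalone integers in text with their spelled-out form."""
    return STANDALONE_INT.sub(lambda m: number_to_words(int(m.group())), text)
```

A pass like this is best treated as a safety net behind the prompt, since the LLM has context (currency, ordinals, years) that a regular expression does not.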
Keep long paragraphs sounding natural
Most text-to-speech systems change prosody when they receive each sentence individually. If your voice agent needs to speak a large amount of text (e.g. long legal disclosures or policy statements), follow this guidance to keep paragraphs sounding natural:
- Send multiple sentences together. When you already have the full copy (for example, a static disclosure), pass the entire paragraph in a single stream.tts message so the speech engine can maintain the correct intonation.
- Wait for the model to finish generating. Fast models such as Gemini 2.5 Flash Lite can produce a paragraph quickly. Instead of streaming each partial sentence as soon as it appears, collect the complete paragraph and then forward the whole string to the TTS provider.
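The buffering described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than a provider's implementation: `speak_paragraphs` is a hypothetical helper, `send_tts` stands in for whatever call hands text to your TTS provider, and a blank line is assumed to mark the end of a paragraph in the model's output.

```python
from typing import Callable, Iterable


def speak_paragraphs(chunks: Iterable[str], send_tts: Callable[[str], None]) -> None:
    """Buffer streamed LLM text and forward it to TTS one whole paragraph at a time."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A blank line closes a paragraph: flush everything before it in one call.
        while "\n\n" in buffer:
            paragraph, buffer = buffer.split("\n\n", 1)
            if paragraph.strip():
                send_tts(paragraph.strip())
    # The model has finished generating: flush what remains as the last paragraph.
    if buffer.strip():
        send_tts(buffer.strip())


if __name__ == "__main__":
    # Simulated token stream; print stands in for the real TTS call.
    stream = ["This call may be ", "recorded. Your data is ", "kept for 30 days.\n",
              "\nHow can I ", "help you today?"]
    speak_paragraphs(stream, print)
```

The trade-off is latency: nothing is spoken until a full paragraph is available, which is acceptable with fast models or static copy but may feel slow with longer generations.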