How to select output mode | Sarvam API Docs

Saaras v3 supports multiple output modes to handle different transcription and translation needs. Use the mode parameter to specify how you want the audio processed.

The mode parameter is only available for Saaras v3. For legacy Saarika v2.5, only basic transcription is supported.

Output Mode Comparison

Each mode produces a different result for the same input audio. For a speaker saying “मेरा फोन नंबर है 9840950950” (My phone number is 9840950950), the outputs are:

Mode	Description	Example Output
`transcribe`	Standard transcription with number normalization	`मेरा फोन नंबर है 9840950950`
`translate`	Translate to English	`My phone number is 9840950950`
`verbatim`	Exact word-for-word, preserves spoken numbers	`मेरा फोन नंबर है नौ आठ चार zero नौ पांच zero नौ पांच zero`
`translit`	Romanized/Latin script	`mera phone number hai 9840950950`
`codemix`	English words in English, Indic words in native script	`मेरा phone number है 9840950950`
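
The five modes in the table, together with the restriction that `mode` is only available for Saaras v3, can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SDK: `build_stt_options` and `SAARAS_V3_MODES` are hypothetical names, and the model identifier `saarika:v2.5` used in the usage note is an assumption based on the `saaras:v3` naming pattern.

```python
from typing import Optional

# Modes listed in the comparison table on this page.
SAARAS_V3_MODES = ("transcribe", "translate", "verbatim", "translit", "codemix")


def build_stt_options(model: str, mode: Optional[str] = None) -> dict:
    """Hypothetical helper: build keyword arguments, rejecting invalid mode use."""
    options = {"model": model}
    if mode is None:
        # No mode requested, so nothing else to validate.
        return options
    if model != "saaras:v3":
        raise ValueError(f"mode is only available for saaras:v3, not {model!r}")
    if mode not in SAARAS_V3_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {SAARAS_V3_MODES}")
    options["mode"] = mode
    return options
```

With this helper, `build_stt_options("saaras:v3", "verbatim")` returns a dictionary containing both keys, while `build_stt_options("saarika:v2.5", "translate")` raises a `ValueError` before any network call is made.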

When to Use Each Mode

Mode	Best For
`transcribe`	Call recordings, meetings, voice notes, general transcription
`translate`	Analytics dashboards, English-only systems, international teams
`verbatim`	Legal transcriptions, compliance, preserving exact spoken content
`translit`	Systems that only support Latin characters, search indexing
`codemix`	Hinglish conversations, mixed-language customer support
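
The guidance in the table can be expressed as a small decision function. This is only an illustrative sketch: `suggest_mode` and its flag names are hypothetical, and the order in which the flags are checked is an assumption, not something the API prescribes.

```python
def suggest_mode(
    *,
    english_output: bool = False,
    exact_wording: bool = False,
    latin_script_only: bool = False,
    code_mixed: bool = False,
) -> str:
    """Hypothetical helper: map a requirement to one of the documented modes."""
    if english_output:
        # English-only systems, analytics dashboards, international teams.
        return "translate"
    if exact_wording:
        # Legal or compliance transcripts that must preserve spoken content.
        return "verbatim"
    if latin_script_only:
        # Downstream systems that only support Latin characters.
        return "translit"
    if code_mixed:
        # Hinglish or other mixed-language conversations.
        return "codemix"
    # General transcription: call recordings, meetings, voice notes.
    return "transcribe"
```

The returned string can be passed directly as the `mode` argument of a transcription request.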

Example Code

Transcribe

from sarvamai import SarvamAI

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Standard transcription in the original language.
# The with-block closes the audio file once the request completes.
with open("audio.wav", "rb") as audio_file:
    response = client.speech_to_text.transcribe(
        file=audio_file,
        model="saaras:v3",
        language_code="hi-IN",
        mode="transcribe",  # Default mode
    )

print(response.transcript)
# Output: मेरा फोन नंबर है 9840950950
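
The other modes use the same call with a different `mode` value. To compare them side by side, the request can be repeated once per mode. The sketch below assumes that every mode returns its result in the `transcript` field, as the `transcribe` example does; `transcribe_all_modes` and `OUTPUT_MODES` are hypothetical names, not part of the SDK.

```python
from typing import Dict

# Modes documented on this page for saaras:v3.
OUTPUT_MODES = ("transcribe", "translate", "verbatim", "translit", "codemix")


def transcribe_all_modes(
    client, audio_path: str, language_code: str = "hi-IN"
) -> Dict[str, str]:
    """Hypothetical helper: run one audio file through every output mode.

    `client` is expected to expose `speech_to_text.transcribe(...)` with the
    keyword arguments used in the example above.
    """
    results = {}
    for mode in OUTPUT_MODES:
        # Reopen the file for each request so every call reads from the start.
        with open(audio_path, "rb") as audio_file:
            response = client.speech_to_text.transcribe(
                file=audio_file,
                model="saaras:v3",
                language_code=language_code,
                mode=mode,
            )
        # Assumption: each mode returns its text in `transcript`.
        results[mode] = response.transcript
    return results
```

Calling `transcribe_all_modes(client, "audio.wav")` with the `client` created in the example above returns a dictionary keyed by mode name. Note that this sends five separate requests for the same audio file.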