How to set the audio format for output using output_audio_codec

The output_audio_codec parameter defines the audio format for the streamed speech output. It must be set in the config message before sending any text.
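
The ordering requirement can be sketched as follows. This is a minimal illustration, not the service's documented wire format: the `type`/`data` envelope and the `build_messages` helper are hypothetical names chosen for the example. Only the `output_audio_codec` key and the codec names come from this page.

```python
import json

# Codec names taken from the supported-codecs table on this page.
SUPPORTED_CODECS = {"mp3", "aac", "alaw", "flac", "linear16", "mulaw", "opus", "wav"}


def build_messages(codec, texts):
    """Return JSON messages in send order: the config message first, then the text.

    The "type"/"data" envelope is a hypothetical shape used only for illustration.
    """
    if codec not in SUPPORTED_CODECS:
        raise ValueError(f"unsupported output_audio_codec: {codec!r}")
    # The codec is declared once, in the config message, before any text is queued.
    messages = [json.dumps({"type": "config", "data": {"output_audio_codec": codec}})]
    for text in texts:
        messages.append(json.dumps({"type": "text", "data": {"text": text}}))
    return messages


messages = build_messages("opus", ["Hello", "Welcome to Sarvam AI!"])
```

Building the list in one place makes it hard to send text before the codec is configured, which is the mistake the rule above guards against.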

If output_audio_codec is not specified, the audio is streamed as base64-encoded MPEG audio (the `mp3` codec).
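
Because the streamed audio is base64-encoded, each chunk has to be decoded before it can be played or saved. The sketch below assumes the chunks arrive as base64 strings and that their decoded bytes can be concatenated in arrival order. The `decode_audio_chunks` helper and the stand-in payloads are illustrative, not part of the SDK.

```python
import base64


def decode_audio_chunks(encoded_chunks):
    """Decode base64 strings and join the raw bytes in arrival order."""
    return b"".join(base64.b64decode(chunk) for chunk in encoded_chunks)


# Stand-in payloads; a real stream would carry encoded audio frames instead.
chunks = [
    base64.b64encode(b"frame-1").decode("ascii"),
    base64.b64encode(b"frame-2").decode("ascii"),
]
audio_bytes = decode_audio_chunks(chunks)
```

The resulting bytes can then be written to a file opened in binary mode (`"wb"`) or handed to an audio player.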

The choice of codec affects:

Audio quality
File size
Playback compatibility
Latency

Supported Audio Codecs

Codec	Description
`mp3`	MPEG Layer-3 – widely supported, good compression
`aac`	Advanced Audio Coding – good compression, high quality
`alaw`	8-bit logarithmic PCM (A-law) – used in telephony
`flac`	Lossless format – high fidelity audio
`linear16`	Uncompressed 16-bit linear PCM – large size, accurate
`mulaw`	8-bit logarithmic PCM (μ-law) – used in telephony
`opus`	Optimized for speech and streaming
`wav`	Standard uncompressed format, large files
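
The size difference between the uncompressed and telephony codecs can be estimated from their sample widths: `linear16` stores 16 bits per sample, while `alaw` and `mulaw` store 8. The sketch below is a rough estimate only. The 8000 Hz mono default is an assumption (a common telephony rate), not a value stated on this page, and `raw_audio_bytes` is a hypothetical helper.

```python
# Bits per sample for the fixed-rate codecs in the table above.
BITS_PER_SAMPLE = {"linear16": 16, "alaw": 8, "mulaw": 8}


def raw_audio_bytes(codec, seconds, sample_rate_hz=8000, channels=1):
    """Estimate payload size in bytes, excluding any headers or base64 overhead.

    sample_rate_hz=8000 is an assumed default, not a documented one.
    """
    bits = BITS_PER_SAMPLE[codec]
    return int(seconds * sample_rate_hz * channels * bits // 8)


print(raw_audio_bytes("linear16", 10))  # 160000
print(raw_audio_bytes("mulaw", 10))     # 80000
```

At the same sample rate, `alaw` and `mulaw` need half the bytes of `linear16`. Compressed codecs such as `mp3`, `aac`, and `opus` have variable sizes that cannot be computed this way.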

Example Code

REST API

from sarvamai import SarvamAI
from sarvamai.play import save

# Initialize the REST client
client = SarvamAI(api_subscription_key="YOUR_API_SUBSCRIPTION_KEY")

# Generate speech using REST
audio = client.text_to_speech.convert(
    text="Welcome to Sarvam AI!",
    target_language_code="en-IN",
    output_audio_codec="aac"
)
save(audio, "output1.aac")
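
When the codec is chosen at runtime, the output filename should change with it so the extension does not misdescribe the contents. The helper below is a sketch with no SDK calls. The `CODEC_EXTENSIONS` mapping and `output_filename` are hypothetical, and the extensions for `opus` and the raw telephony formats are conventions assumed for this example, since this page does not say which container the service uses.

```python
CODEC_EXTENSIONS = {
    "mp3": ".mp3",
    "aac": ".aac",
    "flac": ".flac",
    "wav": ".wav",
    # Assumed conventions: this page does not specify containers for these codecs.
    "opus": ".opus",
    "linear16": ".pcm",
    "alaw": ".alaw",
    "mulaw": ".ulaw",
}


def output_filename(stem, codec):
    """Return stem plus an extension matching the requested codec."""
    try:
        return stem + CODEC_EXTENSIONS[codec]
    except KeyError:
        raise ValueError(f"unsupported output_audio_codec: {codec!r}") from None


print(output_filename("output1", "aac"))  # prints "output1.aac"
```

The same codec string can then be passed as `output_audio_codec` and used to build the path given to `save`, so the two values stay in step.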