How to set the audio format for output using output_audio_codec

The output_audio_codec parameter defines the audio format for the streamed speech output. It must be set in the config message before sending any text.

If not specified, the audio is streamed in base64-encoded MPEG format by default.
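Because the audio arrives base64-encoded, it has to be decoded before it can be played or saved. The following is a minimal sketch of that step using only the Python standard library. It assumes each chunk is already available as a plain base64 string; how the chunks are extracted from the response is not covered here, and the helper names `decode_audio_chunks` and `write_audio` are hypothetical.

```python
import base64
from pathlib import Path


def decode_audio_chunks(chunks: list[str]) -> bytes:
    """Decode base64-encoded audio chunks and join them into one byte string."""
    # Assumption: each chunk is an independently encoded base64 string.
    return b"".join(base64.b64decode(chunk) for chunk in chunks)


def write_audio(chunks: list[str], path: str) -> int:
    """Write the decoded audio to `path` and return the number of bytes written."""
    data = decode_audio_chunks(chunks)
    Path(path).write_bytes(data)
    return len(data)
```

The decoded bytes are in whatever format output_audio_codec selected, so the file extension should match that codec.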

The codec you choose affects:

  • Audio quality
  • File size
  • Playback compatibility
  • Latency
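To make the file-size trade-off concrete, the sketch below estimates the raw payload size for the uncompressed codecs listed in the table that follows. It relies on two general facts: 16-bit linear PCM uses 2 bytes per sample, and A-law and μ-law use 1 byte per sample. The 8000 Hz sample rate and mono channel are assumptions chosen for illustration, not values taken from the API, and `estimate_pcm_bytes` is a hypothetical helper.

```python
# Bytes per sample for the uncompressed / companded codecs.
BYTES_PER_SAMPLE = {"linear16": 2, "alaw": 1, "mulaw": 1}


def estimate_pcm_bytes(
    codec: str,
    seconds: float,
    sample_rate_hz: int = 8000,  # assumption: telephony-style sample rate
    channels: int = 1,           # assumption: mono audio
) -> int:
    """Estimate the raw audio size in bytes, excluding headers and base64 overhead."""
    return int(seconds * sample_rate_hz * channels * BYTES_PER_SAMPLE[codec])
```

Under these assumptions, 10 seconds of linear16 audio comes to 160,000 bytes and the same clip in mulaw to 80,000 bytes. Compressed codecs such as mp3, aac, and opus depend on bitrate and cannot be estimated this way.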

Supported Audio Codecs

Codec      Description
mp3        MPEG Layer-3 – widely supported, good compression
aac        Advanced Audio Coding – good compression, high quality
alaw       8-bit logarithmic PCM – used in telephony
flac       Lossless format – high fidelity audio
linear16   Uncompressed PCM audio – large size, accurate
mulaw      Similar to alaw, used in telephony
opus       Optimized for speech and streaming
wav        Standard uncompressed format, large files
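A small guard can catch a mistyped codec name before a request is sent. The sketch below checks a value against the codec names in the table above and derives an output filename. The file extensions are conventional choices and an assumption here, especially for the raw formats alaw, mulaw, and linear16. The helpers `normalize_codec` and `output_filename` are hypothetical and not part of the sarvamai package.

```python
# Codec names taken from the table above.
SUPPORTED_CODECS = ("mp3", "aac", "alaw", "flac", "linear16", "mulaw", "opus", "wav")

# Assumption: extension choices are illustrative, not mandated by the API.
FILE_EXTENSIONS = {
    "mp3": "mp3",
    "aac": "aac",
    "alaw": "alaw",
    "flac": "flac",
    "linear16": "pcm",
    "mulaw": "ulaw",
    "opus": "opus",
    "wav": "wav",
}


def normalize_codec(codec: str) -> str:
    """Return the lowercase codec name, or raise ValueError if it is not in the table."""
    name = codec.strip().lower()
    if name not in SUPPORTED_CODECS:
        raise ValueError(
            f"Unsupported codec {codec!r}; expected one of {', '.join(SUPPORTED_CODECS)}"
        )
    return name


def output_filename(stem: str, codec: str) -> str:
    """Build a filename whose extension matches the chosen codec."""
    return f"{stem}.{FILE_EXTENSIONS[normalize_codec(codec)]}"
```

For example, `output_filename("output1", "aac")` yields `output1.aac`, while an unlisted name such as `"ogg"` raises a ValueError.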

Example Code

from sarvamai import SarvamAI
from sarvamai.play import save

# Initialize the REST client
client = SarvamAI(api_subscription_key="YOUR_API_SUBSCRIPTION_KEY")

# Generate speech using REST
audio = client.text_to_speech.convert(
    text="Welcome to Sarvam AI!",
    target_language_code="en-IN",
    output_audio_codec="aac"
)
save(audio, "output1.aac")