How to set the audio format for output using output_audio_codec

The output_audio_codec parameter defines the audio format for the streamed speech output. It must be set in the config message before sending any text.

If output_audio_codec is not specified, the audio is streamed as base64-encoded MPEG audio by default.
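Because the streamed audio is base64-encoded, a client has to decode it before saving or playing it. The following is a minimal sketch using only the Python standard library. The helper name `decode_audio_chunks` and the `sample_chunks` list are hypothetical stand-ins for the payloads received from the stream; they are not part of the SDK.

```python
import base64


def decode_audio_chunks(b64_chunks):
    """Decode base64-encoded audio chunks and join them into raw bytes.

    `b64_chunks` is a hypothetical iterable of base64 strings standing in
    for the audio payloads received from the stream.
    """
    return b"".join(base64.b64decode(chunk) for chunk in b64_chunks)


# Stand-in data built locally for illustration; this is not real API output.
sample_chunks = [
    base64.b64encode(b"first-part-").decode("ascii"),
    base64.b64encode(b"second-part").decode("ascii"),
]

audio_bytes = decode_audio_chunks(sample_chunks)
```

The resulting bytes are in whichever codec was requested, so they can be written to a file in binary mode or handed to a player that understands that codec.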

The choice of codec affects:

  • Audio quality
  • File size
  • Playback compatibility
  • Latency

Supported Audio Codecs

| Codec | Description |
|---|---|
| mp3 | MPEG Layer-3: widely supported, good compression |
| aac | Advanced Audio Coding: good compression, high quality |
| alaw | 8-bit logarithmic PCM: used in telephony |
| flac | Lossless format: high fidelity audio |
| linear16 | Uncompressed PCM audio: large size, accurate |
| mulaw | Similar to alaw, used in telephony |
| opus | Optimized for speech and streaming |
| wav | Standard uncompressed format, large files |
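The size difference between the PCM-style codecs in the table can be made concrete with a little arithmetic: linear16 stores 2 bytes per sample, while alaw and mulaw store 1 byte per sample. The sketch below estimates the raw payload size under those assumptions. The function name `pcm_size_bytes`, the 8000 Hz sample rate, and the 10-second duration are illustrative choices, and container or header overhead is ignored.

```python
# Bytes per sample for the uncompressed and companded PCM codecs.
BYTES_PER_SAMPLE = {"linear16": 2, "alaw": 1, "mulaw": 1}


def pcm_size_bytes(codec, sample_rate_hz, seconds, channels=1):
    """Estimate the raw audio payload size in bytes (no header overhead)."""
    return BYTES_PER_SAMPLE[codec] * sample_rate_hz * seconds * channels


# Illustrative values: 10 seconds of mono audio at 8000 Hz.
linear16_size = pcm_size_bytes("linear16", 8000, 10)  # 160000 bytes
mulaw_size = pcm_size_bytes("mulaw", 8000, 10)        # 80000 bytes
```

Compressed codecs such as mp3, aac, and opus do not have a fixed bytes-per-sample figure, so their size depends on the bitrate rather than on this formula.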

Example Code

REST API
```python
from sarvamai import SarvamAI
from sarvamai.play import save

# Initialize the REST client
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Generate speech using REST, requesting AAC output
audio = client.text_to_speech.convert(
    text="Welcome to Sarvam AI!",
    model="bulbul:v3",
    target_language_code="en-IN",
    output_audio_codec="aac"
)

# Save the audio with a file extension that matches the codec
save(audio, "output1.aac")
```
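When the codec is chosen at runtime, it can help to validate it against the supported list and derive a matching file name, so the extension does not drift from the requested format. This is a minimal sketch and not part of the SDK: `output_filename` is a hypothetical helper, and the extensions chosen for the raw formats (linear16, alaw, mulaw) are assumed conventions rather than anything the API prescribes.

```python
# Codec names taken from the supported codecs list on this page.
SUPPORTED_CODECS = ("mp3", "aac", "alaw", "flac", "linear16", "mulaw", "opus", "wav")

# Assumed file extensions; the raw-format ones are conventions only.
EXTENSIONS = {
    "mp3": ".mp3",
    "aac": ".aac",
    "flac": ".flac",
    "opus": ".opus",
    "wav": ".wav",
    "linear16": ".pcm",
    "alaw": ".alaw",
    "mulaw": ".ulaw",
}


def output_filename(stem, codec):
    """Return a file name for `codec`, or raise ValueError if unsupported."""
    if codec not in SUPPORTED_CODECS:
        raise ValueError(f"Unsupported output_audio_codec: {codec!r}")
    return stem + EXTENSIONS[codec]


filename = output_filename("output1", "aac")
```

The same codec string can then be passed as output_audio_codec in the request and the returned name used when saving the audio.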