How to set output_audio_bitrate

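The output_audio_bitrate parameter sets the bitrate of the output audio stream in kilobits per second, passed as a string such as "128k". A higher bitrate generally gives better audio quality at the cost of larger output and more bandwidth while streaming; a lower bitrate trades quality for smaller, faster transfers.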

This parameter is available only in the WebSocket Streaming API. It is not supported in the REST API.

This parameter is optional. If not provided, the system uses the default value 128k.
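
To make the size trade-off concrete, here is a rough sketch that estimates how much audio data one minute of speech produces at each supported bitrate listed on this page. It assumes constant-bitrate encoding, treats 1 kilobit as 1000 bits, and ignores container overhead, so real output sizes will differ somewhat. The helper name estimated_size_bytes is hypothetical and not part of the Sarvam SDK.

```python
def estimated_size_bytes(bitrate: str, duration_seconds: float) -> int:
    """Approximate audio payload size, assuming constant bitrate (an assumption)."""
    kbps = int(bitrate.rstrip("k"))  # "128k" -> 128
    bytes_per_second = kbps * 1000 / 8  # kilobits/s -> bytes/s
    return int(bytes_per_second * duration_seconds)


if __name__ == "__main__":
    # Compare one minute of audio across the supported bitrate values.
    for bitrate in ("32k", "64k", "96k", "128k", "192k"):
        size_kb = estimated_size_bytes(bitrate, 60) / 1000
        print(f"{bitrate}: about {size_kb:.0f} kB per minute")
```

Under these assumptions, moving from 128k down to 64k roughly halves the amount of data sent over the WebSocket for the same text.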

Supported Bitrates

| Value | Description |
|-------|-------------|
| 32k | Very low bitrate, smaller file, lower quality |
| 64k | Low bitrate, suitable for speech |
| 96k | Medium quality |
| 128k | Default: good balance of quality and size |
| 192k | High quality, larger audio files |
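
Since only the values in the table are accepted, it can help to check a bitrate on the client side before opening a connection. The following is a minimal sketch of such a check; SUPPORTED_BITRATES, DEFAULT_BITRATE, and resolve_bitrate are hypothetical names introduced for illustration, not part of the Sarvam SDK, and the fallback simply mirrors the documented default of 128k.

```python
from typing import Optional

# Values copied from the Supported Bitrates table above.
SUPPORTED_BITRATES = ("32k", "64k", "96k", "128k", "192k")
DEFAULT_BITRATE = "128k"  # documented default when the parameter is omitted


def resolve_bitrate(value: Optional[str] = None) -> str:
    """Return a supported bitrate string, falling back to the default."""
    if value is None:
        return DEFAULT_BITRATE
    if value not in SUPPORTED_BITRATES:
        raise ValueError(
            f"Unsupported bitrate {value!r}; expected one of {SUPPORTED_BITRATES}"
        )
    return value


if __name__ == "__main__":
    print(resolve_bitrate())       # falls back to the default
    print(resolve_bitrate("64k"))  # passes a supported value through
```

The returned string could then be passed as output_audio_bitrate when configuring the streaming connection.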

Example Streaming API code

import asyncio
import base64

from sarvamai import AsyncSarvamAI, AudioOutput


async def tts_stream():
    client = AsyncSarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

    async with client.text_to_speech_streaming.connect(model="bulbul:v3") as ws:
        # Set the output bitrate together with the language and speaker.
        await ws.configure(
            target_language_code="hi-IN",
            speaker="shubh",
            output_audio_bitrate="128k",
        )
        print("Sent configuration")

        # Hindi sample text. In English: "India's culture is one of the oldest
        # and richest in the world. It is a wonderful blend of diversity,
        # tolerance and traditions, spanning many religions, languages,
        # festivals, music, dance, architecture and ways of life."
        text = (
            "भारत की संस्कृति विश्व की सबसे प्राचीन और समृद्ध संस्कृतियों में से एक है। "
            "यह विविधता, सहिष्णुता और परंपराओं का अद्भुत संगम है, "
            "जिसमें विभिन्न धर्म, भाषाएं, त्योहार, संगीत, नृत्य, वास्तुकला और जीवनशैली शामिल हैं।"
        )

        await ws.convert(text)
        print("Sent text message")

        # Flush so that any buffered text is processed.
        await ws.flush()
        print("Flushed buffer")

        chunk_count = 0
        with open("output.mp3", "wb") as f:
            async for message in ws:
                if isinstance(message, AudioOutput):
                    chunk_count += 1
                    # Audio chunks arrive base64-encoded; decode before writing.
                    audio_chunk = base64.b64decode(message.data.audio)
                    f.write(audio_chunk)
                    f.flush()

        print(f"All {chunk_count} chunks saved to output.mp3")
        print("Audio generation complete")

        # Close the underlying socket explicitly if it is still open.
        if hasattr(ws, "_websocket") and not ws._websocket.closed:
            await ws._websocket.close()
            print("WebSocket connection closed.")


if __name__ == "__main__":
    asyncio.run(tts_stream())

# --- Notebook/Colab usage ---
# await tts_stream()