Text-to-Speech Overview

Sarvam AI offers one text-to-speech model, Bulbul V3, which provides 30+ voices and high-quality, natural speech synthesis for Indian languages.

Bulbul V3

Advanced text-to-speech model with 30+ voices and high-quality natural speech synthesis for Indian languages.

API Types

Available API types: REST API for quick conversions up to 2500 characters, and Streaming API for real-time audio via HTTP stream or WebSocket.

REST API

Generate speech for short text with immediate response. Best for quick conversions up to 2500 characters.
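Text longer than the 2500-character limit has to be split before it is sent through the REST API. The following is a minimal sketch of one way to do that; it is not taken from the Sarvam SDK. The helper name `split_for_rest` is hypothetical, and it assumes the limit counts characters the same way Python's `len()` does (Unicode code points), which this page does not confirm.

```python
REST_CHAR_LIMIT = 2500  # limit stated in this overview


def split_for_rest(text: str, limit: int = REST_CHAR_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at a space so words stay intact; falls back to a
    hard cut when a single run of text is longer than the limit.
    Assumption: the API counts characters like Python's len().
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    chunks: list[str] = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Last space at or before the limit; -1 means no space was found.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard cut at the limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as its own request and the resulting audio concatenated in order. Splitting on sentence boundaries instead of spaces would likely give more natural prosody, at the cost of a slightly more involved helper.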

Streaming API

Stream audio in real time, either through a single HTTP POST for simple pipelines or over a persistent WebSocket connection for interactive voice agents.

Not sure which one fits your latency and interactivity needs? See Which Text-to-Speech API to Use for a side-by-side comparison of REST, HTTP streaming, and WebSocket.
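As a rough illustration of the descriptions above, the choice can be written as a small decision helper. This is only a sketch: `TtsApi` and `suggest_api` are hypothetical names, and routing over-limit text to HTTP streaming is an assumption rather than something this page states. The linked comparison page remains the authoritative guide.

```python
from enum import Enum

REST_CHAR_LIMIT = 2500  # limit stated in this overview


class TtsApi(Enum):
    REST = "rest"
    HTTP_STREAM = "http_stream"
    WEBSOCKET = "websocket"


def suggest_api(
    text_length: int,
    needs_streaming: bool = False,
    interactive: bool = False,
) -> TtsApi:
    """Suggest an API type from the descriptions in this overview."""
    if interactive:
        # Persistent connection for interactive voice agents.
        return TtsApi.WEBSOCKET
    if needs_streaming or text_length > REST_CHAR_LIMIT:
        # Assumption: text over the REST limit is better served by streaming.
        return TtsApi.HTTP_STREAM
    # Short text with an immediate response.
    return TtsApi.REST
```

For example, `suggest_api(300)` returns `TtsApi.REST`, while `suggest_api(300, interactive=True)` returns `TtsApi.WEBSOCKET`.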

Supported Audio Formats & MIME Types

The TTS API supports eight audio formats. The format groups and their MIME type values are listed below:

Format Group | Supported MIME Types
MP3 Variants | mp3
WAV Variants | wav
AAC Variants | aac
OPUS Format | opus
FLAC Variants (Lossless) | flac
PCM LINEAR16 | pcm
MULAW (μ-law) | mulaw
ALAW (A-law) | alaw
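A client can check a requested format against this list before sending a request, so that typos fail early instead of surfacing as an API error. The sketch below assumes the API expects the lowercase values exactly as they appear in the table. `normalize_codec` is a hypothetical helper, not part of any Sarvam library, and the name of the request field that carries this value is not given on this page.

```python
# Values copied from the table above.
SUPPORTED_CODECS = frozenset(
    {"mp3", "wav", "aac", "opus", "flac", "pcm", "mulaw", "alaw"}
)


def normalize_codec(name: str) -> str:
    """Return the lowercase format value, or raise if it is not in the table."""
    codec = name.strip().lower()
    if codec not in SUPPORTED_CODECS:
        raise ValueError(
            f"Unsupported audio format {name!r}; "
            f"expected one of {sorted(SUPPORTED_CODECS)}"
        )
    return codec
```

Lowercasing the input is a client-side convenience only; whether the API itself is case-sensitive is not stated here.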

Experience the voices: Head to dashboard.sarvam.ai to explore 30+ speaker voices, test different languages, and generate audio samples with custom input.

Next Steps

1. Choose Your API: Select the appropriate API type based on your use case.
2. Get API Key: Sign up and get your API key from the dashboard.
3. Go Live: Deploy your integration and monitor usage in the dashboard.

Need help choosing the right API? Contact us on Discord for guidance.