Text-to-Speech Overview
Sarvam AI offers a powerful text-to-speech model, accessible through the two API types described below.
API Types
REST API
Generate speech from short text with an immediate response. Best for quick conversions of up to 1000 characters.
Streaming API
Stream long or live text into speech with low latency. Ideal for real-time playback, WebSocket-based async use, and efficient resource handling.
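To make the choice between the two API types concrete, here is a minimal sketch of a routing helper. It is an illustration only, not part of the Sarvam SDK: the function name choose_tts_api, the live flag, and the constant REST_MAX_CHARS are hypothetical, and it assumes the 1000-character limit is counted the way Python's len() counts characters, which may differ from how the service counts them.

```python
# Hypothetical helper, not a Sarvam API: picks an API type from the guidance above.
REST_MAX_CHARS = 1000  # REST API limit stated in this overview


def choose_tts_api(text: str, live: bool = False) -> str:
    """Return "rest" for short, one-shot text and "streaming" otherwise.

    `live` marks text that arrives incrementally (e.g. from a chat session),
    which the overview recommends sending through the Streaming API.
    """
    if live or len(text) > REST_MAX_CHARS:
        return "streaming"
    return "rest"


if __name__ == "__main__":
    print(choose_tts_api("Hello, welcome to the demo."))
    print(choose_tts_api("long paragraph " * 200))
```

A caller would run this check before sending a request, so that text over the REST limit is routed to streaming instead of being rejected or truncated.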
Supported Audio Formats & MIME Types
The TTS API supports over 8 major audio formats and MIME type variants. Supported formats and MIME types are listed below:
Experience the voices: Head to dashboard.sarvam.ai to explore 30+ speaker voices, test different languages, and generate audio samples with custom input.
Next Steps
Need help choosing the right API? Contact us on Discord for guidance.