Text-to-Speech REST API
Synchronous Processing
Convert text to speech with immediate response. Best for quick conversions and testing. Features include:
- Instant audio generation
- Multiple voice options
- Support for SSML
- Various audio formats
API Features
Key Features
- Support for code-mixed text
- Multiple speaker voices
- Adjustable speech parameters
- High-quality audio output
Output Format
- Wave file format
- Base64 encoded string
- Configurable sample rates
- Multiple quality options
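Since the audio is delivered as a Base64-encoded WAV string, a client has to decode it before saving or playing it. The following is a minimal sketch using only the Python standard library. The helper names (`decode_wav`, `wav_info`, `make_silent_wav_b64`) are illustrative and not part of the API, and the silent clip only stands in for a real response.

```python
import base64
import io
import wave


def decode_wav(b64_audio: str) -> bytes:
    """Decode a Base64 string into raw WAV file bytes."""
    return base64.b64decode(b64_audio)


def wav_info(wav_bytes: bytes) -> dict:
    """Read basic header fields from in-memory WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }


def make_silent_wav_b64(sample_rate: int = 22050, frames: int = 100) -> str:
    """Build a tiny silent WAV, Base64-encoded, as a stand-in for an API response."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frames)
    return base64.b64encode(buf.getvalue()).decode("ascii")
```

To save real output, write the result of `decode_wav(...)` to a file opened in binary mode with a `.wav` extension.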
Speech Parameters
- Pitch control
- Speech rate adjustment
- Volume control
- Language selection
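One way to keep these parameters together on the client side is a small container that emits only the values that were explicitly set, so that unset fields fall back to server defaults. This is a sketch: the field names `target_language_code`, `pitch`, `pace`, and `loudness` are assumptions for illustration, and valid ranges are not stated here, so none are enforced.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SpeechParams:
    # Field names are assumed for illustration; confirm them in the API reference.
    target_language_code: str
    pitch: Optional[float] = None      # pitch control
    pace: Optional[float] = None       # speech rate adjustment
    loudness: Optional[float] = None   # volume control

    def to_payload(self) -> dict:
        """Return only the fields that were explicitly set."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

For example, `SpeechParams("hi-IN", pace=1.2).to_payload()` yields a dictionary with just the language code and the pace.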
Integration
- Simple REST API
- Multiple language SDKs
- Comprehensive documentation
- Easy-to-follow examples
Model Information
Bulbul v2
Our flagship text-to-speech model designed for Indian languages and accents.
Key Features:
- Natural-sounding speech with human-like prosody
- Multiple voice personalities
- Multi-language and code-mixed text support
- Real-time synthesis capabilities
- Fine-grained control over pitch, pace, and loudness
Language Support
Bulbul v2 supports 11 languages, each identified by a BCP-47 code:
- English (en-IN)
- Hindi (hi-IN)
- Bengali (bn-IN)
- Tamil (ta-IN)
- Telugu (te-IN)
- Kannada (kn-IN)
- Malayalam (ml-IN)
- Marathi (mr-IN)
- Gujarati (gu-IN)
- Punjabi (pa-IN)
- Odia (or-IN)
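A client can check a language code against this list before making a request, which catches typos and casing mistakes early. The sketch below mirrors the codes listed above; the function name `normalize_language_code` is hypothetical.

```python
SUPPORTED_LANGUAGES = {
    "en-IN": "English",
    "hi-IN": "Hindi",
    "bn-IN": "Bengali",
    "ta-IN": "Tamil",
    "te-IN": "Telugu",
    "kn-IN": "Kannada",
    "ml-IN": "Malayalam",
    "mr-IN": "Marathi",
    "gu-IN": "Gujarati",
    "pa-IN": "Punjabi",
    "or-IN": "Odia",
}


def normalize_language_code(code: str) -> str:
    """Return the canonical form (e.g. 'hi-in' -> 'hi-IN') or raise ValueError."""
    parts = code.strip().split("-")
    if len(parts) != 2:
        raise ValueError(f"Expected a code like 'hi-IN', got {code!r}")
    canonical = f"{parts[0].lower()}-{parts[1].upper()}"
    if canonical not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return canonical
```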
Bulbul: Our Text to Speech Model
Bulbul is our state-of-the-art text-to-speech model that excels in generating natural-sounding speech with support for multiple Indian languages, code-mixing, and various voice options.
Text to Speech Features
Basic Text to Speech Synthesis
Convert text to natural-sounding speech with high quality. Features include:
- Multiple voice options
- Support for Indian languages
- Natural prosody and intonation
- High-quality audio output
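A basic synthesis call could be assembled roughly as follows. This is a sketch under stated assumptions, not the documented request format: the endpoint URL is a placeholder, and the header name `api-subscription-key` and the body fields `inputs` and `target_language_code` are assumed for illustration. The code only builds the request object and sends nothing.

```python
import json
import urllib.request

# Placeholder endpoint; replace with the real URL from the API reference.
API_URL = "https://api.example.com/text-to-speech"


def build_tts_request(text: str, language_code: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one text input."""
    payload = {
        "inputs": [text],                       # assumed field name
        "target_language_code": language_code,  # assumed field name
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "api-subscription-key": api_key,    # assumed header name
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body would then need to be parsed as JSON and the audio decoded from Base64.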
Key Considerations
- Text length limit: 500 characters per input
- Maximum 3 texts per API call
- For numbers with more than 4 digits, use commas (e.g., ‘10,000’)
- Enable preprocessing for better mixed-language handling
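Longer passages therefore have to be split on the client side. The sketch below shows one possible approach: break the text on whitespace into pieces of at most 500 characters, then group the pieces three per call. The function names are hypothetical, and splitting on whitespace rather than at sentence boundaries is a simplification that may affect prosody at the joins.

```python
from typing import List

MAX_CHARS = 500          # per-input limit stated above
MAX_TEXTS_PER_CALL = 3   # per-call limit stated above


def split_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text on whitespace into chunks no longer than max_chars."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # Hard-split any single word that exceeds the limit on its own.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def batch(items: List[str], size: int = MAX_TEXTS_PER_CALL) -> List[List[str]]:
    """Group chunks into lists of at most `size` items, one list per API call."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Each inner list returned by `batch(split_text(long_text))` then corresponds to the inputs of one API call.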