Bulbul
Bulbul v3 is our latest text-to-speech model, specifically designed for Indian languages and accents. It features improved audio quality, 30+ speaker voices, and supports up to 2500 characters per request.
Key Features
Wide selection of natural-sounding voices including Shubh, Aditya, Ritu, Simran, Anand, Roopa, Priya, and more.
Support for up to 2500 characters per request for longer content generation.
Multiple sample rates: 8 kHz, 16 kHz, 22.05 kHz, and 24 kHz (default). Higher rates (32 kHz, 44.1 kHz, 48 kHz) are available only through the bulbul:v3 REST API.
Support for 11 Indian languages with BCP-47 codes. The target language code is primarily used by the pre-TTS text normalization model.
Human-like speech patterns with natural intonation and emotional expression.
Adjustable speech speed from 0.5x to 2.0x for customized delivery.
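The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the function name check_request, the using_rest flag, and the default pace of 1.0 are assumptions made for illustration, and it assumes that "characters" corresponds to Python's len() of the string.

```python
# Limits taken from the feature list above.
MAX_CHARS = 2500
COMMON_SAMPLE_RATES = {8000, 16000, 22050, 24000}
REST_ONLY_SAMPLE_RATES = {32000, 44100, 48000}
DEFAULT_SAMPLE_RATE = 24000
MIN_PACE, MAX_PACE = 0.5, 2.0


def check_request(text, sample_rate=DEFAULT_SAMPLE_RATE, pace=1.0, using_rest=True):
    """Return a list of problems; an empty list means the request is within limits.

    Hypothetical helper: the name and arguments are illustrative only.
    The pace default of 1.0 (normal speed) is an assumption.
    """
    problems = []

    # Assumption: the character limit counts Unicode code points, as len() does.
    if len(text) > MAX_CHARS:
        problems.append(f"text has {len(text)} characters; the limit is {MAX_CHARS}")

    if sample_rate in REST_ONLY_SAMPLE_RATES:
        # 32, 44.1 and 48 kHz are documented as REST API only.
        if not using_rest:
            problems.append(f"{sample_rate} Hz is only available through the REST API")
    elif sample_rate not in COMMON_SAMPLE_RATES:
        problems.append(f"{sample_rate} Hz is not a listed sample rate")

    if not MIN_PACE <= pace <= MAX_PACE:
        problems.append(f"pace {pace} is outside {MIN_PACE} to {MAX_PACE}")

    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once, for example check_request(long_text, sample_rate=48000, using_rest=False).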
Language Support
Bulbul v3 supports the following Indian languages:
Hindi (hi-IN), Bengali (bn-IN), Tamil (ta-IN), Telugu (te-IN), Gujarati (gu-IN), Kannada (kn-IN), Malayalam (ml-IN), Marathi (mr-IN), Punjabi (pa-IN), Odia (od-IN), English (en-IN)
Available Speakers
Bulbul v3 offers 30+ speaker voices:
Speakers: Shubh (default), Aditya, Ritu, Priya, Neha, Rahul, Pooja, Rohan, Simran, Kavya, Amit, Dev, Ishita, Shreya, Ratan, Varun, Manan, Sumit, Roopa, Kabir, Aayan, Ashutosh, Advait, Amelia, Sophia, Anand, Tanya, Tarun, Sunny, Mani, Gokul, Vijay, Shruti, Suhani, Mohit, Kavitha, Rehan, Soham, Rupali
Use the speaker parameter to select specific voices for your use case. Each speaker has unique characteristics suitable for different applications.
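As an illustration of speaker selection, the sketch below validates a requested voice against the list above before it is placed in the speaker parameter. The helper resolve_speaker is hypothetical, and lowercasing the name is an assumption about the format the API expects; check the API reference for the exact identifiers.

```python
# Speaker names copied from the list above; Shubh is the documented default.
SPEAKERS = (
    "Shubh", "Aditya", "Ritu", "Priya", "Neha", "Rahul", "Pooja", "Rohan",
    "Simran", "Kavya", "Amit", "Dev", "Ishita", "Shreya", "Ratan", "Varun",
    "Manan", "Sumit", "Roopa", "Kabir", "Aayan", "Ashutosh", "Advait",
    "Amelia", "Sophia", "Anand", "Tanya", "Tarun", "Sunny", "Mani", "Gokul",
    "Vijay", "Shruti", "Suhani", "Mohit", "Kavitha", "Rehan", "Soham", "Rupali",
)
DEFAULT_SPEAKER = "Shubh"


def resolve_speaker(name=None):
    """Return a value for the speaker parameter, or raise ValueError.

    Hypothetical helper. Assumption: the API takes lowercase speaker names.
    """
    if name is None:
        return DEFAULT_SPEAKER.lower()

    wanted = name.strip().lower()
    known = {speaker.lower() for speaker in SPEAKERS}
    if wanted not in known:
        raise ValueError(f"unknown speaker: {name!r}")
    return wanted


# Fragment of a request body that selects a specific voice.
payload_fragment = {"speaker": resolve_speaker("Ritu")}
```

Matching case-insensitively means "Ritu", "ritu", and " RITU " all resolve to the same voice, while a misspelled name fails locally instead of at the API.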
Key Capabilities
Basic Usage
Convert text to speech with default settings. This is the simplest way to get started with Bulbul v3.
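A request with default settings might look like the sketch below, which uses only the Python standard library. Several details are assumptions and should be confirmed against the API reference: the endpoint URL, the api-subscription-key header, the JSON field names (text, target_language_code, model), the audios field in the response, and the WAV output format. Speaker, sample rate, and pace are left out so that the service defaults (Shubh, 24 kHz) apply.

```python
import base64
import json
import urllib.request

# Assumed endpoint and header name; confirm both in the API reference.
TTS_URL = "https://api.sarvam.ai/text-to-speech"
API_KEY_HEADER = "api-subscription-key"


def build_tts_request(text, api_key, language_code="hi-IN"):
    """Build (but do not send) a text-to-speech request with default settings."""
    payload = {
        "text": text,                            # up to 2500 characters
        "target_language_code": language_code,   # assumed field name
        "model": "bulbul:v3",
    }
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


def synthesize(text, api_key, out_path="output.wav"):
    """Send the request and write the first returned audio clip to out_path."""
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)

    # Assumed response shape: {"audios": ["<base64-encoded audio>", ...]}
    audio_bytes = base64.b64decode(body["audios"][0])
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return out_path
```

Calling synthesize("नमस्ते, आप कैसे हैं?", api_key="YOUR_API_KEY") would then write the generated speech to output.wav, provided the assumptions above hold.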