# Bulbul

> Bulbul v3 - High-quality multilingual text-to-speech model for Indian languages with natural prosody and 30+ speaker voices.

Bulbul v3 is our latest text-to-speech model, designed specifically for Indian languages and accents. It offers improved audio quality and 30+ speaker voices, and supports up to 2,500 characters per request.

## Key Features

* Wide selection of natural-sounding voices including Shubh, Aditya, Ritu, Simran, Anand, Roopa, Priya, and more.
* Support for up to 2,500 characters per request for longer content generation.
* Multiple sample rates: 8kHz, 16kHz, 22.05kHz, 24kHz (default). Higher rates (32kHz, 44.1kHz, 48kHz) available in **bulbul:v3 REST API only**.
* Support for 11 languages (10 Indian + English) with BCP-47 codes. The target language code is primarily used by the pre-TTS text normalization model.
* Human-like speech patterns with natural intonation and emotional expression.
* Adjustable speech speed from 0.5x to 2.0x for customized delivery.
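
Because each request accepts at most 2,500 characters, longer content has to be sent in several requests. The sketch below shows one possible way to split text at sentence boundaries before calling the API. `split_text` is a hypothetical helper, not part of the SDK, and it assumes the limit is counted the same way as Python's `len()` (Unicode code points), which may differ from how the service counts characters.

```python
import re

MAX_CHARS = 2500  # per-request limit stated above


def split_text(text: str, max_chars: int = MAX_CHARS) -> list:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Treat '.', '!', '?' and the danda '।' followed by whitespace as sentence ends.
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed as `text` in its own request, and the resulting audio segments concatenated in order.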

## Language Support

Bulbul v3 supports the following 11 languages (10 Indian languages plus English), identified by BCP-47 codes:

Hindi (`hi-IN`), Bengali (`bn-IN`), Tamil (`ta-IN`), Telugu (`te-IN`), Gujarati (`gu-IN`), Kannada (`kn-IN`), Malayalam (`ml-IN`), Marathi (`mr-IN`), Punjabi (`pa-IN`), Odia (`od-IN`), English (`en-IN`)
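
A request with an unsupported `target_language_code` is easier to catch on the client side than after a round trip. As a minimal sketch, the codes listed above can be kept in a lookup table and checked before sending. `resolve_language_code` is a hypothetical helper, and its case-normalization behaviour is an assumption made for convenience, not something the API is documented to do.

```python
# Language codes copied from the list above.
SUPPORTED_LANGUAGES = {
    "hi-IN": "Hindi",
    "bn-IN": "Bengali",
    "ta-IN": "Tamil",
    "te-IN": "Telugu",
    "gu-IN": "Gujarati",
    "kn-IN": "Kannada",
    "ml-IN": "Malayalam",
    "mr-IN": "Marathi",
    "pa-IN": "Punjabi",
    "od-IN": "Odia",
    "en-IN": "English",
}


def resolve_language_code(code: str) -> str:
    """Normalize case (e.g. 'HI-in' -> 'hi-IN') and reject unsupported codes."""
    language, _, region = code.strip().partition("-")
    candidate = f"{language.lower()}-{region.upper()}"
    if candidate not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported target_language_code: {code!r}")
    return candidate
```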

## Available Speakers

Bulbul v3 offers 30+ speaker voices:

**Speakers:** Shubh (default), Aditya, Ritu, Priya, Neha, Rahul, Pooja, Rohan, Simran, Kavya, Amit, Dev, Ishita, Shreya, Ratan, Varun, Manan, Sumit, Roopa, Kabir, Aayan, Ashutosh, Advait, Anand, Tanya, Tarun, Sunny, Mani, Gokul, Vijay, Shruti, Suhani, Mohit, Kavitha, Rehan, Soham, Rupali

Use the `speaker` parameter to select specific voices for your use case. Each speaker has unique characteristics suitable for different applications.
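
The request examples on this page pass speaker names in lowercase (for example `"shubh"` and `"priya"`), while the list above shows display names. The following sketch maps a display name to a request value and falls back to the default speaker. `resolve_speaker` is a hypothetical helper, and the assumption that every identifier is simply the lowercase display name is inferred from those examples rather than stated by the API.

```python
from typing import Optional

# Lowercase forms of the speaker names listed above (assumed request values).
SPEAKERS = (
    "shubh", "aditya", "ritu", "priya", "neha", "rahul", "pooja", "rohan",
    "simran", "kavya", "amit", "dev", "ishita", "shreya", "ratan", "varun",
    "manan", "sumit", "roopa", "kabir", "aayan", "ashutosh", "advait",
    "anand", "tanya", "tarun", "sunny", "mani", "gokul", "vijay", "shruti",
    "suhani", "mohit", "kavitha", "rehan", "soham", "rupali",
)
DEFAULT_SPEAKER = "shubh"


def resolve_speaker(name: Optional[str]) -> str:
    """Return a request-ready speaker value, or the default when none is given."""
    if name is None or not name.strip():
        return DEFAULT_SPEAKER
    candidate = name.strip().lower()
    if candidate not in SPEAKERS:
        raise ValueError(f"unknown speaker: {name!r}")
    return candidate
```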

## Key Capabilities

Convert text to speech with default settings. This is the simplest way to get started with Bulbul v3.

```python
from sarvamai import SarvamAI
from sarvamai.play import play, save

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY"
)

response = client.text_to_speech.convert(
    text="Hello, how are you today?",
    target_language_code="en-IN",
    model="bulbul:v3"
)

# Play the audio
play(response)

# Save the response to a file
save(response, "output.wav")
```

```javascript
import { SarvamAIClient } from "sarvamai";

const client = new SarvamAIClient({
  apiSubscriptionKey: 'YOUR_SARVAM_API_KEY'
});

async function main() {
  const response = await client.textToSpeech.convert({
    text: 'Hello, how are you today?',
    target_language_code: 'en-IN',
    model: 'bulbul:v3'
  });

  console.log(response);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/text-to-speech \
  -H "api-subscription-key: <YOUR_SARVAM_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to Sarvam AI!",
    "target_language_code": "en-IN",
    "model": "bulbul:v3",
    "speaker": "shubh"
  }'
```

Choose from 30+ speaker voices to find the perfect voice for your application.

```python
from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY"
)

# Select a specific speaker
response = client.text_to_speech.convert(
    text="Hello, how are you today?",
    target_language_code="en-IN",
    model="bulbul:v3",
    speaker="priya"  # Options: aditya, ritu, priya, neha, rahul, pooja, rohan, simran, kavya, amit, dev, ishita, shreya, ratan, varun, manan, sumit, roopa, kabir, aayan, shubh, ashutosh, advait, anand, tanya, tarun, sunny, mani, gokul, vijay, shruti, suhani, mohit, kavitha, rehan, soham, rupali
)

print(response)
```

```javascript
import { SarvamAIClient } from "sarvamai";

const client = new SarvamAIClient({
  apiSubscriptionKey: 'YOUR_SARVAM_API_KEY'
});

async function main() {
  // Select a specific speaker
  const response = await client.textToSpeech.convert({
    text: 'Hello, how are you today?',
    target_language_code: 'en-IN',
    model: 'bulbul:v3',
    speaker: 'priya'
  });

  console.log(response);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/text-to-speech \
  -H "api-subscription-key: <YOUR_SARVAM_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to Sarvam AI!",
    "target_language_code": "en-IN",
    "model": "bulbul:v3",
    "speaker": "priya"
  }'
```

Choose the audio quality that best fits your needs. The default sample rate for v3 is 24kHz (premium quality).

```python
from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY"
)

# Control audio quality with sample rate
response = client.text_to_speech.convert(
    text="Hello, how are you today?",
    target_language_code="en-IN",
    model="bulbul:v3",
    speech_sample_rate=48000  # Options: 8000, 16000, 22050, 24000. v3 REST API only: 32000, 44100, 48000
)

print(response)
```

```javascript
import { SarvamAIClient } from "sarvamai";

const client = new SarvamAIClient({
  apiSubscriptionKey: 'YOUR_SARVAM_API_KEY'
});

async function main() {
  // Control audio quality with sample rate
  const response = await client.textToSpeech.convert({
    text: 'Hello, how are you today?',
    target_language_code: 'en-IN',
    model: 'bulbul:v3',
    speech_sample_rate: 48000  // Options: 8000, 16000, 22050, 24000. v3 REST API only: 32000, 44100, 48000
  });

  console.log(response);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/text-to-speech \
  -H "api-subscription-key: <YOUR_SARVAM_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to Sarvam AI!",
    "target_language_code": "en-IN",
    "model": "bulbul:v3",
    "speech_sample_rate": 48000
  }'
```

Sample rate options:

* 8000 Hz: Basic telephony quality
* 16000 Hz: Good quality voice
* 22050 Hz: High-quality audio
* 24000 Hz: Premium audio quality (default for v3)
* 32000 Hz: Broadcast quality (**bulbul:v3 REST API only**)
* 44100 Hz: CD quality audio (**bulbul:v3 REST API only**)
* 48000 Hz: Professional/Studio quality (**bulbul:v3 REST API only**)

**Note:** Sample rates above 24000 Hz (32 kHz, 44.1 kHz, 48 kHz) are available **only with bulbul:v3** via the **REST API**, not in streaming mode.
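
If the same code path serves both REST and streaming requests, it can help to clamp the requested rate to what the transport accepts. This is a minimal sketch of that idea: `pick_sample_rate` and its `streaming` flag are hypothetical, and the rule that streaming tops out at 24000 Hz is taken from the note above, so confirm it against the API reference for the transport you use.

```python
BASE_RATES = (8000, 16000, 22050, 24000)
HIGH_RATES = (32000, 44100, 48000)  # bulbul:v3 only, per the note above


def pick_sample_rate(requested: int, streaming: bool) -> int:
    """Return the requested rate if allowed, else the closest allowed rate below it."""
    allowed = BASE_RATES if streaming else BASE_RATES + HIGH_RATES
    if requested in allowed:
        return requested
    lower = [rate for rate in allowed if rate <= requested]
    # Fall back to the lowest supported rate when the request is below all of them.
    return max(lower) if lower else min(allowed)
```

For example, `pick_sample_rate(48000, streaming=True)` falls back to 24000, while the same call with `streaming=False` keeps 48000.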

## Limits

| Limit                             | Value                                                                                                                     |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| Max characters per request (REST) | 2,500                                                                                                                     |
| `pace`                            | 0.5–2.0 (`bulbul:v3`) / 0.3–3.0 (`bulbul:v2`)                                                                             |
| `pitch`                           | -1.0 to 1.0; suitable range -0.75 to 0.75 (`bulbul:v2` only)                                                              |
| `loudness`                        | 0.1–3.0 (`bulbul:v2` only)                                                                                                |
| `speech_sample_rate`              | 8000 / 16000 / 22050 / 24000 Hz; plus 32000 / 44100 / 48000 Hz (REST and WebSocket only). Default: 24000 (v3), 22050 (v2) |
| Rate limits                       | See [Rate Limits](/api-reference-docs/ratelimits)                                                                         |
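
The ranges in the table differ by model, so a small client-side check can reject out-of-range values before a request is sent. The sketch below builds a request body using the field names from this page. `build_tts_payload` is a hypothetical helper, and raising on `pitch` or `loudness` with `bulbul:v3` is a design choice made here, not documented server behaviour.

```python
from typing import Optional

# Ranges copied from the Limits table above.
PACE_RANGE = {"bulbul:v3": (0.5, 2.0), "bulbul:v2": (0.3, 3.0)}
V2_ONLY_RANGE = {"pitch": (-1.0, 1.0), "loudness": (0.1, 3.0)}


def build_tts_payload(
    text: str,
    target_language_code: str,
    model: str = "bulbul:v3",
    pace: Optional[float] = None,
    pitch: Optional[float] = None,
    loudness: Optional[float] = None,
) -> dict:
    """Validate tuning parameters against the table and return a request body."""
    if model not in PACE_RANGE:
        raise ValueError(f"unknown model: {model!r}")
    payload = {
        "text": text,
        "target_language_code": target_language_code,
        "model": model,
    }
    if pace is not None:
        low, high = PACE_RANGE[model]
        if not low <= pace <= high:
            raise ValueError(f"pace must be within {low}-{high} for {model}")
        payload["pace"] = pace
    for name, value in (("pitch", pitch), ("loudness", loudness)):
        if value is None:
            continue
        if model != "bulbul:v2":
            raise ValueError(f"{name} is only available with bulbul:v2")
        low, high = V2_ONLY_RANGE[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be within {low}-{high}")
        payload[name] = value
    return payload
```

The returned dictionary can be serialized as the JSON body of a `POST` to the endpoint shown in the curl examples.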

## Known Limitations

| Limitation                                            | Detail                                                                                                                          | Workaround                                                                        |
| ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| **No SSML support**                                   | Bulbul does not support SSML tags for fine-grained prosody control                                                              | Use `pace` for coarse control; split text at natural pause points                 |
| **Romanised Indic input degrades quality**            | Transliterated input (e.g., `"Aapka order confirm ho gaya hai"`) significantly reduces output quality                           | Always use native script for Indic words (e.g., `"आपका order confirm हो गया है"`) |
| **High sample rates not available on HTTP streaming** | 32 kHz, 44.1 kHz, and 48 kHz are available via the REST and WebSocket APIs with `bulbul:v3`; HTTP streaming is capped at 24 kHz | Use ≤ 24 kHz for HTTP streaming                                                   |
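
Romanised input is easy to send by accident, so a rough pre-flight check can flag it. The heuristic below is only a sketch: `looks_romanised` is a hypothetical helper that reports text containing letters but no characters from the Unicode blocks for the supported Indic scripts (U+0900 to U+0D7F, Devanagari through Malayalam). It will also flag a purely English sentence sent with an Indic language code, which may be acceptable in your application.

```python
def has_indic_script(text: str) -> bool:
    """True if any character falls in the Devanagari..Malayalam Unicode blocks."""
    return any(0x0900 <= ord(ch) <= 0x0D7F for ch in text)


def looks_romanised(text: str, target_language_code: str) -> bool:
    """Heuristic: Indic target language, but the text has no native-script characters."""
    if target_language_code == "en-IN":
        return False
    has_letters = any(ch.isalpha() for ch in text)
    return has_letters and not has_indic_script(text)
```

With the examples from the table, `"Aapka order confirm ho gaya hai"` is flagged for `hi-IN`, while the code-mixed `"आपका order confirm हो गया है"` is not.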

## Next Steps

Learn how to integrate the Bulbul v3 API within your application.

Complete API documentation for text to speech endpoints.

Step-by-step tutorial for text-to-speech implementation.