# Credits & Rate Limits

> Understand Sarvam AI rate limits by plan tier, per-API concurrency limits, and how to handle 429 and 503 errors gracefully. View your current limits on the dashboard.

## Credits

Sarvam offers **₹100 worth of free credits** to every user on signup. These credits can be used across any of our APIs, so you can explore, prototype, and build without upfront cost.

Credits are universal and never expire. Once exhausted, add more credits or upgrade your plan from the [Sarvam Dashboard](https://dashboard.sarvam.ai/billing).

***

## How Rate Limits Work

Rate limits restrict the number of API requests your account can make within a given time window. Key points:

* **Per-account enforcement** — limits apply to your account as a whole, not individual API keys. All keys share the same rate limit pool.
* **Continuous replenishment** — capacity refills steadily over the window period rather than resetting all at once (token bucket model).
* **Per-API granularity** — each API has its own independent rate limits. WebSocket, Vision, and LLM APIs have different limits from standard REST APIs — check your specific API below.
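
The continuous-replenishment behaviour described above can be mirrored on the client so that requests are paced before they reach the server. The following is a minimal sketch of a token bucket, not Sarvam's server-side implementation: the class name `TokenBucket`, its methods, and the choice of a burst capacity equal to one full minute of requests are all assumptions made for illustration. Because limits are shared across keys, a single bucket per API per account would be the natural unit.

```python
import time


class TokenBucket:
    """Client-side token bucket: capacity refills steadily, not all at once.

    Hypothetical helper; burst capacity equal to one minute of requests
    is an assumption, not a documented server behaviour.
    """

    def __init__(self, rate_per_minute: float, clock=time.monotonic):
        self.capacity = float(rate_per_minute)
        self.refill_per_second = rate_per_minute / 60.0
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add tokens in proportion to the time elapsed since the last check.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_available(self) -> float:
        """How long to wait before the next request can be sent."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_second
```

A caller would check `try_acquire()` before each request and, when it returns `False`, sleep for `seconds_until_available()` instead of sending a request that is likely to be rejected.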

***

## Per-API Rate Limits by Plan

Rate limits vary significantly by API type and plan. Review the limits for each API below before building your integration.
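
Whatever limits apply to your plan, an integration should expect to exceed them occasionally. The page summary names 429 and 503 as the statuses to handle; one common way to do that is to retry with exponential backoff and jitter. The sketch below is an illustration under assumptions: `call_with_backoff` and the `send` callable are hypothetical names, `send` is assumed to return a `(status_code, body)` tuple, and the delay values are arbitrary starting points rather than recommended settings.

```python
import random
import time

# Statuses treated as transient; taken from the 429 and 503 cases named on this page.
RETRYABLE_STATUSES = {429, 503}


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, jitter=random.random):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the last response back to the caller
        # Exponential backoff with full jitter: wait a random time in [0, cap).
        cap = min(max_delay, base_delay * (2 ** attempt))
        sleep(cap * jitter())
    return status, body
```

Jitter spreads retries from many workers over time, which matters here because all keys on an account draw from the same limit pool.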

### Speech to Text

#### Real-time REST (`stt-rt`)

|                | Starter    | Pro         | Business      |
| -------------- | ---------- | ----------- | ------------- |
| **Rate Limit** | 60 req/min | 100 req/min | 4,000 req/min |

#### WebSocket Streaming (`stt-ws`)

|                | Starter       | Pro            | Business       |
| -------------- | ------------- | -------------- | -------------- |
| **Rate Limit** | 20 concurrent | 100 concurrent | 100 concurrent |
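
Because WebSocket limits count concurrent connections rather than requests per minute, a client can stay under the cap by gating connections with a semaphore. This is a minimal sketch assuming an `asyncio` application; `ConnectionLimiter` and the `session` coroutine function (which would open, use, and close one WebSocket) are hypothetical names, and no particular WebSocket library is implied.

```python
import asyncio


class ConnectionLimiter:
    """Caps how many streaming sessions run at the same time (hypothetical helper)."""

    def __init__(self, max_concurrent: int):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.active = 0  # sessions currently holding a slot
        self.peak = 0    # highest concurrency observed, useful for monitoring

    async def run(self, session):
        """Wait for a free slot, then run session().

        session is a hypothetical coroutine function that opens, uses,
        and closes a single WebSocket connection.
        """
        async with self._semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await session()
            finally:
                self.active -= 1
```

With a Starter limit of 20 concurrent connections, `ConnectionLimiter(20)` would queue the 21st session until an earlier one finishes. Create the limiter inside the running event loop and share it across every worker that uses the same account.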

#### Batch (`stt-batch`)

|                | Starter    | Pro         | Business    |
| -------------- | ---------- | ----------- | ----------- |
| **Rate Limit** | 20 req/min | 100 req/min | 500 req/min |

For batch endpoints, implement a minimum **5ms delay** between consecutive status polling requests to avoid hitting rate limits unnecessarily.
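
A polling loop that enforces this minimum delay could look like the sketch below. It is illustrative only: `wait_for_job`, the `get_status` callable, and the status strings in `TERMINAL_STATES` are hypothetical placeholders, not names from the batch API. The loop clamps any shorter interval up to the 5ms floor and gives up after a timeout.

```python
import time

MIN_POLL_INTERVAL_SECONDS = 0.005  # the 5ms minimum stated above

# Placeholder terminal states; the real status values are not given on this page.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(get_status, poll_interval=1.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job reaches a terminal state.

    get_status is a hypothetical zero-argument callable returning a status string.
    """
    # Never poll faster than the documented minimum delay.
    interval = max(poll_interval, MIN_POLL_INTERVAL_SECONDS)
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(interval)
```

In practice a longer interval than the floor is usually sensible, since every status request counts against the `stt-batch` limit for the account.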

***

### Text to Speech

#### Real-time REST (`tts-rt`)

|                | Starter    | Pro         | Business      |
| -------------- | ---------- | ----------- | ------------- |
| **Rate Limit** | 60 req/min | 200 req/min | 1,000 req/min |

For the `bulbul:v3` model specifically, the Starter rate limit is **30 req/min**. Pro and Business limits are the same as the default above.

#### WebSocket Streaming (`tts-ws`)

|                | Starter       | Pro            | Business         |
| -------------- | ------------- | -------------- | ---------------- |
| **Rate Limit** | 60 concurrent | 200 concurrent | 1,000 concurrent |

For the `bulbul:v3` model specifically, the Starter rate limit is **30 concurrent**. Pro and Business limits are the same as the default above.

***

### Translation & Text Services

#### Translate (`ms-ts`)

|                | Starter    | Pro         | Business      |
| -------------- | ---------- | ----------- | ------------- |
| **Rate Limit** | 60 req/min | 200 req/min | 1,000 req/min |

***

### Chat Completion (LLM)

#### Default models (`ms-llm`)

|                | Starter    | Pro         | Business      |
| -------------- | ---------- | ----------- | ------------- |
| **Rate Limit** | 60 req/min | 200 req/min | 1,000 req/min |

#### Sarvam-30B & Sarvam-105B models

These large models have lower limits due to their compute requirements.

|                | Starter    | Pro        | Business    |
| -------------- | ---------- | ---------- | ----------- |
| **Rate Limit** | 40 req/min | 60 req/min | 120 req/min |

Applies to: `sarvam-30b`, `sarvam-105b`

***

### Vision

Vision API limits are **uniform across all plans** (Starter, Pro, and Business). Upgrading your plan does not increase Vision limits.

#### Document Intelligence (`vis-doc-dig`)

|                | Starter    | Pro        | Business   |
| -------------- | ---------- | ---------- | ---------- |
| **Rate Limit** | 10 req/min | 10 req/min | 10 req/min |

#### Vision Real-time (`vis-rt`)

|                | Starter    | Pro        | Business   |
| -------------- | ---------- | ---------- | ---------- |
| **Rate Limit** | 30 req/min | 30 req/min | 30 req/min |

***

## Plan Overview

|                   | Starter               | Pro             | Business             | Enterprise        |
| ----------------- | --------------------- | --------------- | -------------------- | ----------------- |
| **Price**         | Pay as you go         | ₹10,000         | ₹50,000              | Custom            |
| **Bonus Credits** | —                     | ₹100            | ₹7,500               | Custom            |
| **Support**       | Community             | Email           | Email                | Dedicated         |
| **Best For**      | Prototyping & testing | Startups & POCs | Production workloads | Scale deployments |

Rate limits are measured per account, not per API key. All keys under an account share the same limit pool. Your current limits are visible on the [Dashboard → Rate Limits](https://dashboard.sarvam.ai/rate-limits) page.

***

## Upgrading Your Limits

### Check your current limits

Visit the [Dashboard → Rate Limits](https://dashboard.sarvam.ai/rate-limits) to see your exact per-API limits.

### Upgrade your plan

Purchase a higher plan directly from the dashboard. Your rate limits increase immediately after upgrade — no downtime.

### Need custom limits?

For limits beyond the Business tier, dedicated infrastructure, or custom SLAs, contact our team about an Enterprise arrangement.

***

## Managing Your Credits

If your credits are exhausted, API requests will return errors. You can add credits at any time — adding credits does **not** change your plan or rate limits.

1. **Add Credits** — Top up from the [Billing page](https://dashboard.sarvam.ai/billing) at any time. Credits never expire.
2. **Upgrade Your Plan** — Higher plans include bonus credits and increased rate limits.
3. **Enterprise** — For volume discounts and custom billing arrangements, email [developer@sarvam.ai](mailto:developer@sarvam.ai?subject=Enterprise%20API%20Access%20Request\&body=Hi%20Sarvam%20Team%2C%0A%0AI'm%20reaching%20out%20from%20%5BYour%20Company%20Name%5D%20to%20explore%20enterprise%20access%20to%20Sarvam%20AI%20APIs.%0A%0ACompany%20Details%3A%0A-%20Company%20Name%3A%20%5BYour%20Company%5D%0A-%20Industry%3A%20%5BYour%20Industry%5D%0A-%20Website%3A%20%5BYour%20Website%5D%0A%0AUse%20Case%3A%0A%5BDescribe%20your%20use%20case%20and%20how%20you%20plan%20to%20use%20our%20APIs%5D%0A%0AAPIs%20of%20Interest%3A%0A%5B%5D%20Speech-to-Text%0A%5B%5D%20Text-to-Speech%0A%5B%5D%20Translation%0A%5B%5D%20Chat%20Completion%0A%5B%5D%20Other%20\(please%20specify\)%0A%0AExpected%20Usage%3A%0A-%20Estimated%20API%20calls%20per%20month%3A%20%5BNumber%5D%0A-%20Expected%20audio%20duration%2Ftext%20volume%3A%20%5BDetails%5D%0A-%20Languages%20needed%3A%20%5BList%20languages%5D%0A%0ASpecific%20Requirements%3A%0A%5BShare%20any%20specific%20features%2C%20SLAs%2C%20or%20support%20needs%5D%0A%0ATimeline%3A%0A%5BWhen%20do%20you%20plan%20to%20start%20testing%2Fdeployment%3F%5D%0A%0ALooking%20forward%20to%20discussing%20this%20further.%0A%0ABest%20regards%2C%0A%5BYour%20Name%5D%0A%5BYour%20Title%5D%0A%5BContact%20Information%5D).