Credits & Rate Limits
Credits
Sarvam offers ₹1,000 worth of free credits for every user on signup. These credits can be used across any of our APIs — explore, prototype, and build without upfront cost.
Credits are universal and never expire. Once exhausted, add more credits or upgrade your plan from the Sarvam Dashboard.
How Rate Limits Work
Rate limits restrict the number of API requests your account can make within a given time window. Key points:
- Per-account enforcement — limits apply to your account as a whole, not individual API keys. All keys share the same rate limit pool.
- Continuous replenishment — capacity refills steadily over the window period rather than resetting all at once (token bucket model). Short bursts may still trigger limits.
- Per-API granularity — each API has its own independent concurrency limits. WebSocket, Vision, and LLM APIs have different limits from standard REST APIs — check your specific API below.
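The continuous-replenishment behaviour described above can be mirrored on the client side so that requests are paced before they reach the API. The following is a minimal sketch of a token bucket, not Sarvam's server-side implementation. The class name `TokenBucket`, its parameters, and the rate and capacity values are hypothetical and should be replaced with the limits that apply to your plan. Because limits are enforced per account, a single bucket would be shared by every API key in the process.

```python
import time


class TokenBucket:
    """Client-side token bucket: capacity refills continuously,
    not all at once at a window boundary."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second (assumed value)
        self.capacity = capacity      # largest burst the bucket allows
        self.clock = clock            # injectable clock, useful for tests
        self.tokens = float(capacity)
        self.last = clock()

    def _refill(self):
        # Add tokens in proportion to elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, cost=1):
        """Spend `cost` tokens and return True if available, else False."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `bucket.try_acquire()` before each request and wait briefly when it returns `False`. This also shows why a short burst can be rejected even when the average request rate is low: the burst drains the bucket faster than it refills.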
Concurrency Modes
Each API enforces limits across three concurrency modes:
Per-API Rate Limits by Plan
Rate limits vary significantly by API type and plan. Review the limits for each API below before building your integration.
Speech to Text
Real-time REST (stt-rt)
WebSocket Streaming (stt-ws)
Batch (stt-batch)
For batch endpoints, implement a minimum 5ms delay between consecutive status polling requests to avoid hitting rate limits unnecessarily.
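One way to enforce the polling delay is a small loop that sleeps between status checks. This is a sketch under stated assumptions: `get_status` is a hypothetical callable that wraps whatever status request your integration makes, and the terminal state names `"completed"` and `"failed"` are illustrative and not taken from the API reference.

```python
import time


def poll_until_done(get_status, min_delay_s=0.005, timeout_s=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() repeatedly, pausing at least min_delay_s
    between consecutive calls, until a terminal state or timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        # Hypothetical terminal states; adjust to the real job states.
        if status in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish before timeout")
        sleep(min_delay_s)  # enforce the minimum gap between polls
```

The default of 0.005 seconds matches the 5 ms minimum stated above. For long-running batch jobs a larger interval is usually a safer choice, since it consumes less of the account's shared rate limit pool.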
Text to Speech
Real-time REST (tts-rt)
For the bulbul:v3 model specifically, the Starter provisioned limit is 30 req/min (burst: 50). Pro and Business limits are the same as the default above.
WebSocket Streaming (tts-ws)
For the bulbul:v3 model specifically, the Starter provisioned limit is 30 concurrent connections (burst: 50). Pro and Business limits are the same as the default above.
Translation & Text Services
Translate (ms-ts)
Chat Completion (LLM)
Default models (ms-llm)
Sarvam-30B & Sarvam-105B models
These large models have lower limits due to their compute requirements.
Applies to: sarvam-30b, sarvam-30b-16k, sarvam-105b, sarvam-105b-32k
Vision
Vision API limits are uniform across all plans (Starter, Pro, and Business). Upgrading your plan does not increase Vision limits.
Document Intelligence (vis-doc-dig)
Vision Real-time (vis-rt)
Plan Overview
Concurrency limits are measured per account, not per API key. All keys under an account share the same limit pool. Your current limits are visible on the Dashboard → Rate Limits page.
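Because the pool is shared across keys, it can help to cap in-flight requests in one place in your application rather than per key. The sketch below uses an `asyncio.Semaphore` for this. The helper name `run_with_limit` and the `max_concurrent` value are hypothetical, and each job stands in for whatever coroutine issues your actual API call.

```python
import asyncio


async def run_with_limit(jobs, max_concurrent):
    """Run zero-argument coroutine functions with at most
    max_concurrent of them in flight at any time."""
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # Wait for a free slot before starting the job.
        async with sem:
            return await job()

    # Results come back in the same order as the jobs were given.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Setting `max_concurrent` slightly below the limit shown on the Dashboard leaves headroom for other services that use the same account.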
Upgrading Your Limits
View plans and upgrade directly from the dashboard. Rate limits update instantly.
Need higher rate limits, dedicated infrastructure, or custom SLAs? Talk to our team.
Managing Your Credits
If your credits are exhausted, API requests will return errors. You can add credits at any time — adding credits does not change your plan or rate limits.
- Add Credits — Top up from the Billing page at any time. Credits never expire.
- Upgrade Your Plan — Higher plans include bonus credits and increased rate limits.
- Enterprise — For volume discounts and custom billing arrangements, email developer@sarvam.ai.