Frequently Asked Questions
Find answers to common questions about our speech-to-text services
REST and Batch APIs support a wide range of audio formats including:
WebSocket/Streaming APIs support only:
For optimal results, we recommend:
Our models support multiple Indian and global languages:
Check our models page for the complete list and specific model capabilities.
The limits vary by API endpoint:
For longer audio files, we recommend:
Accuracy varies based on several factors:
Factors affecting accuracy:
Use our playground to test with your specific audio.
Speaker diarization identifies and labels different speakers in the audio:
Process:
Usage (via Batch API):
Rate limits are applied per account based on your subscription plan:
For batch endpoints, implement a minimum 5ms delay between status polling requests.
View the full Credits & Rate Limits page for details on HTTP headers, error handling, and upgrade paths.
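The polling guidance above can be sketched as a small helper that never polls faster than the stated minimum. This is only an illustration: `get_status` is a hypothetical callable standing in for whatever request your client makes to check a batch job, and the state names in `TERMINAL_STATES` are assumptions, not values confirmed by the API.

```python
import time
from typing import Callable

# 5 ms, the minimum delay between status polls stated above.
MIN_POLL_INTERVAL_S = 0.005

# Hypothetical terminal state names; substitute the values your job status returns.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    get_status: Callable[[], str],
    interval_s: float = 1.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns a terminal state or the timeout passes."""
    # Clamp the interval so callers cannot poll faster than the minimum.
    interval_s = max(interval_s, MIN_POLL_INTERVAL_S)
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

In practice a much longer interval than the minimum (for example, a second or more) is usually kinder to your rate limit; the clamp only guards against accidentally tight loops.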
Common errors and solutions:
Solution: Check that your API key is valid and correctly configured. Note: Sarvam returns HTTP 403 (not 401) for invalid or missing API keys; see the Authentication page.
Solution: Implement exponential backoff or upgrade your plan.
Solution: Check the supported formats and requirements.
See our error handling guide for more details.
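As a rough illustration of the solutions above, the following sketch retries rate-limited requests with exponential backoff and fails fast on an authentication error. It makes several assumptions: `send` is a hypothetical callable that performs one request and returns `(status_code, body)`, rate limiting is assumed to be signalled with HTTP 429, and `AuthError` is a name introduced here for the example. Only the 403 behavior comes from the note above.

```python
import random
import time
from typing import Callable, Tuple


class AuthError(Exception):
    """Raised when the server rejects the API key (hypothetical helper type)."""


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay_s: float = 0.5,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry send() on rate limiting, doubling the wait after each attempt."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 403:
            # Invalid or missing API key: retrying will not help.
            raise AuthError("API key rejected (HTTP 403)")
        if status != 429:  # assumed rate-limit status code
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        delay = min(max_delay_s, base_delay_s * 2 ** attempt)
        # Add up to 10% random jitter so concurrent clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in production the default `time.sleep` applies.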
Tips for optimal real-time performance:
View our real-time guide for detailed examples.
Usage is calculated based on:
Example calculation:
Multiple support channels are available: