Models
Sarvam AI offers a comprehensive suite of models designed specifically for Indian languages and use cases. From speech processing to text generation, our models are optimized for multilingual, code-mixed, and culturally aware AI applications.
Model Selection Guide
- Saarika: High-accuracy transcription for 11 Indian languages, with real-time and batch processing.
- Saaras: Direct speech-to-English translation with domain-aware optimization for telephony and built-in language detection.
- Bulbul: Real-time, natural-sounding speech in 11 Indian languages with voice control, prosody tuning, and multi-rate audio output.
- Mayura: Advanced bidirectional translation with style options, script control, and automatic language detection for 11 Indian languages.
- Sarvam-Translate: Covers all 22 Indian languages with formal translation, smart pre-processing, and support for multiple numeral formats.
- Sarvam-M: Multilingual conversational AI with fast reasoning, code/math handling, and Wikipedia-powered knowledge for Indian languages.
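As a rough illustration of the selection guide, the sketch below pairs each task with the model this page describes for it. The task labels (`speech_to_text`, `chat`, and so on) and the `choose_model` helper are hypothetical names invented for this example; they are not identifiers from the Sarvam API.

```python
# Task-to-model lookup built only from the model descriptions on this page.
# The task keys are hypothetical labels for this sketch, not API identifiers.
MODEL_FOR_TASK = {
    "speech_to_text": "Saarika",
    "speech_to_english": "Saaras",
    "text_to_speech": "Bulbul",
    "translation_11_languages": "Mayura",
    "translation_22_languages": "Sarvam-Translate",
    "chat": "Sarvam-M",
}


def choose_model(task: str) -> str:
    """Return the model name this page suggests for a task label."""
    try:
        return MODEL_FOR_TASK[task]
    except KeyError:
        known = ", ".join(sorted(MODEL_FOR_TASK))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


if __name__ == "__main__":
    print(choose_model("speech_to_text"))
```

An unknown label raises `ValueError` rather than falling back to a default, so a typo in a task name fails loudly.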
Common Supported Languages
Hindi (hi-IN), Bengali (bn-IN), Tamil (ta-IN), Telugu (te-IN), Gujarati (gu-IN), Kannada (kn-IN), Malayalam (ml-IN), Marathi (mr-IN), Punjabi (pa-IN), Odia (od-IN), English (en-IN)
Additional Languages (Only for Sarvam Translate)
Assamese (as-IN), Bodo (brx-IN), Dogri (doi-IN), Kashmiri (ks-IN), Konkani (kok-IN), Maithili (mai-IN), Manipuri (mni-IN), Nepali (ne-IN), Sanskrit (sa-IN), Santali (sat-IN), Sindhi (sd-IN), Urdu (ur-IN)
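The two language lists can be checked in code before a request is routed. The sketch below copies the language names and codes from the lists above; the `language_tier` helper and its return labels are hypothetical names for illustration, not part of any Sarvam SDK.

```python
# Language codes copied from the two lists above.
COMMON_LANGUAGES = {
    "hi-IN": "Hindi", "bn-IN": "Bengali", "ta-IN": "Tamil", "te-IN": "Telugu",
    "gu-IN": "Gujarati", "kn-IN": "Kannada", "ml-IN": "Malayalam",
    "mr-IN": "Marathi", "pa-IN": "Punjabi", "od-IN": "Odia", "en-IN": "English",
}

SARVAM_TRANSLATE_ONLY = {
    "as-IN": "Assamese", "brx-IN": "Bodo", "doi-IN": "Dogri", "ks-IN": "Kashmiri",
    "kok-IN": "Konkani", "mai-IN": "Maithili", "mni-IN": "Manipuri",
    "ne-IN": "Nepali", "sa-IN": "Sanskrit", "sat-IN": "Santali",
    "sd-IN": "Sindhi", "ur-IN": "Urdu",
}


def language_tier(code: str) -> str:
    """Classify a language code by which list on this page it appears in.

    The return labels are hypothetical and exist only for this sketch.
    """
    if code in COMMON_LANGUAGES:
        return "common"
    if code in SARVAM_TRANSLATE_ONLY:
        return "sarvam-translate-only"
    raise ValueError(f"language code {code!r} is not listed on this page")


if __name__ == "__main__":
    for code in ("hi-IN", "ur-IN"):
        print(code, language_tier(code))
```

A code in the second tier signals that only Sarvam-Translate is documented here as covering that language.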
Model Language Support Summary
Speech Models
Saarika
High-accuracy speech-to-text for Indian languages
Speech-to-Text model for Indian languages. Convert speech to text with high accuracy for multiple Indian languages.
• 11 Indian Languages
• Real-time & Batch APIs
• Speaker Diarization
• Automatic Language Detection
Best for: Call center transcription, meeting notes, voice assistants, accessibility applications
Saaras
Domain-optimized speech-to-English translation
Speech-to-Text Translation model for Indian languages. Convert speech in Indian languages directly to English text in one step.
• Direct Speech → English
• Domain-Aware Translation
• Telephony Optimized
• Entity Preservation
Best for: Customer support, international calls, content localization, multilingual meetings
Bulbul
Natural text-to-speech with voice control
Text-to-Speech model for Indian languages. Convert text to natural-sounding speech in multiple Indian languages.
• Natural Prosody
• Voice Control (Pitch/Pace/Loudness)
• Multiple Sample Rates
• Text Preprocessing
Best for: Voice assistants, audiobooks, accessibility tools, IVR systems, content creation
Text Models
Mayura
Advanced multilingual translation (11 languages)
Translation model for Indian languages. Translate text between different Indian languages with high accuracy.
• Bidirectional Translation
• Multiple Translation Styles
• Script Control
• Code-Mixed Support
Best for: Content localization, social media, customer communication, educational content
Sarvam-Translate
Comprehensive translation for all 22 official Indian languages
Translation model that supports all 22 official Indian languages and is optimized for structured, long-form text.
• 22 Official Indian Languages
• Structured Text Optimization
• Long-form Content
• Government & Legal Documents
Best for: Government documents, legal translation, comprehensive localization, official communications
Chat Models
Sarvam-M
Multilingual chat model with reasoning capabilities
Text-only chat completion model with open weights and 24B parameters, supporting multilingual use and hybrid reasoning.
• 24B Parameters
• Hybrid Reasoning (Think/Non-Think)
• 8192 Token Context
• Wikipedia Grounding
Best for: Conversational AI, customer support, educational applications, reasoning tasks
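Because the context window is limited to 8192 tokens, long conversations need trimming before they are sent. The sketch below keeps the most recent messages that fit a budget. It is a minimal illustration under stated assumptions: `estimate_tokens` counts whitespace-separated words, which is only a crude stand-in for the model's real tokenizer (and will likely undercount for Indic scripts), and the `reserve` default is an arbitrary value chosen for the example.

```python
from typing import Dict, List

CONTEXT_TOKENS = 8192  # context size stated in the Sarvam-M feature list


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: whitespace-separated words.

    Assumption for this sketch only; the model's real tokenizer will differ.
    """
    return len(text.split())


def trim_history(
    messages: List[Dict[str, str]],
    budget: int = CONTEXT_TOKENS,
    reserve: int = 1024,  # hypothetical allowance held back for the reply
) -> List[Dict[str, str]]:
    """Keep the newest messages whose estimated size fits budget - reserve."""
    limit = budget - reserve
    if limit <= 0:
        raise ValueError("reserve must be smaller than budget")

    kept: List[Dict[str, str]] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that overflows.
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if used + cost > limit:
            break
        kept.append(message)
        used += cost

    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is one simple policy; summarizing older turns is another option when early context matters.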
Use Case Examples
Voice Assistant
Build a multilingual voice assistant
- Speech Input: Use Saarika to convert user speech to text
- Understanding: Process with Sarvam-M for intelligent responses
- Speech Output: Convert responses to speech with Bulbul
Perfect for customer service, smart home devices, and accessibility applications.
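The three steps above can be sketched as a small pipeline. This example deliberately makes no network calls: `transcribe`, `respond`, and `synthesize` are hypothetical callables standing in for Saarika, Sarvam-M, and Bulbul, and the stub functions exist only so the flow can be run locally. Real integrations would replace the stubs with calls to the respective APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceAssistant:
    """Chains speech input, text understanding, and speech output."""

    transcribe: Callable[[bytes], str]   # speech -> text (Saarika's role)
    respond: Callable[[str], str]        # text -> reply text (Sarvam-M's role)
    synthesize: Callable[[str], bytes]   # reply text -> audio (Bulbul's role)

    def handle(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        reply = self.respond(text)
        return self.synthesize(reply)


# Stub stages for local experimentation; they are placeholders, not API clients.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


assistant = VoiceAssistant(fake_transcribe, fake_respond, fake_synthesize)

if __name__ == "__main__":
    output = assistant.handle("नमस्ते".encode("utf-8"))
    print(output.decode("utf-8"))
```

Keeping each stage as an injected callable means any one model can be swapped or mocked in tests without touching the other two.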
Next Steps
- Get your API key and make your first request in 5 minutes
- Step-by-step guides with code examples and use cases
- Monitor usage, manage API keys, and view analytics
Need help choosing? Our models are designed to work together, and most applications benefit from combining several of them. For example, using Saarika for speech input, Sarvam-M for processing, and Bulbul for speech output creates a complete conversational AI system.