Sarvam-M

🧠 Sarvam-M (Reasoning LLM)

Multilingual, hybrid-reasoning, text-only model built on Mistral-Small.

Post-trained for superior reasoning and Indic language support.

Performance Improvements:

  • +20% on Indian language benchmarks
  • +21.6% on math benchmarks
  • +17.6% on programming benchmarks
  • +86% on romanized Indian language GSM-8K benchmarks

Key Features:

  • Hybrid Thinking Mode: Switch between "think" (reasoning, coding, math) and "non-think" (fast conversations).
  • Advanced Indic Skills: Authentically trained on Indian languages & cultural contexts.
  • Superior Reasoning: Outperforms similar-sized models on coding & math.
  • Seamless Chat: Works across Indic scripts & romanized text.

Key Features

Strong Indian Language Support

Trained on 11 major Indic languages, with support for native-script, romanized, and code-mixed inputs, and tailored for everyday and formal Indian use cases.

Hybrid Reasoning Model

Supports both "think" and "non-think" modes, excelling in math, logic, and code-related tasks with special training for improved reasoning and direct answers.

Efficient and Fast Inference

Uses compression to make responses faster, works well even on lower-cost hardware setups, and can handle many users at once without slowing down.

Knowledge Augmentation with Wikipedia

Looks up facts from Wikipedia when needed, gives more accurate answers for current or detailed topics, and works across English and Indian languages.

Superior Performance

Outperforms leading models including Mistral 3 Small, Gemma 3, and Llama models across Indian language benchmarks.

Context-Aware Processing

Maintains context across long conversations with 8192 token context length and intelligent reasoning capabilities.
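As a rough illustration of working within the 8192-token context length, the sketch below keeps only the most recent messages that fit a token budget. The `trim_history` helper and the `rough_token_count` estimate (about one token per four characters) are hypothetical and are not part of the Sarvam SDK; a real tokenizer will count differently, especially for Indic scripts.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

MAX_CONTEXT_TOKENS = 8192  # context length stated in this document


def rough_token_count(text: str) -> int:
    """Hypothetical estimate: about one token per 4 characters.

    This is NOT the model's tokenizer; swap in a real counter if available.
    """
    return max(1, len(text) // 4)


def trim_history(
    messages: List[Message],
    reserve_for_reply: int = 1000,
    count_tokens: Callable[[str], int] = rough_token_count,
    limit: int = MAX_CONTEXT_TOKENS,
) -> List[Message]:
    """Keep the newest messages whose estimated size fits the budget."""
    budget = limit - reserve_for_reply
    kept: List[Message] = []
    used = 0
    # Walk backwards so the most recent turns are considered first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = [{"role": "user", "content": "x" * 4000} for _ in range(10)]
    print(len(trim_history(history)))
```

Reserving part of the budget for the reply (here 1000 tokens, matching a typical `max_tokens` setting) is a design choice; if even the newest message exceeds the budget, this sketch returns an empty list, so a caller may want to truncate that message instead.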

Performance Benchmarks

Indic Vibe Check Benchmark

Language   Sarvam-M (24B)  Mistral 3 Small (24B)  Gemma 3 (27B)  Llama 4 Scout (17B/109B)  Llama 3 (70B)
Bengali    8.17            7.62                   7.29           7.59                      7.01
English    8.35            8.32                   7.85           8.17                      8.20
Gujarati   8.21            7.53                   7.52           7.67                      6.74
Hindi      8.30            8.10                   7.82           7.69                      7.53
Kannada    7.98            7.53                   7.53           7.68                      6.59
Malayalam  8.19            7.50                   7.46           7.68                      6.96
Marathi    8.17            7.38                   7.48           7.97                      7.12
Oriya      7.82            3.43                   6.52           6.46                      5.68
Punjabi    8.15            7.49                   7.48           7.63                      6.96
Tamil      7.92            7.40                   7.55           7.30                      6.56
Telugu     8.05            7.39                   6.95           7.52                      6.87
Average    8.12            7.24                   7.40           7.58                      6.93
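
The Average row appears to be the plain mean of the eleven per-language scores in each column. A minimal check, using the values from the table above (the `scores` dictionary and `column_mean` helper are just illustrative names):

```python
# Per-language scores copied from the table, in row order (Bengali ... Telugu).
scores = {
    "Sarvam-M (24B)": [8.17, 8.35, 8.21, 8.30, 7.98, 8.19, 8.17, 7.82, 8.15, 7.92, 8.05],
    "Mistral 3 Small (24B)": [7.62, 8.32, 7.53, 8.10, 7.53, 7.50, 7.38, 3.43, 7.49, 7.40, 7.39],
    "Gemma 3 (27B)": [7.29, 7.85, 7.52, 7.82, 7.53, 7.46, 7.48, 6.52, 7.48, 7.55, 6.95],
    "Llama 4 Scout (17B/109B)": [7.59, 8.17, 7.67, 7.69, 7.68, 7.68, 7.97, 6.46, 7.63, 7.30, 7.52],
    "Llama 3 (70B)": [7.01, 8.20, 6.74, 7.53, 6.59, 6.96, 7.12, 5.68, 6.96, 6.56, 6.87],
}


def column_mean(values):
    """Arithmetic mean of one model's scores, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {model: column_mean(values) for model, values in scores.items()}

if __name__ == "__main__":
    for model, average in averages.items():
        print(f"{model}: {average:.2f}")
```

Each computed mean matches the table's Average row, and Sarvam-M's margin over the next-best column is about 0.54 points.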

Model Specifications

Key Considerations

  • Maximum context length: 8192 tokens
  • Temperature range: 0 to 2
    • Non-thinking mode: 0.2 (recommended)
    • Thinking mode: 0.5 (recommended)
  • Top-p range: 0 to 1
  • Reasoning effort options: low, medium, high
    • Setting any value enables thinking mode
    • Higher values increase reasoning depth
  • Enable wiki_grounding for factual queries
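
The considerations above can be captured in a small helper that validates the ranges and picks the recommended temperature for each mode. This is a sketch under assumptions: `build_chat_params` is a hypothetical helper, and the keyword names `reasoning_effort` and `wiki_grounding` are taken from the list above rather than from a confirmed SDK signature.

```python
from typing import Any, Dict, Optional

# Recommended temperatures and allowed effort levels from the list above.
RECOMMENDED_TEMPERATURE = {"non_think": 0.2, "think": 0.5}
REASONING_EFFORTS = ("low", "medium", "high")


def build_chat_params(
    prompt: str,
    reasoning_effort: Optional[str] = None,
    temperature: Optional[float] = None,
    top_p: float = 1.0,
    wiki_grounding: bool = False,
) -> Dict[str, Any]:
    """Build request parameters (hypothetical helper, not part of the SDK)."""
    if reasoning_effort is not None and reasoning_effort not in REASONING_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {REASONING_EFFORTS}")

    # Setting any reasoning effort enables thinking mode.
    thinking = reasoning_effort is not None
    if temperature is None:
        temperature = RECOMMENDED_TEMPERATURE["think" if thinking else "non_think"]

    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    params: Dict[str, Any] = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
    }
    if thinking:
        params["reasoning_effort"] = reasoning_effort  # assumed keyword name
    if wiki_grounding:
        params["wiki_grounding"] = True  # assumed keyword name
    return params


if __name__ == "__main__":
    print(build_chat_params("Solve 17 * 23 step by step.", reasoning_effort="high"))
```

If the client accepts these keyword names, the result could be passed along as `client.chat.completions(**params)`; otherwise the dictionary keys would need to be adapted to the actual signature.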

Key Capabilities

Simple, one-turn interaction where the user asks a question and the model replies with a single, direct response.

from sarvamai import SarvamAI

# Create a client with your API subscription key.
client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY",
)

# Send a single user message and wait for one reply.
response = client.chat.completions(
    messages=[
        {"role": "user", "content": "Why is India called a land of diverse landscapes?"}
    ],
    temperature=0.5,
    top_p=1,
    max_tokens=1000,
)

print(response)

Next Steps