# Sarvam-M (Deprecated)

> Sarvam-M - 24B parameter multilingual, hybrid-reasoning language model with 20% improvement on Indian language benchmarks and Wikipedia grounding support.

**Deprecated Model:** Sarvam-M has been deprecated and is no longer available through the Chat Completions API. Please migrate to [**Sarvam-30B**](/api-reference-docs/getting-started/models/sarvam-30b) or [**Sarvam-105B**](/api-reference-docs/getting-started/models/sarvam-105b) for improved performance across all tasks. The information below is retained for reference only.

**Sarvam-M (Reasoning LLM)**

Multilingual, hybrid-reasoning, text-only model built on Mistral-Small.

Post-trained for superior reasoning and Indic language support.

**Performance Improvements:**

* **+20%** on Indian language benchmarks
* **+21.6%** on math benchmarks
* **+17.6%** on programming benchmarks
* **+86%** on romanized Indian language GSM-8K benchmarks

**Key Features:**

* **Hybrid Thinking Mode:** Switch between "think" (reasoning, coding, math) and "non-think" (fast conversations).
* **Advanced Indic Skills:** Authentically trained on Indian languages & cultural contexts.
* **Superior Reasoning:** Outperforms similar-sized models on coding & math.
* **Seamless Chat:** Works across Indic scripts & romanized text.

## Key Features

Trained on 11 major Indic languages, with support for native-script, romanized, and code-mixed inputs, and tailored to everyday and formal Indian use cases.

Supports both "think" and "non-think" modes, excelling in math, logic, and code-related tasks with special training for improved reasoning and direct answers.

Uses compression to make responses faster, works well even on lower-cost hardware setups, and can handle many users at once without slowing down.

Looks up facts from Wikipedia when needed, gives more accurate answers for current or detailed topics, and works across English and Indian languages.

Outperforms leading models, including Mistral Small 3, Gemma 3, and Llama models, across Indian language benchmarks.

Maintains context across long conversations with an 8192-token context length and intelligent reasoning capabilities.
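A long conversation eventually exceeds the 8192-token window, so older turns have to be dropped before sending. The sketch below is one minimal way to do that. `estimate_tokens` and `trim_history` are hypothetical helpers, and the 4-characters-per-token ratio is a rough assumption rather than the model's real tokenizer (Indic scripts in particular may tokenize quite differently).

```python
MAX_CONTEXT_TOKENS = 8192  # context length stated on this page


def estimate_tokens(text: str) -> int:
    """Crude estimate. ASSUMPTION: ~4 characters per token, not the real tokenizer."""
    return max(1, len(text) // 4)


def trim_history(messages, reserved_for_reply=1000, limit=MAX_CONTEXT_TOKENS):
    """Drop the oldest non-system turns until the estimated prompt fits the budget."""
    budget = limit - reserved_for_reply
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    used = sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the most recent turns are kept first.
    for message in reversed(turns):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    # System messages go first, followed by the surviving turns in order.
    return system + list(reversed(kept))


history = [
    {"role": "system", "content": "You are a travel expert."},
    {"role": "user", "content": "x" * 40000},  # an oversized early turn
    {"role": "assistant", "content": "Munnar in Kerala."},
    {"role": "user", "content": "What is the best time to visit Munnar?"},
]
print([m["role"] for m in trim_history(history)])
```

The `reserved_for_reply` default mirrors the `max_tokens=1000` used in the examples on this page, so the prompt and the reply together stay within the window.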

## Learn More

For detailed information on performance benchmarks and capabilities, visit [our blog](https://www.sarvam.ai/blogs/sarvam-m).

## Model Specifications

<ul>
  <li>
    Maximum context length: 8192 tokens
  </li>

  <li>
    Temperature range: 0 to 2

    <ul>
      <li>
        Non-thinking mode: 0.2 (recommended)
      </li>

      <li>
        Thinking mode: 0.5 (recommended)
      </li>
    </ul>
  </li>

  <li>
    Top-p range: 0 to 1
  </li>

  <li>
    Reasoning effort options: low, medium, high

    <ul>
      <li>
        Setting any value enables thinking mode
      </li>

      <li>
        Higher values increase reasoning depth
      </li>
    </ul>
  </li>

  <li>
    Enable wiki_grounding for factual queries
  </li>
</ul>
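
The settings above can be gathered into a small helper that validates the ranges and applies the recommended temperature for each mode. This is a minimal sketch: `build_chat_params` is a hypothetical function, and the keyword spelling `reasoning_effort` is an assumption, since this page names the option but not the exact parameter.

```python
from typing import Any, Optional

# Recommended temperatures from the specifications above.
RECOMMENDED_TEMPERATURE = {"non_thinking": 0.2, "thinking": 0.5}
REASONING_EFFORTS = ("low", "medium", "high")


def build_chat_params(
    messages: list[dict[str, str]],
    reasoning_effort: Optional[str] = None,  # ASSUMPTION: parameter name
    temperature: Optional[float] = None,
    top_p: float = 1.0,
    wiki_grounding: bool = False,
) -> dict[str, Any]:
    """Validate sampling settings and return keyword arguments for a chat request."""
    if reasoning_effort is not None and reasoning_effort not in REASONING_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {REASONING_EFFORTS}")
    # Setting any reasoning effort enables thinking mode.
    mode = "thinking" if reasoning_effort is not None else "non_thinking"
    if temperature is None:
        temperature = RECOMMENDED_TEMPERATURE[mode]
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    params: dict[str, Any] = {
        "model": "sarvam-m",
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
    }
    if reasoning_effort is not None:
        params["reasoning_effort"] = reasoning_effort
    if wiki_grounding:
        params["wiki_grounding"] = True
    return params


params = build_chat_params(
    [{"role": "user", "content": "What is 17 * 23?"}],
    reasoning_effort="medium",
)
print(params["temperature"], params["reasoning_effort"])
```

The returned dictionary could then be unpacked into the SDK call, for example `client.chat.completions(**params)`, assuming the SDK accepts these keyword names.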

## Key Capabilities

Single-turn chat: a simple, one-turn interaction in which the user asks a question and the model replies with a single, direct response.

```python
from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY",
)

response = client.chat.completions(
    model="sarvam-m",
    messages=[
        {"role": "user", "content": "Why is India called a land of diverse landscapes?"}
    ],
    temperature=0.5,
    top_p=1,
    max_tokens=1000,
)

# Print only the assistant's reply rather than the full response object
print(response.choices[0].message.content)
```

```javascript
import { SarvamAIClient } from "sarvamai";

// Initialize the SarvamAI client with your API key
const client = new SarvamAIClient({
    apiSubscriptionKey: "YOUR_SARVAM_API_KEY",
});

async function main() {
    const response = await client.chat.completions({
        model: "sarvam-m",
        messages: [
            {
                role: "user",
                content: "Why is India called a land of diverse landscapes?",
            },
        ],
        temperature: 0.5,
        top_p: 1,
        max_tokens: 1000,
    });

    // Log the assistant's reply
    console.log(response.choices[0].message.content);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/v1/chat/completions \
  -H "Authorization: Bearer $SARVAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Why is India called a land of diverse landscapes?"}
    ],
    "model": "sarvam-m",
    "temperature": 0.5,
    "top_p": 1,
    "max_tokens": 1000
  }'
```

Multi-turn chat: multiple exchanges between the system, user, and assistant, with context maintained across all turns for coherent and relevant responses.

```python
from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY",
)

response = client.chat.completions(
    model="sarvam-m",
    messages=[
        {"role": "system", "content": "You are a travel expert specializing in Indian destinations."},
        {"role": "user", "content": "Suggest a good place to visit in South India."},
        {"role": "assistant", "content": "You can visit Munnar in Kerala. It's known for its tea plantations and cool climate."},
        {"role": "user", "content": "What is the best time to visit Munnar?"}
    ],
    temperature=0.7,
    top_p=1,
    max_tokens=1000
)

print(response.choices[0].message.content)
```

```javascript
import { SarvamAIClient } from "sarvamai";

const client = new SarvamAIClient({
    apiSubscriptionKey: "YOUR_SARVAM_API_KEY"
});

async function main() {
    const response = await client.chat.completions({
        model: "sarvam-m",
        messages: [
            { role: "system", content: "You are a travel expert specializing in Indian destinations." },
            { role: "user", content: "Suggest a good place to visit in South India." },
            { role: "assistant", content: "You can visit Munnar in Kerala. It's known for its tea plantations and cool climate." },
            { role: "user", content: "What is the best time to visit Munnar?" }
        ],
        temperature: 0.7,
        top_p: 1,
        max_tokens: 1000
    });
    
    console.log(response.choices[0].message.content);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/v1/chat/completions \
  -H "Authorization: Bearer $SARVAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a travel expert specializing in Indian destinations."},
      {"role": "user", "content": "Suggest a good place to visit in South India."},
      {"role": "assistant", "content": "You can visit Munnar in Kerala. It is known for its tea plantations and cool climate."},
      {"role": "user", "content": "What is the best time to visit Munnar?"}
    ],
    "model": "sarvam-m",
    "temperature": 0.7,
    "top_p": 1,
    "max_tokens": 1000
  }'
```
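
In the multi-turn examples the message list is written out by hand. In an application, each reply has to be appended to the history before the next user turn is sent. The sketch below shows one way to keep that bookkeeping in one place. `Conversation` and `fake_send` are hypothetical names, and `fake_send` stands in for the real API call so the example runs offline.

```python
class Conversation:
    """Keeps the running message list for a multi-turn chat."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, send, user_text: str) -> str:
        """Append the user turn, call `send(messages)`, and store the reply."""
        self.messages.append({"role": "user", "content": user_text})
        reply = send(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages):
    """Stand-in for the API call; a real `send` would call the chat completions endpoint."""
    return f"(reply to: {messages[-1]['content']})"


chat = Conversation("You are a travel expert specializing in Indian destinations.")
chat.ask(fake_send, "Suggest a good place to visit in South India.")
chat.ask(fake_send, "What is the best time to visit Munnar?")
print([m["role"] for m in chat.messages])
```

Passing `send` as an argument keeps the history logic independent of any particular client, so the same class works with the SDK, raw HTTP, or a test double.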

Wiki grounding allows the Sarvam-M model to fetch and use information from Wikipedia to give more accurate and fact-based answers.

```python
from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY",
)

response = client.chat.completions(
    model="sarvam-m",
    messages=[
        {"role": "user", "content": "What is the history of the Taj Mahal?"}
    ],
    temperature=0.2,
    top_p=1,
    wiki_grounding=True
)

print(response.choices[0].message.content)
```

```javascript
import { SarvamAIClient } from "sarvamai";

const client = new SarvamAIClient({
    apiSubscriptionKey: "YOUR_SARVAM_API_KEY"
});

async function main() {
    const response = await client.chat.completions({
        model: "sarvam-m",
        messages: [
            { role: "user", content: "What is the history of the Taj Mahal?" }
        ],
        temperature: 0.2,
        top_p: 1,
        wiki_grounding: true
    });
    
    console.log(response.choices[0].message.content);
}

main();
```

```bash
curl -X POST https://api.sarvam.ai/v1/chat/completions \
  -H "Authorization: Bearer $SARVAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the history of the Taj Mahal?"}
    ],
    "model": "sarvam-m",
    "temperature": 0.2,
    "top_p": 1,
    "wiki_grounding": true
  }'
```
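
For environments without the SDK, the curl request can be mirrored with the Python standard library. The sketch below only builds the request and does not send it, since Sarvam-M is no longer served. `build_grounded_request` is a hypothetical helper; the URL, headers, and body fields are copied from the curl example.

```python
import json
import os
import urllib.request

API_URL = "https://api.sarvam.ai/v1/chat/completions"


def build_grounded_request(question: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with wiki grounding enabled."""
    body = {
        "model": "sarvam-m",
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.2,  # recommended for non-thinking mode
        "top_p": 1,
        "wiki_grounding": True,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_grounded_request(
    "What is the history of the Taj Mahal?",
    os.environ.get("SARVAM_API_KEY", "YOUR_SARVAM_API_KEY"),
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request)
```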

## Next Steps

Learn how to integrate chat completion into your application.

Complete API documentation for chat completion endpoints.

Step-by-step tutorial for chat completion implementation.