
# How to specify language codes

> Use BCP-47 language codes for accurate speech-to-text transcription with Saaras v3.

The `language_code` parameter tells the STT model which language to expect in the audio. Using the correct language code improves transcription accuracy.

### Supported Languages (Saaras v3)

Saaras v3 supports the 23 languages listed below: 22 Indian languages plus English (`en-IN`), each identified by a BCP-47 format code.

| Language  | Code    |   | Language | Code     |
| --------- | ------- | - | -------- | -------- |
| Hindi     | `hi-IN` |   | Assamese | `as-IN`  |
| Bengali   | `bn-IN` |   | Urdu     | `ur-IN`  |
| Kannada   | `kn-IN` |   | Nepali   | `ne-IN`  |
| Malayalam | `ml-IN` |   | Konkani  | `kok-IN` |
| Marathi   | `mr-IN` |   | Kashmiri | `ks-IN`  |
| Odia      | `od-IN` |   | Sindhi   | `sd-IN`  |
| Punjabi   | `pa-IN` |   | Sanskrit | `sa-IN`  |
| Tamil     | `ta-IN` |   | Santali  | `sat-IN` |
| Telugu    | `te-IN` |   | Manipuri | `mni-IN` |
| English   | `en-IN` |   | Bodo     | `brx-IN` |
| Gujarati  | `gu-IN` |   | Maithili | `mai-IN` |
|           |         |   | Dogri    | `doi-IN` |
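
A client can check a code before sending a request, so that a typo fails locally rather than at the API. The sketch below copies the codes from the table above into a dictionary; the names `SUPPORTED_LANGUAGE_CODES`, `code_for`, and `is_supported` are hypothetical helpers for illustration, not part of the Sarvam SDK.

```python
# Language codes copied from the table above.
# The dictionary and helper names are illustrative, not part of the SDK.
SUPPORTED_LANGUAGE_CODES = {
    "Hindi": "hi-IN", "Bengali": "bn-IN", "Kannada": "kn-IN",
    "Malayalam": "ml-IN", "Marathi": "mr-IN", "Odia": "od-IN",
    "Punjabi": "pa-IN", "Tamil": "ta-IN", "Telugu": "te-IN",
    "English": "en-IN", "Gujarati": "gu-IN", "Assamese": "as-IN",
    "Urdu": "ur-IN", "Nepali": "ne-IN", "Konkani": "kok-IN",
    "Kashmiri": "ks-IN", "Sindhi": "sd-IN", "Sanskrit": "sa-IN",
    "Santali": "sat-IN", "Manipuri": "mni-IN", "Bodo": "brx-IN",
    "Maithili": "mai-IN", "Dogri": "doi-IN",
}


def code_for(language_name: str) -> str:
    """Return the code for a language name, ignoring case and padding."""
    normalized = language_name.strip().title()
    try:
        return SUPPORTED_LANGUAGE_CODES[normalized]
    except KeyError:
        raise ValueError(f"Unsupported language: {language_name!r}") from None


def is_supported(code: str) -> bool:
    """Return True if the code appears in the table above."""
    return code in SUPPORTED_LANGUAGE_CODES.values()
```

For example, `code_for("tamil")` returns `"ta-IN"`, and `is_supported("fr-FR")` returns `False`.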

### Automatic Language Detection

To enable automatic language detection, pass `unknown` as the `language_code` parameter. The model will detect the language from the audio.

**Best Practice:** Always specify the language code when you know the language of the audio. This improves accuracy and reduces processing time. Use `unknown` only when the language is truly unknown.
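
The best practice above can be sketched as a small helper that uses the caller's code when one is given and falls back to `unknown` otherwise. `build_transcribe_params` and `AUTO_DETECT` are hypothetical names for illustration; the `model` and `mode` values are the ones used in the examples on this page.

```python
from typing import Optional

# Value documented above for automatic language detection.
AUTO_DETECT = "unknown"


def build_transcribe_params(language_code: Optional[str] = None) -> dict:
    """Build keyword arguments for a transcription request.

    Uses the given code when the language is known; otherwise
    falls back to automatic detection.
    """
    # Treat None, empty, and whitespace-only values as "language not known".
    code = language_code.strip() if language_code else ""
    return {
        "model": "saaras:v3",
        "mode": "transcribe",
        "language_code": code or AUTO_DETECT,
    }
```

The result can be unpacked into the SDK call, for instance `client.speech_to_text.transcribe(file=audio_file, **build_transcribe_params("ta-IN"))`. The helper does not validate the code itself; it only decides between an explicit code and auto-detection.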

### Example Code

The following three examples (Python, JavaScript, cURL) set an explicit language code, `ta-IN` for Tamil:

```python
from sarvamai import SarvamAI

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

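# Specify the language for better accuracy.
# The context manager closes the audio file once the request completes.
with open("audio.wav", "rb") as audio_file:
    response = client.speech_to_text.transcribe(
        file=audio_file,
        model="saaras:v3",
        language_code="ta-IN",  # Tamil
        mode="transcribe",
    )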

print(response.transcript)
```

```javascript
import { SarvamAIClient } from "sarvamai";
import fs from 'fs';

const client = new SarvamAIClient({
    apiSubscriptionKey: "YOUR_SARVAM_API_KEY"
});

const audioFile = fs.createReadStream("audio.wav");

const response = await client.speechToText.transcribe({
    file: audioFile,
    model: "saaras:v3",
    language_code: "ta-IN",  // Tamil
    mode: "transcribe"
});

console.log(response.transcript);
```

```bash
curl -X POST https://api.sarvam.ai/speech-to-text \
  -H "api-subscription-key: <YOUR_SARVAM_API_KEY>" \
  -H "Content-Type: multipart/form-data" \
  -F model="saaras:v3" \
  -F language_code="ta-IN" \
  -F mode="transcribe" \
  -F file=@audio.wav
```

The next three examples pass `unknown` to auto-detect the language; in the SDK examples the detected code is read from `response.language_code`:

```python
from sarvamai import SarvamAI

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

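# Use 'unknown' for automatic language detection.
# The context manager closes the audio file once the request completes.
with open("audio.wav", "rb") as audio_file:
    response = client.speech_to_text.transcribe(
        file=audio_file,
        model="saaras:v3",
        language_code="unknown",  # Auto-detect language
        mode="transcribe",
    )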

print(response.transcript)
print(response.language_code)  # Detected language
```

```javascript
import { SarvamAIClient } from "sarvamai";
import fs from "fs";

const client = new SarvamAIClient({
    apiSubscriptionKey: "YOUR_SARVAM_API_KEY"
});

const audioFile = fs.createReadStream("audio.wav");

// Use 'unknown' for automatic language detection
const response = await client.speechToText.transcribe({
    file: audioFile,
    model: "saaras:v3",
    language_code: "unknown",  // Auto-detect language
    mode: "transcribe"
});

console.log(response.transcript);
console.log(response.language_code);  // Detected language
```

```bash
curl -X POST https://api.sarvam.ai/speech-to-text \
  -H "api-subscription-key: <YOUR_SARVAM_API_KEY>" \
  -H "Content-Type: multipart/form-data" \
  -F model="saaras:v3" \
  -F language_code="unknown" \
  -F mode="transcribe" \
  -F file=@audio.wav
```