Language Identification API

Overview

The Language Identification (LID) API identifies the language (e.g., en-IN, hi-IN) and script (e.g., Latin, Devanagari) of the input text. It supports multiple Indian languages and scripts, making it ideal for multilingual text processing.

Detection Types

Single Language

Detect the primary language and script of text input. Example: “Hello, how are you?” → language: en-IN, script: Latn

Auto Detection

Detect the language automatically so the result can be passed directly to translation and preprocessing APIs.

Code Examples

from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY"
)

response = client.text.identify_language(
    input="Hello, how are you?"
)

print(f"Request ID: {response.request_id}")
print(f"Language Code: {response.language_code}")  # Output: en-IN
print(f"Script Code: {response.script_code}")  # Output: Latn

Response Format

{
  "request_id": "string | null",
  "language_code": "string | null",
  "script_code": "string | null"
}
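Since every field in the response may be null, calling code should check for missing values before using them. The following is a minimal sketch of one way to do that. The `Detection` dataclass and the `summarize` helper are hypothetical names introduced for illustration and are not part of the SDK. It also assumes that a null `language_code` means no language was identified, which this page does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Hypothetical container mirroring the three documented response fields."""
    request_id: Optional[str]
    language_code: Optional[str]
    script_code: Optional[str]


def summarize(detection: Detection) -> str:
    """Return a readable summary, tolerating null (None) fields."""
    # Assumption: a null language_code means nothing was identified.
    if detection.language_code is None:
        return "Language could not be identified"
    script = detection.script_code or "unknown script"
    return f"{detection.language_code} ({script})"


print(summarize(Detection("req-1", "hi-IN", "Deva")))  # hi-IN (Deva)
print(summarize(Detection("req-2", None, None)))  # Language could not be identified
```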

Supported Languages and Scripts

Language Support
Available Languages:
  • en-IN: English
  • hi-IN: Hindi
  • bn-IN: Bengali
  • gu-IN: Gujarati
  • kn-IN: Kannada
  • ml-IN: Malayalam
  • mr-IN: Marathi
  • od-IN: Odia
  • pa-IN: Punjabi
  • ta-IN: Tamil
  • te-IN: Telugu
Script Support
Available Scripts:
  • Latn: Latin (Romanized script)
  • Deva: Devanagari (Hindi, Marathi)
  • Beng: Bengali
  • Gujr: Gujarati
  • Knda: Kannada
  • Mlym: Malayalam
  • Orya: Odia
  • Guru: Gurmukhi
  • Taml: Tamil
  • Telu: Telugu

API Response Format

| Field | Type | Description |
| --- | --- | --- |
| request_id | string | Unique identifier for the request |
| language_code | string | Detected language in BCP-47 format (e.g., hi-IN, ta-IN) |
| script_code | string | Detected script code (e.g., Deva, Latn, Taml) |

Supported languages: en-IN, hi-IN, bn-IN, gu-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN

{
  "request_id": "20241115_12345678-1234-5678-1234-567812345678",
  "language_code": "hi-IN",
  "script_code": "Deva"
}

Script Codes Reference

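| Script Code | Script Name | Used By |
| --- | --- | --- |
| Latn | Latin | English |
| Deva | Devanagari | Hindi, Marathi |
| Beng | Bengali | Bengali |
| Gujr | Gujarati | Gujarati |
| Knda | Kannada | Kannada |
| Mlym | Malayalam | Malayalam |
| Orya | Odia | Odia |
| Guru | Gurmukhi | Punjabi |
| Taml | Tamil | Tamil |
| Telu | Telugu | Telugu |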
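
One use of the table above is to flag romanized input, such as Hindi typed in Latin characters. The sketch below combines the "Used By" column with the language codes listed earlier. `NATIVE_SCRIPT` and `is_romanized` are hypothetical names. It also assumes that romanized text comes back with its own language code (for example hi-IN) paired with Latn, which the "Romanized script" note suggests but this page does not confirm.

```python
from typing import Optional

# Native script per language, derived from the "Used By" column above.
NATIVE_SCRIPT = {
    "en-IN": "Latn", "hi-IN": "Deva", "mr-IN": "Deva", "bn-IN": "Beng",
    "gu-IN": "Gujr", "kn-IN": "Knda", "ml-IN": "Mlym", "od-IN": "Orya",
    "pa-IN": "Guru", "ta-IN": "Taml", "te-IN": "Telu",
}


def is_romanized(language_code: Optional[str], script_code: Optional[str]) -> bool:
    """True when a language whose native script is not Latin was detected in Latn."""
    native = NATIVE_SCRIPT.get(language_code)
    # Unknown or null language codes are never treated as romanized.
    if native is None:
        return False
    return script_code == "Latn" and native != "Latn"


print(is_romanized("hi-IN", "Latn"))  # True
print(is_romanized("en-IN", "Latn"))  # False
```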

Error Responses

All errors return a JSON object with an error field (message, code, request_id). The full error-code table, retry guidance, and SDK exception reference live on the central Errors & Troubleshooting page.

Errors specific to this endpoint:

| HTTP Status | Error Code | When This Happens | What To Do |
| --- | --- | --- | --- |
| 400 | invalid_request_error | Missing required input parameter | Include the input field with text to identify |
| 422 | unprocessable_entity_error | Text too long (max 1000 characters) | Split text into smaller chunks |
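
The 422 row recommends splitting long text into smaller chunks. One possible way to do that is sketched below. `split_text` is a hypothetical helper, not an SDK function. It breaks at whitespace where it can and collapses runs of whitespace along the way, which is assumed to be acceptable for language identification.

```python
from typing import List


def split_text(text: str, max_chars: int = 1000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking at whitespace."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be hard-split.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


chunks = split_text("word " * 500)
print(len(chunks), max(len(c) for c in chunks))
```

Each chunk can then be sent in its own request. How to combine the per-chunk results, for example by taking the most frequent language code, depends on the application.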

from sarvamai import SarvamAI
from sarvamai.core.api_error import ApiError

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

try:
    response = client.text.identify_language(
        input="Hello, how are you?"
    )
    print(f"Language: {response.language_code}")
    print(f"Script: {response.script_code}")
except ApiError as e:
    if e.status_code == 400:
        print(f"Bad request: {e.body}")
    elif e.status_code == 403:
        print("Invalid API key. Check your credentials.")
    elif e.status_code == 422:
        print(f"Invalid parameters: {e.body}")
    elif e.status_code == 429:
        print("Rate limit exceeded. Wait and retry.")
    else:
        print(f"Error {e.status_code}: {e.body}")
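
The "wait and retry" advice for 429 responses can be automated with a small wrapper. The sketch below is illustrative only. `call_with_retry` is a hypothetical helper, and the attempt count and delays are assumed values, not documented limits. It retries any exception that carries `status_code == 429` and re-raises everything else immediately.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff on rate-limit errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            # Only 429 is retried; the last attempt re-raises as well.
            if status != 429 or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

With the client from the example above, a call could be wrapped as `call_with_retry(lambda: client.text.identify_language(input=text))`. The `sleep` parameter exists so the waiting behavior can be replaced in tests.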

Check out our detailed API Reference to explore Language Identification and all available options.

For detailed pricing information and usage tiers, visit our pricing page.