Language Identification API

Overview

The Language Identification (LID) API identifies the language (e.g., en-IN, hi-IN) and script (e.g., Latn for Latin, Deva for Devanagari) of the input text. It supports 11 languages and 10 scripts, which makes it suited to multilingual text processing.

Detection Types

Single Language

Detects the primary language and script of the input text. Example: “Hello, how are you?” → language: en-IN, script: Latn

Auto Detection

Detects the language automatically, so the result can be passed straight to translation and preprocessing APIs without the caller specifying the source language.
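
As a rough illustration of that integration, the sketch below detects the source language first and translates only when needed. The `identify` and `translate` parameters are hypothetical callables supplied by the caller, not SDK functions; in practice they would wrap the real API calls. The stub implementations exist only so the example runs offline.

```python
from typing import Callable, Optional


def translate_with_auto_detect(
    text: str,
    target_language_code: str,
    identify: Callable[[str], Optional[str]],
    translate: Callable[[str, str, str], str],
) -> str:
    """Detect the source language, then translate only when it is needed."""
    source_language_code = identify(text)
    if source_language_code is None:
        # Detection returned no language: pass the text through unchanged.
        return text
    if source_language_code == target_language_code:
        # Already in the target language: nothing to translate.
        return text
    return translate(text, source_language_code, target_language_code)


# Stub callables for illustration only (assumptions, not real API behaviour).
def fake_identify(text: str) -> Optional[str]:
    return "en-IN" if text.isascii() else None


def fake_translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"


print(translate_with_auto_detect("Hello, how are you?", "hi-IN", fake_identify, fake_translate))
```

Passing the two steps in as callables keeps the routing logic testable without network access; swapping in real API wrappers should not require changing `translate_with_auto_detect` itself.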

Code Examples

from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_API_SUBSCRIPTION_KEY"
)

response = client.text.identify_language(
    input="Hello, how are you?"
)

print(f"Request ID: {response.request_id}")
print(f"Language Code: {response.language_code}")  # Output: en-IN
print(f"Script Code: {response.script_code}")  # Output: Latn

Response Format

{
  "request_id": "string | null",
  "language_code": "string | null",
  "script_code": "string | null"
}
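
Because every field in the response may be null, callers will likely want to handle missing values explicitly. The following is a minimal sketch that works on the raw JSON body; the helper names and the `en-IN` fallback are assumptions made for illustration, not part of the API.

```python
import json
from typing import Optional, Tuple


def parse_lid_response(raw_json: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (language_code, script_code); either may be None."""
    payload = json.loads(raw_json)
    return payload.get("language_code"), payload.get("script_code")


def language_or_fallback(raw_json: str, fallback: str = "en-IN") -> str:
    """Use the detected language, or a caller-chosen fallback when it is null.

    The default fallback here is an arbitrary assumption for this sketch.
    """
    language_code, _ = parse_lid_response(raw_json)
    return language_code if language_code is not None else fallback


detected = '{"request_id": "abc", "language_code": "hi-IN", "script_code": "Deva"}'
undetected = '{"request_id": "abc", "language_code": null, "script_code": null}'

print(parse_lid_response(detected))
print(language_or_fallback(undetected))
```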

Supported Languages and Scripts

Language Support

Available Languages:
  • en-IN: English
  • hi-IN: Hindi
  • bn-IN: Bengali
  • gu-IN: Gujarati
  • kn-IN: Kannada
  • ml-IN: Malayalam
  • mr-IN: Marathi
  • od-IN: Odia
  • pa-IN: Punjabi
  • ta-IN: Tamil
  • te-IN: Telugu

Script Support

Available Scripts:
  • Latn: Latin (Romanized script)
  • Deva: Devanagari (Hindi, Marathi)
  • Beng: Bengali
  • Gujr: Gujarati
  • Knda: Kannada
  • Mlym: Malayalam
  • Orya: Odia
  • Guru: Gurmukhi
  • Taml: Tamil
  • Telu: Telugu

For detailed pricing information and usage tiers, visit our pricing page.