
Chat Completions Overview


Sarvam AI provides chat completion APIs for building conversational AI experiences, with native support for Indian languages and deep contextual reasoning.

Our Chat Completion APIs support the following chat models:

Sarvam-30B

30B parameter model with strong reasoning and Indic language support. Balanced performance-to-cost ratio for production workloads.

Sarvam-105B

105B parameter flagship model. Highest quality outputs for complex reasoning, coding, and generation tasks.

Model Variants

| Model | Context Length | Use Case |
| --- | --- | --- |
| sarvam-30b | 64K tokens | Standard conversations, Q&A, and general tasks |
| sarvam-105b | 128K tokens | Complex reasoning, coding, and high-quality generation |

Simply pass the model name as the model parameter (e.g., model="sarvam-105b").

Sarvam-M (24B) is now a legacy model. It remains available via model="sarvam-m" but we recommend migrating to Sarvam-30B or Sarvam-105B for improved performance.

Features

Hybrid Thinking Mode
  • Supports both “think” and “non-think” modes
  • Think mode for complex logical reasoning
  • Non-think mode for efficient conversations
  • Ideal for mathematical and coding tasks
Advanced Indic Skills
  • Post-trained on Indian languages
  • Native English proficiency
  • Authentic Indian cultural values
  • Rich understanding of local context
Superior Reasoning Capabilities
  • Outperforms similar-sized models
  • Strong performance on coding tasks
  • Excellent mathematical reasoning
  • Advanced problem-solving abilities
Seamless Chatting Experience
  • Full Indic script support
  • Romanized language support
  • Multilingual conversation handling
  • Natural language understanding

Code Examples

Basic Chat Completion
from sarvamai import SarvamAI

# Create a client with your API subscription key.
client = SarvamAI(
    api_subscription_key="YOUR_SARVAM_API_KEY",
)

# Send a single user message to the chosen model.
response = client.chat.completions(
    model="sarvam-105b",
    messages=[
        {"role": "user", "content": "Hey, what is the capital of India?"}
    ],
)
print(response)
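
A multi-turn conversation can reuse the same call by sending the earlier turns along with the new message. The sketch below assumes that each request must carry the full history and that assistant replies are stored with the "assistant" role shown in the response format. `Conversation` and `send` are hypothetical helpers, not part of the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical helper that keeps the running message history."""

    model: str = "sarvam-105b"
    messages: list = field(default_factory=list)

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})

    def request_kwargs(self) -> dict:
        # Copy the history so later turns do not alter a request already built.
        return {"model": self.model, "messages": list(self.messages)}


def send(client, conversation: Conversation, user_text: str) -> str:
    """Send one user turn and record the assistant reply.

    `client` is expected to be a SarvamAI instance. The reply is read from
    choices[0].message.content, as in the error-handling example on this page.
    """
    conversation.add_user(user_text)
    response = client.chat.completions(**conversation.request_kwargs())
    reply = response.choices[0].message.content
    conversation.add_assistant(reply)
    return reply
```

With a real client, `send(client, conv, "What is the capital of India?")` followed by `send(client, conv, "What is its population?")` would let the second question refer back to the first answer, provided the assumptions above hold.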
Key Considerations
  • reasoning_effort options: low, medium, high
    • Setting any value enables thinking mode
    • Higher values increase reasoning depth
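
Thinking mode could be requested along these lines. This is a minimal sketch that assumes the SDK accepts the option as a keyword argument named `reasoning_effort` (the name used in the response field table) and that leaving it out keeps non-think mode. `build_chat_kwargs` and `ALLOWED_EFFORTS` are hypothetical names introduced for illustration.

```python
from typing import Optional

# The three values listed under Key Considerations.
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_chat_kwargs(prompt: str,
                      model: str = "sarvam-105b",
                      reasoning_effort: Optional[str] = None) -> dict:
    """Build keyword arguments for client.chat.completions (hypothetical helper)."""
    kwargs = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if reasoning_effort is not None:
        if reasoning_effort not in ALLOWED_EFFORTS:
            raise ValueError(
                f"reasoning_effort must be one of {ALLOWED_EFFORTS}, "
                f"got {reasoning_effort!r}"
            )
        # Assumption: the SDK takes reasoning_effort as a keyword argument.
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs
```

A call would then look like `client.chat.completions(**build_chat_kwargs("Solve 17 * 23 step by step", reasoning_effort="high"))`. Validating the value locally avoids a round trip that would otherwise end in a 422 response.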

API Response Format

Success Response Structure

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "sarvam-105b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of India is New Delhi. It has been the capital since 1931."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 25,
    "total_tokens": 40
  }
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the completion request |
| object | string | Always "chat.completion" |
| created | integer | Unix timestamp when the completion was created |
| model | string | The model used for completion |
| choices[].index | integer | Index of the choice in the list |
| choices[].message.role | string | Always "assistant" |
| choices[].message.content | string | The generated text response |
| choices[].message.reasoning_content | string | Thinking steps (only when reasoning_effort is set) |
| choices[].finish_reason | string | Why generation stopped: "stop", "length", "content_filter" |
| usage.prompt_tokens | integer | Tokens in the input prompt |
| usage.completion_tokens | integer | Tokens in the generated response |
| usage.total_tokens | integer | Total tokens used (prompt + completion) |
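
The fields above could be read from a decoded response roughly as follows. The sketch works on a plain dict, such as the JSON body of a raw HTTP call, and uses a shortened copy of the success response as sample data. `summarize_completion` is a hypothetical helper, and reading "length" as a truncated output is an assumption based on the usual meaning of that finish reason.

```python
import json


def summarize_completion(payload: dict) -> dict:
    """Extract commonly used fields from a chat completion payload."""
    choice = payload["choices"][0]
    message = choice["message"]
    usage = payload.get("usage", {})
    return {
        "content": message["content"],
        # Present only when reasoning_effort was set on the request.
        "reasoning": message.get("reasoning_content"),
        # Assumption: "length" means generation hit a token limit.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


sample = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "sarvam-105b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of India is New Delhi."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 15, "completion_tokens": 25, "total_tokens": 40}
}
""")

summary = summarize_completion(sample)
print(summary["content"])
print(summary["total_tokens"])
```

Using `.get` for `reasoning_content` keeps the helper usable in both think and non-think modes.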

Error Responses

All errors return a JSON object with an error field containing details about what went wrong.

Error Response Structure

{
  "error": {
    "message": "Human-readable error description",
    "code": "error_code_for_programmatic_handling",
    "request_id": "unique_request_identifier"
  }
}

Error Codes Reference

| HTTP Status | Error Code | When This Happens | What To Do |
| --- | --- | --- | --- |
| 400 | invalid_request_error | Missing messages array or malformed request | Include valid messages array with role/content |
| 403 | invalid_api_key_error | API key is invalid, missing, or expired | Verify your API key in the dashboard |
| 422 | unprocessable_entity_error | Invalid model name or parameter values | Check temperature (0-2), model name, etc. |
| 429 | insufficient_quota_error | API quota or rate limit exceeded | Wait for reset or upgrade your plan |
| 500 | internal_server_error | Unexpected server error | Retry the request; contact support if persistent |
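
For programmatic handling, the error codes listed here can be encoded as a lookup. In this sketch, `ERROR_GUIDE`, `advice_for`, and `is_retryable` are hypothetical names, and treating only 429 and 500 as retryable is an assumption drawn from the "What To Do" guidance rather than a documented rule.

```python
# Error code -> (HTTP status, suggested action), taken from the reference table.
ERROR_GUIDE = {
    "invalid_request_error": (400, "Include valid messages array with role/content"),
    "invalid_api_key_error": (403, "Verify your API key in the dashboard"),
    "unprocessable_entity_error": (422, "Check temperature (0-2), model name, etc."),
    "insufficient_quota_error": (429, "Wait for reset or upgrade your plan"),
    "internal_server_error": (500, "Retry the request; contact support if persistent"),
}

# Assumption: only these statuses are worth retrying without changing the request.
RETRYABLE_STATUSES = frozenset({429, 500})


def advice_for(code: str) -> str:
    """Return the suggested action for an error code, with a generic fallback."""
    _status, advice = ERROR_GUIDE.get(code, (None, "Unknown error code"))
    return advice


def is_retryable(status_code: int) -> bool:
    """Tell whether repeating the same request might succeed."""
    return status_code in RETRYABLE_STATUSES
```

Keying the lookup on the error code instead of the HTTP status keeps it stable if two codes ever share a status.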

Example Error Response

{
  "error": {
    "message": "Invalid value for parameter 'temperature': must be between 0 and 2",
    "code": "unprocessable_entity_error",
    "request_id": "20241115_abc12345"
  }
}
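
An error body of this shape could be unpacked as shown below, for example to log the request_id when contacting support. `ErrorDetails` and `parse_error` are hypothetical names, and the sketch assumes the body arrives either as a JSON string or as an already decoded dict.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ErrorDetails:
    """Hypothetical container for the fields of the error object."""

    message: str
    code: str
    request_id: Optional[str] = None


def parse_error(body: Union[str, bytes, dict]) -> ErrorDetails:
    """Decode an error response body into ErrorDetails."""
    data = json.loads(body) if isinstance(body, (str, bytes)) else body
    err = data["error"]
    return ErrorDetails(
        message=err["message"],
        code=err["code"],
        request_id=err.get("request_id"),
    )


raw = """
{
  "error": {
    "message": "Invalid value for parameter 'temperature': must be between 0 and 2",
    "code": "unprocessable_entity_error",
    "request_id": "20241115_abc12345"
  }
}
"""
details = parse_error(raw)
print(f"{details.code} (request {details.request_id}): {details.message}")
```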
Error Handling Code Example
from sarvamai import SarvamAI
from sarvamai.core.api_error import ApiError

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

try:
    response = client.chat.completions(
        model="sarvam-105b",
        messages=[
            {"role": "user", "content": "What is the capital of India?"}
        ],
    )
    print(response.choices[0].message.content)
except ApiError as e:
    # Branch on the HTTP status codes from the Error Codes Reference.
    if e.status_code == 400:
        print(f"Bad request: {e.body}")
    elif e.status_code == 403:
        print("Invalid API key. Check your credentials.")
    elif e.status_code == 422:
        print(f"Invalid parameters: {e.body}")
    elif e.status_code == 429:
        print("Rate limit exceeded. Wait and retry.")
    else:
        print(f"Error {e.status_code}: {e.body}")
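
For 429 and 500 responses, the guidance in the error table is to wait and retry. One way to do that is exponential backoff, sketched below. `call_with_retry` is a hypothetical helper, not part of the SDK. It relies only on the raised exception exposing `status_code`, as `ApiError` does in the example above, and the attempt count and delays are illustrative values.

```python
import time


def call_with_retry(make_request, max_attempts=4, base_delay=1.0,
                    retry_statuses=(429, 500), sleep=time.sleep):
    """Call make_request(), retrying with exponential backoff on retryable errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            # Re-raise errors that a retry cannot fix, or when attempts run out.
            if status not in retry_statuses or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
```

A request can be wrapped as `call_with_retry(lambda: client.chat.completions(model="sarvam-105b", messages=messages))`. The `sleep` argument exists so the delay can be replaced in tests.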

Check out our detailed API Reference to explore Chat Completion and all available options.