For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
CommunityAPI StatusAPI PricingSign Up
DocumentationAPI ReferencesCookbookIntegrationDeveloper Tools
DocumentationAPI ReferencesCookbookIntegrationDeveloper Tools
  • Getting Started
    • Welcome
    • Quickstart
    • SDKs & Libraries
    • Building for Indian Languages
    • Models
    • Credits & Rate Limits
    • Errors & Troubleshooting
    • Talk to us
    • Pricing
    • Changelog
  • API Guides & Tutorials
LogoLogo
CommunityAPI StatusAPI PricingSign Up
On this page
  • June 2026
  • May 2026
  • April 2026
  • March 2026
  • February 2026
  • January 2026
  • December 2025
  • November 2025
  • October 2025
  • September 2025
  • August 2025
  • July 2025
  • June 2025
  • May 2025
  • April 2025
  • March 2025
  • February 2025
  • January 2025
Getting Started

Change Log

||View as Markdown|
Was this page helpful?
Previous

Chat Completions Overview

Next
Built with

June 2026

Chat Completion

Sarvam-M Deprecated: sarvam-m has been deprecated and is no longer accepted by the Chat Completions API — requests that pass model="sarvam-m" will fail. The supported chat models are now sarvam-30b (64K context) and sarvam-105b (128K context). Existing integrations should migrate to Sarvam-30B or Sarvam-105B for better quality and longer context.


May 2026

Chat Completion

Removed Model Variants: Retired the fixed-context sarvam-30b-16k and sarvam-105b-32k variants. The base sarvam-30b and sarvam-105b models now serve their full context windows (64K and 128K tokens), so you no longer need to pick a context-specific variant.

Pricing

Document Digitization: Document Intelligence has been renamed to Document Digitization and repriced to ₹0.5/page, down from ₹1.5/page, making bulk document processing significantly cheaper.

Free Credits: New accounts now receive ₹100 in free credits (previously ₹1,000) — worth noting if you provision trial accounts programmatically.


April 2026

Chat Completion

Updated Parameter Defaults: Adjusted the default values for reasoning_effort and max_tokens in the Chat Completions API. If you rely on the defaults rather than setting these explicitly, response length and latency may change — set them explicitly to lock in previous behavior.

Deprecated Model Variants: Marked the sarvam-105b-32k and sarvam-30b-16k variants as deprecated ahead of their removal, signalling customers to move to the base sarvam-30b / sarvam-105b models.

Speech to Text (STT)

VAD Parameters: Added Voice Activity Detection parameters to STT WebSocket streaming, giving you control over how speech segments are detected and finalized in real-time transcription.

Integrations

n8n: Published the Sarvam AI n8n integration, letting you call Sarvam APIs from n8n workflows as drop-in automation nodes.


March 2026

Chat Completion

Removed Gemma Models: Removed the Gemma model IDs from the SarvamModelIds enum. These are no longer accepted values for the model parameter on the Chat Completions API.

Removed Wiki Grounding: Wiki grounding is no longer supported for sarvam-30b and sarvam-105b. Requests that previously enabled grounding on these models should drop the parameter.

SDK Chat Support: Added Chat Completions (LLM) support to the Python and TypeScript SDKs, so you can call the chat API through a typed client instead of raw HTTP requests.

Speech to Text (STT)

Pronunciation Dictionary v2: Added pronunciation dictionary v2, giving you finer control over how specific words and names are transcribed.

Document Digitization

Page Limit & Pricing: Introduced a 10-page-per-request limit and updated pricing to ₹1.5/page. Split larger documents into 10-page batches to stay within the limit.


February 2026

Speech to Text (STT)

Saaras v3: Migrated STT to saaras:v3 and introduced a new mode parameter to control transcription behavior. v3 improves accuracy and expands language coverage; ASR v3 is also now available through the SDKs.

Text to Speech (TTS)

Bulbul v3: Promoted the Bulbul TTS model from bulbul:v3-beta to the stable bulbul:v3. Update your model value from the beta tag to bulbul:v3, which is now production-ready.

Document Digitization

SDK Support: Added Document Digitization support to the Python and JavaScript SDKs, so document parsing can be called through the official clients.


January 2026

Speech to Text (STT)

Removed input_audio_codec: Removed the input_audio_codec parameter — the API now auto-detects the codec for supported formats. You can stop sending this parameter; it will be ignored.

TTS & ASR

v3 Support: Added TTS v3 and ASR v3 support across the SDKs, bringing the latest Bulbul and Saaras model generations to the official clients.


December 2025

Speech to Text (STT)

Added input_audio_codec: Introduced the input_audio_codec parameter, letting you explicitly declare the audio codec when automatic detection isn’t sufficient (for example, raw PCM streams).

Translate & Transliterate

Deprecated enable_preprocessing: Deprecated the enable_preprocessing parameter on the Translate and Transliterate APIs. Preprocessing behavior is now handled internally, so the parameter no longer needs to be set.

SDKs

TypeScript SDK: Released an updated TypeScript SDK version with improved generator settings and type coverage.


November 2025

Speech to Text (STT)

Sample Rate Configuration: Added sample-rate configuration parameters to the STT API, so you can match the API to your source audio (e.g. 8 kHz telephony or 16 kHz) for more accurate transcription.


October 2025

Dashboard

API Key Usage Tracking: You can now track the usage of each API key directly from your dashboard, making it easier to monitor and manage your API consumption.


September 2025

Speech to Text (STT)

Flush Signal in WebSocket: STT and STT Translate WebSocket now support flush signal to finalize transcriptions cleanly between segments, enabling better control over transcription boundaries.

8 kHz Sample Rate Support: Streaming STT now supports 8 kHz sample rate, making it easier to work with telephony and low-bandwidth audio applications.

Text to Speech (TTS)

End Signal in WebSocket: Sarvam TTS WebSocket now supports an end signal for smoother control of audio streams, allowing better integration with real-time applications.

Dashboard

GST Invoices: GST invoices are now available in the dashboard for streamlined billing and compliance.

Pricing

Flexible Pricing Plans: Released flexible pricing tiers to match different use cases, from prototyping to production. View details at dashboard.sarvam.ai/pricing.


August 2025

Speech to Text (STT)

PCM Audio Format Support: Added support for PCM formats including pcm_s16le, pcm_l16, and pcm_raw for both STT and STT Translate APIs.

  • For most audio formats, our API automatically detects the codec
  • When using PCM formats, you must explicitly specify the input_audio_codec parameter
  • PCM files are only supported at 16kHz sample rate

July 2025

SDKs

Python SDK Long Audio Support: Our Python SDK now supports processing long audio files up to 1 hour in duration through both synchronous and asynchronous methods.

Ideal for processing:

  • Meetings, interviews, and call center recordings
  • Large-scale content processing pipelines

Key Features:

  • Support for files up to 1 hour long
  • Speaker diarization and chunk-level timestamp support

Speech to Text (STT)

STT WebSocket Update: The Start Event and End Event is now returned by STT/STTT WebSockets, enabling better control and event handling in real-time transcription.


June 2025

Translation

Sarvam-Translate Launched: Released sarvam-translate:v1, an open-weights model supporting 22 Indian languages.

Text Quickstart Updated: Added sarvam-translate usage examples to the Python SDK Quickstart and Playground.

Speech to Text (STT)

Real-Time STT via WebSocket: Added WebSocket support for live transcription with ultra-low latency in both Python and JavaScript SDKs.

Audio Format Support Expanded: STT now accepts mp3, wav, aac, aiff, ogg/opus, flac, mp4/m4a, and amr input formats.

Batch STT (Alpha): Introduced alpha support for batch transcription in Python SDK. Install via pip install sarvamai==0.1.11a2.

Saaras & Saarika WebSocket Support: Both saaras and saarika models now support real-time streaming via WebSocket.

Text to Speech (TTS)

Real-Time TTS Streaming: Generate speech on the fly using WebSocket streaming. Available in Python and JavaScript SDKs.

Audio Format Support Expanded: TTS now outputs in mp3, linear16, mulaw, alaw, opus, flac, aac, and wav.

SDKs

Python SDK v4.23.2: Includes real-time streaming support, batch transcription (alpha), new translation APIs, and model updates.

JavaScript SDK Updated: Added real-time STT & TTS WebSocket support and updated documentation with streaming quickstarts.

Models

New Model Added: Introduced sarvam-m to the model family.

Model Updates: saarika and saaras updated to v2.5 with character limit improvements and stability enhancements.

Documentation

Cookbooks Expanded: Added examples for sarvam-translate, updated LID and chat completion cookbooks, updated with the SDK package and refreshed starter notebooks.

AI-Powered Docs Assistant: Added an AI assistant to the documentation search bar for instant Q&A and developer support.

Dashboard

Usage Analytics Dashboard: Released real-time API usage and credit tracking at dashboard.sarvam.ai/usage.


May 2025

Models

Sarvam-M Released

Introduced sarvam-m, a 24B open-weights hybrid model based on Mistral Small.

  • Available now via API Playground and detailed in the technical blog.
  • Benchmarks added to the official documentation.

Saarika v2.5 Released

Improved voice quality and naturalness in saarika:v2.5, now available for use in the STT API.

TTS WebSocket (Beta)

Real-time streaming TTS now supported via WebSocket for beta users. Contact support to request access.

API & Dashboard

Upgraded Developer Dashboard

  • Unified dashboard with no-code playground and API key management in one place.
  • Easily test endpoints like LLM, TTS, STT, and Translate without writing code.
  • Prebuilt examples include Resume Translate and Hinglish Code Debug.

API Playground Enhancements

  • Added full support for live testing of Sarvam-M, Sarvam-Translate, and TTS/STT APIs.
  • Playground features instant feedback and parameter tuning without leaving the dashboard.

SDKs

Official SDKs Released

  • Python: pip install sarvamai
  • JavaScript: npm install sarvamai
  • Abstracts away HTTP and response parsing with clean, unified methods across APIs.
  • Reduces integration time from hours to minutes.

Documentation

Revamped Developer Documentation

  • New cookbooks added with real-world SDK use cases including chat completion, translation, and speech.
  • Improved API navigation and content structure for faster discovery.
  • Code snippets updated for clarity and consistency with latest SDKs.

Navigation Updates

  • Exposed core API endpoints directly in sidebar navigation.
  • Streamlined structure to help developers reach reference and guides faster.

April 2025

Text to Speech (TTS)

Bulbul v2 Released

Introduced bulbul:v2, the latest version of our Indian Text-to-Speech model.

  • More natural and expressive voice output with better emotional tone
  • Enhanced preprocessing for improved handling of mixed-language inputs

Bulbul v1 Deprecation Notice

bulbul:v1 will be officially deprecated on April 30, 2025. Users should migrate to bulbul:v2 to ensure uninterrupted service.

24kHz Audio Support

Speech generation now supports 24kHz sample rate for higher quality output.

Speech to Text (STT)

Batch ASR API Released

New Batch ASR API allows uploading up to 20 audio files (up to 60 minutes each).

  • Ideal for calls, meetings, and long-form media
  • Available in both speech-to-text and speech-to-text-translate endpoints

Real-Time API Improvements

Improved transcription speed for real-time requests.

  • 3× faster processing for up to 30s audio snippets
  • Optimized for use cases like voice bots and instant assistants

Streaming WebSocket (Beta)

WebSocket-based real-time transcription now in beta. Early access available via the Sarvam Discord community.

Text

Language Detection Enhancements

Added source_language_code to API responses.

  • Automatically detects input language
  • Improves performance on multilingual and code-switched text

Indic Translation Upgrades

Improved support for two-way translation between Indic and English.

  • Supports colloquial, modern, classical, and formal registers
  • Ideal for education, localization, and content creation platforms

New APIs Introduced

  • Transliterate API
    Converts text between writing systems while preserving pronunciation

  • Language Identification (LID) API
    Detects both language and script from input

Documentation

Cookbook Updated

  • Added new real-world examples for Batch ASR and Bulbul v2 usage
  • Included advanced translation flows (e.g., tone-aware translation)
  • Refreshed LID and transliteration notebooks
  • Integrated with latest SDK structure

Cleanup and Refactoring

Removed deprecated analytics and parse APIs from SDK for better maintainability.


March 2025

Text

Translation & Transliteration Schema Update

Translation and Transliteration APIs will now automatically detect the input language.

  • Responses now include source_language_code
  • Improves handling of multilingual and code-switched inputs
  • Enables simplified workflows with less manual language tagging

SarvamParse

SarvamParse “Small” Mode Introduced

A lightweight variant of SarvamParse is now available.

  • Lower cost
  • Faster response time
  • Ideal for real-time or cost-sensitive applications

Language Identification

LID API Released

The new Language Identification (LID) endpoint detects both language and script from raw input text.

  • Supports multiple Indian and international languages
  • Detects script types like Latin, Devanagari, Kannada, and more
  • Full API Reference: LID API Docs
  • Cookbook: Language Identification Notebook

February 2025

APIs

Sarvam Parse API Released

A new API that transforms PDF documents into structured data.

  • Accepts PDF input and returns base64-encoded XML
  • Useful for information extraction, content indexing, and document analysis
  • API Reference: Sarvam Parse Docs
  • Cookbook: Parse PDF Notebook

Doc Translate API Released

Translate full PDF documents and receive structured translated output.

  • Returns translated content as base64-encoded XML
  • Ideal for cross-lingual access to documents in enterprise, government, or education
  • API Reference: Doc Translate Docs
  • Cookbook: Doc Translate Notebook

Speech to Text (STT)

Max Duration Limit Update

To improve responsiveness and lower latency for all users, the maximum duration per STT request has been updated.

  • New limit: 30 seconds per request (previously 8 minutes)
  • Applies to both saaras and saarika models
  • For longer audio requirements, contact the team to explore tailored solutions

January 2025

Developer Experience

Meta-Prompt Introduced

Launched a universal meta-prompt to help guide any AI chat model in using Sarvam’s APIs effectively.

  • Offers structured API context for accurate prompt engineering
  • Compatible with models like Gemini, GPT, Claude, etc.
  • API Reference: Meta-Prompt Docs
  • Example: Meta-Prompt in Action

Sarvam AI Cookbook Launched

Released the official Sarvam AI Cookbook, an open-source repository with practical code examples and notebooks.

  • Covers use cases for STT, TTS, Translation, Parse, and more
  • Includes best practices, integration tips, and tutorials
  • Repository: Sarvam AI Cookbook on GitHub