Change Log
June 2026
Chat Completion
Sarvam-M Deprecated: sarvam-m has been deprecated and is no longer accepted by the Chat Completions API — requests that pass model="sarvam-m" will fail. The supported chat models are now sarvam-30b (64K context) and sarvam-105b (128K context). Existing integrations should migrate to Sarvam-30B or Sarvam-105B for better quality and longer context.
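One way to handle this migration in client code is to pick the replacement by the context size a request needs. The following is a minimal sketch, not an official migration tool: `choose_chat_model` and `migrate_model` are hypothetical helpers, and treating "64K" and "128K" as 64 × 1024 and 128 × 1024 tokens is an assumption, since the entry does not define the unit.

```python
# Context windows taken from the entry above.
# Assumption: "K" means 1024 tokens; the changelog does not say.
CONTEXT_WINDOWS = {
    "sarvam-30b": 64 * 1024,
    "sarvam-105b": 128 * 1024,
}

DEPRECATED_MODELS = {"sarvam-m"}


def choose_chat_model(required_context_tokens: int) -> str:
    """Return the smallest supported model whose window fits the request."""
    for model, window in sorted(CONTEXT_WINDOWS.items(), key=lambda kv: kv[1]):
        if required_context_tokens <= window:
            return model
    raise ValueError(
        f"No supported model offers {required_context_tokens} tokens of context"
    )


def migrate_model(model: str, required_context_tokens: int = 0) -> str:
    """Swap a deprecated model ID for a supported one; pass others through."""
    if model in DEPRECATED_MODELS:
        return choose_chat_model(required_context_tokens)
    return model


print(migrate_model("sarvam-m"))           # sarvam-30b
print(migrate_model("sarvam-m", 100_000))  # sarvam-105b
```

Choosing the smaller model by default is a design choice for this sketch; quality needs may justify sarvam-105b even for short prompts.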
May 2026
Chat Completion
Removed Model Variants: Retired the fixed-context sarvam-30b-16k and sarvam-105b-32k variants. The base sarvam-30b and sarvam-105b models now serve their full context windows (64K and 128K tokens), so you no longer need to pick a context-specific variant.
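If stored configuration still references the retired variants, a small lookup can rewrite them before a request is sent. This is a sketch under the assumption that each variant maps to its same-size base model, as the entry implies; `normalize_model_id` is a hypothetical helper.

```python
# Retired fixed-context variants and the base models that replace them.
RETIRED_VARIANTS = {
    "sarvam-30b-16k": "sarvam-30b",
    "sarvam-105b-32k": "sarvam-105b",
}


def normalize_model_id(model: str) -> str:
    """Map a retired variant to its base model; leave other IDs unchanged."""
    return RETIRED_VARIANTS.get(model, model)


print(normalize_model_id("sarvam-30b-16k"))  # sarvam-30b
print(normalize_model_id("sarvam-105b"))     # sarvam-105b
```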
Pricing
Document Digitization: Document Intelligence has been renamed to Document Digitization and repriced to ₹0.5/page, down from ₹1.5/page, making bulk document processing significantly cheaper.
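To see what the repricing means for a bulk job, the arithmetic can be sketched as below. It assumes billing is strictly linear per page, with no minimum charge, tax, or volume discount, none of which the entry mentions; `digitization_cost` is a hypothetical helper.

```python
from decimal import Decimal

OLD_RATE = Decimal("1.5")  # INR per page, before the change
NEW_RATE = Decimal("0.5")  # INR per page, after the change


def digitization_cost(pages: int, rate_per_page: Decimal = NEW_RATE) -> Decimal:
    """Cost in INR for a page count, assuming simple per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return rate_per_page * pages


pages = 1000
print(digitization_cost(pages))            # 500.0
print(digitization_cost(pages, OLD_RATE))  # 1500.0
```

Under these assumptions a 1,000-page job drops from ₹1,500 to ₹500.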
Free Credits: New accounts now receive ₹100 in free credits (previously ₹1,000) — worth noting if you provision trial accounts programmatically.
April 2026
Chat Completion
Updated Parameter Defaults: Adjusted the default values for reasoning_effort and max_tokens in the Chat Completions API. If you rely on the defaults rather than setting these explicitly, response length and latency may change — set them explicitly to lock in previous behavior.
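Pinning these parameters can be done by merging a fixed set of values into every request body. The sketch below is illustrative only: the entry does not state the old or new defaults, so the values in `PINNED` are placeholders to replace with the settings your integration was tuned against, and `with_pinned_defaults` is a hypothetical helper.

```python
# Placeholder values: the changelog does not list the actual defaults.
PINNED = {"reasoning_effort": "medium", "max_tokens": 1024}


def with_pinned_defaults(request: dict, pinned: dict = PINNED) -> dict:
    """Return a copy of the request with pinned values filling missing keys.

    Values already present in the request take precedence.
    """
    return {**pinned, **request}


body = with_pinned_defaults({"model": "sarvam-30b", "max_tokens": 256})
print(body["max_tokens"])        # 256
print(body["reasoning_effort"])  # medium
```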
Deprecated Model Variants: Marked the sarvam-105b-32k and sarvam-30b-16k variants as deprecated ahead of their removal, signalling customers to move to the base sarvam-30b / sarvam-105b models.
Speech to Text (STT)
VAD Parameters: Added Voice Activity Detection parameters to STT WebSocket streaming, giving you control over how speech segments are detected and finalized in real-time transcription.
Integrations
n8n: Published the Sarvam AI n8n integration, letting you call Sarvam APIs from n8n workflows as drop-in automation nodes.
March 2026
Chat Completion
Removed Gemma Models: Removed the Gemma model IDs from the SarvamModelIds enum. These are no longer accepted values for the model parameter on the Chat Completions API.
Removed Wiki Grounding: Wiki grounding is no longer supported for sarvam-30b and sarvam-105b. Requests that previously enabled grounding on these models should drop the parameter.
SDK Chat Support: Added Chat Completions (LLM) support to the Python and TypeScript SDKs, so you can call the chat API through a typed client instead of raw HTTP requests.
Speech to Text (STT)
Pronunciation Dictionary v2: Added pronunciation dictionary v2, giving you finer control over how specific words and names are transcribed.
Document Digitization
Page Limit & Pricing: Introduced a 10-page-per-request limit and updated pricing to ₹1.5/page. Split larger documents into 10-page batches to stay within the limit.
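The batching step can be sketched as a page-range calculation. This only computes which pages go in each request; how the document is physically split (for example with a PDF library) is left out, and `page_batches` is a hypothetical helper.

```python
def page_batches(total_pages: int, batch_size: int = 10) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges per request."""
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]


print(page_batches(23))  # [(1, 10), (11, 20), (21, 23)]
```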
February 2026
Speech to Text (STT)
Saaras v3: Migrated STT to saaras:v3 and introduced a new mode parameter to control transcription behavior. v3 improves accuracy and expands language coverage; ASR v3 is also now available through the SDKs.
Text to Speech (TTS)
Bulbul v3: Promoted the Bulbul TTS model from bulbul:v3-beta to the stable bulbul:v3. Update your model value from the beta tag to bulbul:v3, which is now production-ready.
Document Digitization
SDK Support: Added Document Digitization support to the Python and JavaScript SDKs, so document parsing can be called through the official clients.
January 2026
Speech to Text (STT)
Removed input_audio_codec: Removed the input_audio_codec parameter — the API now auto-detects the codec for supported formats. You can stop sending this parameter; it will be ignored.
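Although the parameter is ignored, removing it keeps request code tidy. A minimal sketch of filtering it out of a parameter dictionary follows; `strip_retired_params` is a hypothetical helper, and the other keys in the example are illustrative rather than a complete or verified request shape.

```python
RETIRED_STT_PARAMS = {"input_audio_codec"}


def strip_retired_params(params: dict) -> dict:
    """Return a copy of the request params without retired STT parameters."""
    return {k: v for k, v in params.items() if k not in RETIRED_STT_PARAMS}


# Example keys are illustrative only.
old_params = {"model": "saaras:v3", "input_audio_codec": "pcm_s16le"}
print(strip_retired_params(old_params))  # {'model': 'saaras:v3'}
```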
TTS & ASR
v3 Support: Added TTS v3 and ASR v3 support across the SDKs, bringing the latest Bulbul and Saaras model generations to the official clients.
December 2025
Speech to Text (STT)
Added input_audio_codec: Introduced the input_audio_codec parameter, letting you explicitly declare the audio codec when automatic detection isn’t sufficient (for example, raw PCM streams).
Translate & Transliterate
Deprecated enable_preprocessing: Deprecated the enable_preprocessing parameter on the Translate and Transliterate APIs. Preprocessing behavior is now handled internally, so the parameter no longer needs to be set.
SDKs
TypeScript SDK: Released an updated TypeScript SDK version with improved generator settings and type coverage.
November 2025
Speech to Text (STT)
Sample Rate Configuration: Added sample-rate configuration parameters to the STT API, so you can match the API to your source audio (e.g. 8 kHz telephony or 16 kHz) for more accurate transcription.
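Before setting a sample rate on a request, it helps to read the real rate from the source file rather than guessing. The sketch below does this for WAV input using Python's standard `wave` module. It does not call the STT API, because the entry does not name the configuration parameters; `wav_sample_rate` and `make_silent_wav` are hypothetical helpers.

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Read the sample rate in Hz from a WAV path string or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def make_silent_wav(rate_hz: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a short mono 16-bit silent WAV in memory, for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * seconds))
    buf.seek(0)
    return buf


print(wav_sample_rate(make_silent_wav(8000)))   # 8000
print(wav_sample_rate(make_silent_wav(16000)))  # 16000
```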
October 2025
Dashboard
API Key Usage Tracking: You can now track the usage of each API key directly from your dashboard, making it easier to monitor and manage your API consumption.
September 2025
Speech to Text (STT)
Flush Signal in WebSocket: The STT and STT Translate WebSockets now support a flush signal that finalizes transcriptions cleanly between segments, giving you better control over transcription boundaries.
8 kHz Sample Rate Support: Streaming STT now supports 8 kHz sample rate, making it easier to work with telephony and low-bandwidth audio applications.
Text to Speech (TTS)
End Signal in WebSocket: Sarvam TTS WebSocket now supports an end signal for smoother control of audio streams, allowing better integration with real-time applications.
Dashboard
GST Invoices: GST invoices are now available in the dashboard for streamlined billing and compliance.
Pricing
Flexible Pricing Plans: Released flexible pricing tiers to match different use cases, from prototyping to production. View details at dashboard.sarvam.ai/pricing.
August 2025
Speech to Text (STT)
PCM Audio Format Support: Added support for PCM formats including pcm_s16le, pcm_l16, and pcm_raw for both STT and STT Translate APIs.
- For most audio formats, our API automatically detects the codec
- When using PCM formats, you must explicitly specify the input_audio_codec parameter
- PCM files are only supported at 16 kHz sample rate
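The rules in this entry can be sketched as a small validation step that decides whether a codec must be declared. It reflects only the behavior described in this entry; `codec_params` is a hypothetical helper and does not call the API.

```python
from typing import Optional

PCM_CODECS = {"pcm_s16le", "pcm_l16", "pcm_raw"}
PCM_SAMPLE_RATE_HZ = 16_000


def codec_params(codec: Optional[str], sample_rate_hz: Optional[int] = None) -> dict:
    """Return the extra request params needed for the given input codec."""
    if codec not in PCM_CODECS:
        # Non-PCM formats: rely on automatic codec detection.
        return {}
    if sample_rate_hz != PCM_SAMPLE_RATE_HZ:
        raise ValueError("PCM input is only supported at 16 kHz")
    return {"input_audio_codec": codec}


print(codec_params("mp3"))               # {}
print(codec_params("pcm_s16le", 16000))  # {'input_audio_codec': 'pcm_s16le'}
```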
July 2025
SDKs
Python SDK Long Audio Support: Our Python SDK now supports processing long audio files up to 1 hour in duration through both synchronous and asynchronous methods.
Ideal for processing:
- Meetings, interviews, and call center recordings
- Large-scale content processing pipelines
Key Features:
- Support for files up to 1 hour long
- Speaker diarization and chunk-level timestamp support
Speech to Text (STT)
STT WebSocket Update: Start and End events are now returned by the STT and STT Translate (STTT) WebSockets, enabling better control and event handling in real-time transcription.
June 2025
Translation
Sarvam-Translate Launched: Released sarvam-translate:v1, an open-weights model supporting 22 Indian languages.
Text Quickstart Updated: Added sarvam-translate usage examples to the Python SDK Quickstart and Playground.
Speech to Text (STT)
Real-Time STT via WebSocket: Added WebSocket support for live transcription with ultra-low latency in both Python and JavaScript SDKs.
Audio Format Support Expanded: STT now accepts mp3, wav, aac, aiff, ogg/opus, flac, mp4/m4a, and amr input formats.
Batch STT (Alpha): Introduced alpha support for batch transcription in Python SDK. Install via pip install sarvamai==0.1.11a2.
Saaras & Saarika WebSocket Support: Both saaras and saarika models now support real-time streaming via WebSocket.
Text to Speech (TTS)
Real-Time TTS Streaming: Generate speech on the fly using WebSocket streaming. Available in Python and JavaScript SDKs.
Audio Format Support Expanded: TTS now outputs in mp3, linear16, mulaw, alaw, opus, flac, aac, and wav.
SDKs
Python SDK v4.23.2: Includes real-time streaming support, batch transcription (alpha), new translation APIs, and model updates.
JavaScript SDK Updated: Added real-time STT & TTS WebSocket support and updated documentation with streaming quickstarts.
Models
New Model Added: Introduced sarvam-m to the model family.
Model Updates: saarika and saaras updated to v2.5 with character limit improvements and stability enhancements.
Documentation
Cookbooks Expanded: Added examples for sarvam-translate, updated the LID and chat completion cookbooks to use the SDK package, and refreshed the starter notebooks.
AI-Powered Docs Assistant: Added an AI assistant to the documentation search bar for instant Q&A and developer support.
Dashboard
Usage Analytics Dashboard: Released real-time API usage and credit tracking at dashboard.sarvam.ai/usage.
May 2025
Models
Sarvam-M Released
Introduced sarvam-m, a 24B open-weights hybrid model based on Mistral Small.
- Available now via API Playground and detailed in the technical blog.
- Benchmarks added to the official documentation.
Saarika v2.5 Released
Improved voice quality and naturalness in saarika:v2.5, now available for use in the STT API.
TTS WebSocket (Beta)
Real-time streaming TTS now supported via WebSocket for beta users. Contact support to request access.
API & Dashboard
Upgraded Developer Dashboard
- Unified dashboard with no-code playground and API key management in one place.
- Easily test endpoints like LLM, TTS, STT, and Translate without writing code.
- Prebuilt examples include Resume Translate and Hinglish Code Debug.
API Playground Enhancements
- Added full support for live testing of Sarvam-M, Sarvam-Translate, and TTS/STT APIs.
- Playground features instant feedback and parameter tuning without leaving the dashboard.
SDKs
Official SDKs Released
- Python: pip install sarvamai
- JavaScript: npm install sarvamai
- Abstracts away HTTP and response parsing with clean, unified methods across APIs.
- Reduces integration time from hours to minutes.
Documentation
Revamped Developer Documentation
- New cookbooks added with real-world SDK use cases including chat completion, translation, and speech.
- Improved API navigation and content structure for faster discovery.
- Code snippets updated for clarity and consistency with latest SDKs.
Navigation Updates
- Exposed core API endpoints directly in sidebar navigation.
- Streamlined structure to help developers reach reference and guides faster.
April 2025
Text to Speech (TTS)
Bulbul v2 Released
Introduced bulbul:v2, the latest version of our Indian Text-to-Speech model.
- More natural and expressive voice output with better emotional tone
- Enhanced preprocessing for improved handling of mixed-language inputs
Bulbul v1 Deprecation Notice
bulbul:v1 will be officially deprecated on April 30, 2025. Users should migrate to bulbul:v2 to ensure uninterrupted service.
24kHz Audio Support
Speech generation now supports 24kHz sample rate for higher quality output.
Speech to Text (STT)
Batch ASR API Released
New Batch ASR API allows uploading up to 20 audio files (up to 60 minutes each).
- Ideal for calls, meetings, and long-form media
- Available in both speech-to-text and speech-to-text-translate endpoints
Real-Time API Improvements
Improved transcription speed for real-time requests.
- 3× faster processing for up to 30s audio snippets
- Optimized for use cases like voice bots and instant assistants
Streaming WebSocket (Beta)
WebSocket-based real-time transcription now in beta. Early access available via the Sarvam Discord community.
Text
Language Detection Enhancements
Added source_language_code to API responses.
- Automatically detects input language
- Improves performance on multilingual and code-switched text
Indic Translation Upgrades
Improved support for two-way translation between Indic and English.
- Supports colloquial, modern, classical, and formal registers
- Ideal for education, localization, and content creation platforms
New APIs Introduced
- Transliterate API: Converts text between writing systems while preserving pronunciation
- Language Identification (LID) API: Detects both language and script from input
Documentation
Cookbook Updated
- Added new real-world examples for Batch ASR and Bulbul v2 usage
- Included advanced translation flows (e.g., tone-aware translation)
- Refreshed LID and transliteration notebooks
- Integrated with latest SDK structure
Cleanup and Refactoring
Removed deprecated analytics and parse APIs from SDK for better maintainability.
March 2025
Text
Translation & Transliteration Schema Update
Translation and Transliteration APIs will now automatically detect the input language.
- Responses now include source_language_code
- Improves handling of multilingual and code-switched inputs
- Enables simplified workflows with less manual language tagging
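Client code can read the detected language from the response instead of tagging inputs manually. A minimal sketch follows. Only the `source_language_code` field comes from this entry; the rest of the sample response, including the `translated_text` key and the `hi-IN` value, is illustrative, and `detected_language` is a hypothetical helper.

```python
def detected_language(response: dict, default: str = "unknown") -> str:
    """Return the detected source language code, or a default if absent."""
    return response.get("source_language_code") or default


# Illustrative response body; only source_language_code is from the changelog.
sample_response = {"translated_text": "Hello", "source_language_code": "hi-IN"}

print(detected_language(sample_response))  # hi-IN
print(detected_language({}))               # unknown
```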
SarvamParse
SarvamParse “Small” Mode Introduced
A lightweight variant of SarvamParse is now available.
- Lower cost
- Faster response time
- Ideal for real-time or cost-sensitive applications
Language Identification
LID API Released
The new Language Identification (LID) endpoint detects both language and script from raw input text.
- Supports multiple Indian and international languages
- Detects script types like Latin, Devanagari, Kannada, and more
- Full API Reference: LID API Docs
- Cookbook: Language Identification Notebook
February 2025
APIs
Sarvam Parse API Released
A new API that transforms PDF documents into structured data.
- Accepts PDF input and returns base64-encoded XML
- Useful for information extraction, content indexing, and document analysis
- API Reference: Sarvam Parse Docs
- Cookbook: Parse PDF Notebook
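Consuming base64-encoded XML takes two steps: decode the base64 string, then parse the XML. The sketch below uses only the Python standard library. The sample XML is invented for the demonstration and does not reflect the API's real schema, and `decode_parse_output` is a hypothetical helper that takes the encoded string directly, since the response field name is not given here.

```python
import base64
import xml.etree.ElementTree as ET


def decode_parse_output(encoded_xml: str) -> ET.Element:
    """Decode a base64 string and parse it into an XML element tree root."""
    xml_bytes = base64.b64decode(encoded_xml)
    return ET.fromstring(xml_bytes)


# Invented sample document, standing in for a real API response payload.
sample_xml = "<document><page number='1'>Hello</page></document>"
encoded = base64.b64encode(sample_xml.encode("utf-8")).decode("ascii")

root = decode_parse_output(encoded)
print(root.tag)                # document
print(root.find("page").text)  # Hello
```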
Doc Translate API Released
Translate full PDF documents and receive structured translated output.
- Returns translated content as base64-encoded XML
- Ideal for cross-lingual access to documents in enterprise, government, or education
- API Reference: Doc Translate Docs
- Cookbook: Doc Translate Notebook
Speech to Text (STT)
Max Duration Limit Update
To improve responsiveness and lower latency for all users, the maximum duration per STT request has been updated.
- New limit: 30 seconds per request (previously 8 minutes)
- Applies to both saaras and saarika models
- For longer audio requirements, contact the team to explore tailored solutions
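To stay within a 30-second limit, longer audio can be cut into segments on the client side. The sketch below only computes segment boundaries in seconds; the actual slicing needs an audio library. Cutting at fixed times may split words, so treat this as a starting point. `segment_bounds` is a hypothetical helper.

```python
def segment_bounds(
    duration_s: float, max_segment_s: float = 30.0
) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds, each at most max_segment_s long."""
    if duration_s < 0 or max_segment_s <= 0:
        raise ValueError("duration_s must be >= 0 and max_segment_s > 0")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_segment_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


print(segment_bounds(75.0))  # [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```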
January 2025
Developer Experience
Meta-Prompt Introduced
Launched a universal meta-prompt to help guide any AI chat model in using Sarvam’s APIs effectively.
- Offers structured API context for accurate prompt engineering
- Compatible with models like Gemini, GPT, Claude, etc.
- API Reference: Meta-Prompt Docs
- Example: Meta-Prompt in Action
Sarvam AI Cookbook Launched
Released the official Sarvam AI Cookbook, an open-source repository with practical code examples and notebooks.
- Covers use cases for STT, TTS, Translation, Parse, and more
- Includes best practices, integration tips, and tutorials
- Repository: Sarvam AI Cookbook on GitHub