Sarvam-M Deprecated: sarvam-m has been deprecated and is no longer accepted by the Chat Completions API — requests that pass model="sarvam-m" will fail. The supported chat models are now sarvam-30b (64K context) and sarvam-105b (128K context). Existing integrations should migrate to Sarvam-30B or Sarvam-105B for better quality and longer context.
Removed Model Variants: Retired the fixed-context sarvam-30b-16k and sarvam-105b-32k variants. The base sarvam-30b and sarvam-105b models now serve their full context windows (64K and 128K tokens), so you no longer need to pick a context-specific variant.
Document Digitization: Document Intelligence has been renamed to Document Digitization and repriced to ₹0.5/page, down from ₹1.5/page, making bulk document processing significantly cheaper.
Free Credits: New accounts now receive ₹100 in free credits (previously ₹1,000) — worth noting if you provision trial accounts programmatically.
Updated Parameter Defaults: Adjusted the default values for reasoning_effort and max_tokens in the Chat Completions API. If you rely on the defaults rather than setting these explicitly, response length and latency may change — set them explicitly to lock in previous behavior.
Deprecated Model Variants: Marked the sarvam-105b-32k and sarvam-30b-16k variants as deprecated ahead of their removal, signalling customers to move to the base sarvam-30b / sarvam-105b models.
VAD Parameters: Added Voice Activity Detection parameters to STT WebSocket streaming, giving you control over how speech segments are detected and finalized in real-time transcription.
n8n: Published the Sarvam AI n8n integration, letting you call Sarvam APIs from n8n workflows as drop-in automation nodes.
Removed Gemma Models: Removed the Gemma model IDs from the SarvamModelIds enum. These are no longer accepted values for the model parameter on the Chat Completions API.
Removed Wiki Grounding: Wiki grounding is no longer supported for sarvam-30b and sarvam-105b. Requests that previously enabled grounding on these models should drop the parameter.
SDK Chat Support: Added Chat Completions (LLM) support to the Python and TypeScript SDKs, so you can call the chat API through a typed client instead of raw HTTP requests.
Pronunciation Dictionary v2: Added pronunciation dictionary v2, giving you finer control over how specific words and names are transcribed.
Page Limit & Pricing: Introduced a 10-page-per-request limit and updated pricing to ₹1.5/page. Split larger documents into 10-page batches to stay within the limit.
Saaras v3: Migrated STT to saaras:v3 and introduced a new mode parameter to control transcription behavior. v3 improves accuracy and expands language coverage; ASR v3 is also now available through the SDKs.
Bulbul v3: Promoted the Bulbul TTS model from bulbul:v3-beta to the stable bulbul:v3. Update your model value from the beta tag to bulbul:v3, which is now production-ready.
SDK Support: Added Document Digitization support to the Python and JavaScript SDKs, so document parsing can be called through the official clients.
Removed input_audio_codec: Removed the input_audio_codec parameter — the API now auto-detects the codec for supported formats. You can stop sending this parameter; it will be ignored.
v3 Support: Added TTS v3 and ASR v3 support across the SDKs, bringing the latest Bulbul and Saaras model generations to the official clients.
Added input_audio_codec: Introduced the input_audio_codec parameter, letting you explicitly declare the audio codec when automatic detection isn’t sufficient (for example, raw PCM streams).
Deprecated enable_preprocessing: Deprecated the enable_preprocessing parameter on the Translate and Transliterate APIs. Preprocessing behavior is now handled internally, so the parameter no longer needs to be set.
TypeScript SDK: Released an updated TypeScript SDK version with improved generator settings and type coverage.
Sample Rate Configuration: Added sample-rate configuration parameters to the STT API, so you can match the API to your source audio (e.g. 8 kHz telephony or 16 kHz) for more accurate transcription.
API Key Usage Tracking: You can now track the usage of each API key directly from your dashboard, making it easier to monitor and manage your API consumption.
Flush Signal in WebSocket: STT and STT Translate WebSocket now support flush signal to finalize transcriptions cleanly between segments, enabling better control over transcription boundaries.
8 kHz Sample Rate Support: Streaming STT now supports 8 kHz sample rate, making it easier to work with telephony and low-bandwidth audio applications.
End Signal in WebSocket: Sarvam TTS WebSocket now supports an end signal for smoother control of audio streams, allowing better integration with real-time applications.
GST Invoices: GST invoices are now available in the dashboard for streamlined billing and compliance.
Flexible Pricing Plans: Released flexible pricing tiers to match different use cases, from prototyping to production. View details at dashboard.sarvam.ai/pricing.
PCM Audio Format Support: Added support for PCM formats including pcm_s16le, pcm_l16, and pcm_raw for both STT and STT Translate APIs.
input_audio_codec parameterPython SDK Long Audio Support: Our Python SDK now supports processing long audio files up to 1 hour in duration through both synchronous and asynchronous methods.
Ideal for processing:
Key Features:
STT WebSocket Update: The Start Event and End Event is now returned by STT/STTT WebSockets, enabling better control and event handling in real-time transcription.
Sarvam-Translate Launched: Released sarvam-translate:v1, an open-weights model supporting 22 Indian languages.
Text Quickstart Updated: Added sarvam-translate usage examples to the Python SDK Quickstart and Playground.
Real-Time STT via WebSocket: Added WebSocket support for live transcription with ultra-low latency in both Python and JavaScript SDKs.
Audio Format Support Expanded: STT now accepts mp3, wav, aac, aiff, ogg/opus, flac, mp4/m4a, and amr input formats.
Batch STT (Alpha): Introduced alpha support for batch transcription in Python SDK. Install via pip install sarvamai==0.1.11a2.
Saaras & Saarika WebSocket Support: Both saaras and saarika models now support real-time streaming via WebSocket.
Real-Time TTS Streaming: Generate speech on the fly using WebSocket streaming. Available in Python and JavaScript SDKs.
Audio Format Support Expanded: TTS now outputs in mp3, linear16, mulaw, alaw, opus, flac, aac, and wav.
Python SDK v4.23.2: Includes real-time streaming support, batch transcription (alpha), new translation APIs, and model updates.
JavaScript SDK Updated: Added real-time STT & TTS WebSocket support and updated documentation with streaming quickstarts.
New Model Added: Introduced sarvam-m to the model family.
Model Updates: saarika and saaras updated to v2.5 with character limit improvements and stability enhancements.
Cookbooks Expanded: Added examples for sarvam-translate, updated LID and chat completion cookbooks, updated with the SDK package and refreshed starter notebooks.
AI-Powered Docs Assistant: Added an AI assistant to the documentation search bar for instant Q&A and developer support.
Usage Analytics Dashboard: Released real-time API usage and credit tracking at dashboard.sarvam.ai/usage.
Introduced sarvam-m, a 24B open-weights hybrid model based on Mistral Small.
Improved voice quality and naturalness in saarika:v2.5, now available for use in the STT API.
Real-time streaming TTS now supported via WebSocket for beta users. Contact support to request access.
pip install sarvamainpm install sarvamaiIntroduced bulbul:v2, the latest version of our Indian Text-to-Speech model.
bulbul:v1 will be officially deprecated on April 30, 2025. Users should migrate to bulbul:v2 to ensure uninterrupted service.
Speech generation now supports 24kHz sample rate for higher quality output.
New Batch ASR API allows uploading up to 20 audio files (up to 60 minutes each).
speech-to-text and speech-to-text-translate endpointsImproved transcription speed for real-time requests.
WebSocket-based real-time transcription now in beta. Early access available via the Sarvam Discord community.
Added source_language_code to API responses.
Improved support for two-way translation between Indic and English.
Transliterate API
Converts text between writing systems while preserving pronunciation
Language Identification (LID) API
Detects both language and script from input
Removed deprecated analytics and parse APIs from SDK for better maintainability.
Translation and Transliteration APIs will now automatically detect the input language.
source_language_codeA lightweight variant of SarvamParse is now available.
The new Language Identification (LID) endpoint detects both language and script from raw input text.
A new API that transforms PDF documents into structured data.
Translate full PDF documents and receive structured translated output.
To improve responsiveness and lower latency for all users, the maximum duration per STT request has been updated.
saaras and saarika modelsLaunched a universal meta-prompt to help guide any AI chat model in using Sarvam’s APIs effectively.
Released the official Sarvam AI Cookbook, an open-source repository with practical code examples and notebooks.