Change Log | Sarvam API Docs

June 2025

Translation

Sarvam-Translate Launched: Released sarvam-translate:v1, an open-weights model supporting 22 Indian languages.

Text Quickstart Updated: Added sarvam-translate usage examples to the Python SDK Quickstart and Playground.
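As a rough illustration of calling the new model over HTTP, the sketch below only builds a request object and does not send it. The endpoint URL, the `api-subscription-key` header, and the body field names are assumptions not stated in this changelog; check them against the API reference before use.

```python
import json
import urllib.request

# Assumed endpoint; not taken from this changelog.
TRANSLATE_URL = "https://api.sarvam.ai/translate"


def build_translate_request(api_key: str, text: str, target_language_code: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for sarvam-translate:v1."""
    body = {
        "input": text,                                 # assumed field name
        "source_language_code": "auto",                # assumed value for auto-detection
        "target_language_code": target_language_code,  # assumed field name
        "model": "sarvam-translate:v1",                # model name from the entry above
    }
    return urllib.request.Request(
        TRANSLATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "api-subscription-key": api_key,           # assumed header name
        },
        method="POST",
    )


request = build_translate_request("YOUR_API_KEY", "How are you?", "hi-IN")
```

Passing the request to `urllib.request.urlopen(request)` would send it; that step is left out so the sketch runs without a key or network access.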

Speech to Text (STT)

Real-Time STT via WebSocket: Added WebSocket support for live transcription with ultra-low latency in both Python and JavaScript SDKs.

Audio Format Support Expanded: STT now accepts mp3, wav, aac, aiff, ogg/opus, flac, mp4/m4a, and amr input formats.
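A client can reject unsupported files before uploading them. The sketch below is a minimal pre-flight check; the mapping from the listed formats to file extensions is an assumption, and the API may inspect file contents rather than names.

```python
from pathlib import Path

# Formats from the entry above, mapped to common file extensions (assumed mapping).
ACCEPTED_STT_EXTENSIONS = {
    ".mp3", ".wav", ".aac", ".aiff", ".ogg", ".opus", ".flac", ".mp4", ".m4a", ".amr",
}


def is_accepted_stt_file(filename: str) -> bool:
    """Return True if the file extension matches one of the listed STT input formats."""
    return Path(filename).suffix.lower() in ACCEPTED_STT_EXTENSIONS
```

For example, `is_accepted_stt_file("call.MP3")` is true because the comparison is case-insensitive, while `is_accepted_stt_file("notes.txt")` is false.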

Batch STT (Alpha): Introduced alpha support for batch transcription in the Python SDK. Install via pip install sarvamai==0.1.11a2.

Saaras & Saarika WebSocket Support: Both saaras and saarika models now support real-time streaming via WebSocket.

Text to Speech (TTS)

Real-Time TTS Streaming: Generate speech on the fly using WebSocket streaming. Available in Python and JavaScript SDKs.

Audio Format Support Expanded: TTS now outputs in mp3, linear16, mulaw, alaw, opus, flac, aac, and wav.

SDKs

Python SDK v4.23.2: Includes real-time streaming support, batch transcription (alpha), new translation APIs, and model updates.

JavaScript SDK Updated: Added real-time STT & TTS WebSocket support and updated documentation with streaming quickstarts.

Models

New Model Added: Introduced sarvam-m to the model family.

Model Updates: saarika and saaras updated to v2.5 with character limit improvements and stability enhancements.

Documentation

Cookbooks Expanded: Added examples for sarvam-translate, updated the LID and chat completion cookbooks to use the SDK package, and refreshed the starter notebooks.

AI-Powered Docs Assistant: Added an AI assistant to the documentation search bar for instant Q&A and developer support.

Dashboard

Usage Analytics Dashboard: Released real-time API usage and credit tracking at dashboard.sarvam.ai/usage.

May 2025

Models

Sarvam-M Released

Introduced sarvam-m, a 24B open-weights hybrid model based on Mistral Small.

Available now via API Playground and detailed in the technical blog.
Benchmarks added to the official documentation.

Saarika v2.5 Released

Improved voice quality and naturalness in saarika:v2.5, now available for use in the STT API.

TTS WebSocket (Beta)

Real-time streaming TTS now supported via WebSocket for beta users. Contact support to request access.

API & Dashboard

Upgraded Developer Dashboard

Unified dashboard with no-code playground and API key management in one place.
Easily test endpoints like LLM, TTS, STT, and Translate without writing code.
Prebuilt examples include Resume Translate and Hinglish Code Debug.

API Playground Enhancements

Added full support for live testing of Sarvam-M, Sarvam-Translate, and TTS/STT APIs.
Playground features instant feedback and parameter tuning without leaving the dashboard.

SDKs

Official SDKs Released

Python: pip install sarvamai
JavaScript: npm install sarvamai
Abstracts away HTTP and response parsing with clean, unified methods across APIs.
Reduces integration time from hours to minutes.

Documentation

Revamped Developer Documentation

New cookbooks added with real-world SDK use cases including chat completion, translation, and speech.
Improved API navigation and content structure for faster discovery.
Code snippets updated for clarity and consistency with latest SDKs.

Navigation Updates

Exposed core API endpoints directly in the sidebar navigation.
Streamlined structure to help developers reach reference and guides faster.

April 2025

Text to Speech (TTS)

Bulbul v2 Released

Introduced bulbul:v2, the latest version of our Indian Text-to-Speech model.

More natural and expressive voice output with better emotional tone
Enhanced preprocessing for improved handling of mixed-language inputs

Bulbul v1 Deprecation Notice

bulbul:v1 will be officially deprecated on April 30, 2025. Users should migrate to bulbul:v2 to ensure uninterrupted service.

24kHz Audio Support

Speech generation now supports 24kHz sample rate for higher quality output.

Speech to Text (STT)

Batch ASR API Released

New Batch ASR API allows uploading up to 20 audio files (up to 60 minutes each).

Ideal for calls, meetings, and long-form media
Available in both speech-to-text and speech-to-text-translate endpoints
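The two limits above (20 files per upload, 60 minutes per file) can be enforced client-side before submitting work. This is a minimal planning sketch; `plan_batch_jobs` is a hypothetical helper, not part of the SDK, and it assumes the caller already knows each file's duration.

```python
from typing import List, Tuple

MAX_FILES_PER_JOB = 20     # limit stated in the entry above
MAX_MINUTES_PER_FILE = 60  # limit stated in the entry above


def plan_batch_jobs(files: List[Tuple[str, float]]) -> List[List[str]]:
    """Group (path, duration_in_minutes) pairs into jobs of at most 20 files.

    Raises ValueError if any file is longer than the per-file limit.
    """
    too_long = [path for path, minutes in files if minutes > MAX_MINUTES_PER_FILE]
    if too_long:
        raise ValueError(f"Files exceed {MAX_MINUTES_PER_FILE} minutes: {too_long}")

    paths = [path for path, _ in files]
    return [
        paths[start:start + MAX_FILES_PER_JOB]
        for start in range(0, len(paths), MAX_FILES_PER_JOB)
    ]
```

With 45 ten-minute recordings, this yields three jobs of 20, 20, and 5 files.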

Real-Time API Improvements

Improved transcription speed for real-time requests.

3× faster processing for audio snippets of up to 30 seconds
Optimized for use cases like voice bots and instant assistants

Streaming WebSocket (Beta)

WebSocket-based real-time transcription now in beta. Early access available via the Sarvam Discord community.

Text

Language Detection Enhancements

Added source_language_code to API responses.

Automatically detects input language
Improves performance on multilingual and code-switched text

Indic Translation Upgrades

Improved support for two-way translation between Indic and English.

Supports colloquial, modern, classical, and formal registers
Ideal for education, localization, and content creation platforms

New APIs Introduced

Transliterate API: Converts text between writing systems while preserving pronunciation
Language Identification (LID) API: Detects both language and script from input

Documentation

Cookbook Updated

Added new real-world examples for Batch ASR and Bulbul v2 usage
Included advanced translation flows (e.g., tone-aware translation)
Refreshed LID and transliteration notebooks
Integrated with latest SDK structure

Cleanup and Refactoring

Removed deprecated analytics and parse APIs from SDK for better maintainability.

March 2025

Text

Translation & Transliteration Schema Update

The Translation and Transliteration APIs now detect the input language automatically.

Responses now include source_language_code
Improves handling of multilingual and code-switched inputs
Enables simplified workflows with less manual language tagging
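A caller can read the detected language from the response instead of tagging inputs manually. In the sketch below, the sample body is a hypothetical illustration: only `source_language_code` comes from this changelog, while the `translated_text` field and the `en-IN` value are assumptions.

```python
import json
from typing import Optional

# Hypothetical response body; only "source_language_code" is named in the changelog.
sample_response = '{"translated_text": "नमस्ते", "source_language_code": "en-IN"}'


def detected_source_language(response_body: str) -> Optional[str]:
    """Return the detected source language code, or None if the field is absent."""
    data = json.loads(response_body)
    return data.get("source_language_code")


language = detected_source_language(sample_response)
```

Returning `None` for a missing field keeps the helper usable against older responses that predate the schema update.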

SarvamParse

SarvamParse “Small” Mode Introduced

A lightweight variant of SarvamParse is now available.

Lower cost
Faster response time
Ideal for real-time or cost-sensitive applications

Language Identification

LID API Released

The new Language Identification (LID) endpoint detects both language and script from raw input text.

Supports multiple Indian and international languages
Detects script types like Latin, Devanagari, Kannada, and more
Full API Reference: LID API Docs
Cookbook: Language Identification Notebook

February 2025

APIs

Sarvam Parse API Released

A new API that transforms PDF documents into structured data.

Accepts PDF input and returns base64-encoded XML
Useful for information extraction, content indexing, and document analysis
API Reference: Sarvam Parse Docs
Cookbook: Parse PDF Notebook
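Because the output arrives as base64-encoded XML, it has to be decoded before it can be queried. The sketch below uses only the standard library; the sample document and its element names are made up for illustration and do not reflect the real output schema.

```python
import base64
import xml.etree.ElementTree as ET


def decode_parse_output(encoded: str) -> ET.Element:
    """Decode a base64 string and parse the resulting XML into an element tree root."""
    xml_bytes = base64.b64decode(encoded)
    return ET.fromstring(xml_bytes)


# Stand-in payload: element names here are hypothetical, not the API's schema.
sample_xml = "<document><page number='1'>Hello</page></document>"
encoded = base64.b64encode(sample_xml.encode("utf-8")).decode("ascii")

root = decode_parse_output(encoded)
first_page_text = root.find("page").text
```

The same helper would apply to the Doc Translate output, which is described as using the same encoding.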

Doc Translate API Released

Translate full PDF documents and receive structured translated output.

Returns translated content as base64-encoded XML
Ideal for cross-lingual access to documents in enterprise, government, or education
API Reference: Doc Translate Docs
Cookbook: Doc Translate Notebook

Speech to Text (STT)

Max Duration Limit Update

To improve responsiveness and lower latency for all users, the maximum duration per STT request has been updated.

New limit: 30 seconds per request (previously 8 minutes)
Applies to both saaras and saarika models
For longer audio requirements, contact the team to explore tailored solutions
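One way to stay under the new limit is to split longer audio into consecutive segments of at most 30 seconds and send each as its own request. The sketch below only computes segment boundaries; `segment_bounds` is a hypothetical helper, and cutting at fixed offsets is an assumption that may split words mid-utterance.

```python
from typing import List, Tuple

MAX_SECONDS_PER_REQUEST = 30.0  # limit stated in the entry above


def segment_bounds(
    total_seconds: float, max_len: float = MAX_SECONDS_PER_REQUEST
) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the audio in chunks of at most max_len."""
    bounds: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_len, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds
```

A 75-second clip produces three segments: 0 to 30, 30 to 60, and 60 to 75 seconds.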

January 2025

Developer Experience

Meta-Prompt Introduced

Launched a universal meta-prompt to help guide any AI chat model in using Sarvam’s APIs effectively.

Offers structured API context for accurate prompt engineering
Compatible with models like Gemini, GPT, Claude, etc.
API Reference: Meta-Prompt Docs
Example: Meta-Prompt in Action

Sarvam AI Cookbook Launched

Released the official Sarvam AI Cookbook, an open-source repository with practical code examples and notebooks.

Covers use cases for STT, TTS, Translation, Parse, and more
Includes best practices, integration tips, and tutorials
Repository: Sarvam AI Cookbook on GitHub
