Saaras-v2

Overview

Saaras-v2 is our flagship domain-aware speech recognition model, designed for production environments requiring high accuracy and robust performance.

Key Features

Domain-Aware Translation

Advanced prompting system for domain-specific translation and hotword retention, ensuring accurate context preservation.

Superior Telephony Performance

Optimized for 8 kHz telephony audio with enhanced multi-speaker recognition capabilities.

Intelligent Entity Preservation

Preserves proper nouns and entities accurately across languages, maintaining context and meaning.

Automatic Language Detection

Built-in Language Identification (LID) detects the spoken language automatically and reports a confidence score with each result.

Speaker Diarization

Provides diarized outputs with precise timestamps for multi-speaker conversations through the batch API.
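Since the model is described above as optimized for 8 kHz telephony audio, it can be useful to confirm a recording's sample rate before uploading it. The following is a minimal sketch using only Python's standard `wave` module; the helper names `wav_sample_rate`, `is_telephony_rate`, and `make_silent_wav` are illustrative and are not part of the Sarvam SDK.

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Return the sample rate in Hz of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def is_telephony_rate(source, expected_hz: int = 8000) -> bool:
    """Check whether a WAV file uses the expected telephony sample rate."""
    return wav_sample_rate(source) == expected_hz


def make_silent_wav(rate_hz: int, n_frames: int = 80) -> io.BytesIO:
    """Build a short mono 16-bit silent WAV in memory (for demonstration only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)  # rewind so the buffer can be read back
    return buf
```

For a file on disk, `is_telephony_rate("audio.wav")` would report whether the recording is sampled at 8 kHz. This check only inspects the WAV header and says nothing about which sample rates the API accepts.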

Key Capabilities

Basic transcription with a specified language code. This mode suits single-language content with clear audio.

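from sarvamai import SarvamAI

# Create a client authenticated with your API subscription key.
client = SarvamAI(
    api_subscription_key="YOUR_API_SUBSCRIPTION_KEY"
)

# Send the audio file to the speech-to-text translate endpoint using Saaras v2.
response = client.speech_to_text.translate(
    file=open("audio.wav", "rb"),
    model="saaras:v2"
)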
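The call above can be wrapped in a small helper so that the audio file handle is always closed, even if the request fails. This is a sketch under stated assumptions: `translate_audio` is a hypothetical helper name, and it relies only on the `speech_to_text.translate(file=..., model=...)` call shown in the example above.

```python
from typing import Any


def translate_audio(client: Any, path: str, model: str = "saaras:v2") -> Any:
    """Send one audio file to the speech-to-text translate endpoint.

    `client` is expected to expose `speech_to_text.translate(file=..., model=...)`,
    as in the example above. The response is returned unchanged.
    """
    # The context manager closes the file handle even if the request raises.
    with open(path, "rb") as audio:
        return client.speech_to_text.translate(file=audio, model=model)


# Usage (requires the sarvamai package and a valid key):
# from sarvamai import SarvamAI
# client = SarvamAI(api_subscription_key="YOUR_API_SUBSCRIPTION_KEY")
# response = translate_audio(client, "audio.wav")
```

The helper returns the response object as-is; the fields it contains are described in the API Reference and are not assumed here.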

For detailed API documentation and advanced usage, visit our API Reference.