Batch Speech-to-Text Translate (STTT) API Tutorial Using the Saaras Model

Overview

This guide demonstrates how to use Sarvam AI’s Batch Speech-to-Text Translate (STTT) API for translating audio files at scale. You’ll learn both synchronous and asynchronous usage patterns, understand key parameters, and see how to upload files, poll for job completion, and download results.

1. Installation

Install the Sarvam AI Python SDK:

!pip install -U sarvamai

2. API Key Setup

Get your API key: Sign up at the Sarvam AI Dashboard to obtain your API key.
Set your API key: Replace "YOUR_API_KEY_HERE" in the code below with your actual key.

API_KEY = "YOUR_API_KEY_HERE"
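Hard-coding the key is fine for a quick test, but a common alternative is to read it from an environment variable so it never lands in a notebook or repository. The sketch below assumes a variable named SARVAM_API_KEY; that name is chosen for this example only and is not something the SDK is documented to read on its own.

```python
import os


def load_api_key(env_var: str = "SARVAM_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is missing.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With the variable exported in your shell, API_KEY = load_api_key() can replace the hard-coded assignment.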

3. STTT Parameters

Job Parameters

Sets up the job configuration for the STTT batch job.

Parameters:

model: Translation model to use (e.g., "saaras:v2.5")
with_diarization: If True, enables speaker diarization
num_speakers: Number of speakers (used with diarization)
prompt: Optional prompt to guide translation style/context
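The parameters above can be gathered in one place before the job is created. The following is a minimal sketch: build_job_params is a hypothetical helper, and the checks it performs (rejecting num_speakers without diarization, requiring at least one speaker) are choices made for this example rather than documented API behavior.

```python
from typing import Any, Dict, Optional


def build_job_params(
    model: str = "saaras:v2.5",
    with_diarization: bool = False,
    num_speakers: Optional[int] = None,
    prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect create_job keyword arguments, omitting unset optional ones."""
    if num_speakers is not None:
        # num_speakers is described as "used with diarization", so flag misuse early.
        if not with_diarization:
            raise ValueError("num_speakers is only used when with_diarization=True")
        if num_speakers < 1:
            raise ValueError("num_speakers must be at least 1")

    params: Dict[str, Any] = {"model": model, "with_diarization": with_diarization}
    if num_speakers is not None:
        params["num_speakers"] = num_speakers
    if prompt:
        params["prompt"] = prompt
    return params
```

The resulting dictionary can be unpacked into the call used later in this guide, for example client.speech_to_text_translate_job.create_job(**build_job_params(with_diarization=True, num_speakers=2, prompt="Official meeting")).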

4. Synchronous STTT Batch Example

from pathlib import Path
from sarvamai import SarvamAI

API_KEY = "YOUR_API_KEY_HERE"
audio_files = ["/path/to/your/audio1.mp3", "/path/to/your/audio2.mp3"]  # Update with your file paths
output_dir = Path("/output")
output_dir.mkdir(exist_ok=True)

def run_sttt_sync():
    client = SarvamAI(api_subscription_key=API_KEY)

    # Create and configure batch STTT job
    job = client.speech_to_text_translate_job.create_job(
        model="saaras:v2.5",
        with_diarization=True,
        num_speakers=2,
        prompt="Official meeting"
    )

    print(f"Job created: {job._job_id}")

    # Upload and process files
    job.upload_files(file_paths=audio_files, timeout=120.0)
    job.start()
    print("Translation started...")

    # Wait for completion
    job.wait_until_complete(poll_interval=5, timeout=600)

    # Check file-level results
    file_results = job.get_file_results()

    print(f"\nSuccessful: {len(file_results['successful'])}")
    for f in file_results['successful']:
        print(f"  ✓ {f['file_name']}")

    print(f"\nFailed: {len(file_results['failed'])}")
    for f in file_results['failed']:
        print(f"  ✗ {f['file_name']}: {f['error_message']}")

    # Handle all files failed
    if len(file_results['successful']) == 0:
        print("\nAll files failed.")
        return

    # Download outputs for successful files
    job.download_outputs(output_dir=str(output_dir))
    print(f"\nDownloaded {len(file_results['successful'])} file(s) to: {output_dir}")

run_sttt_sync()
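When only some files fail, it can be useful to work out which inputs to resubmit. The sketch below relies on the file_results shape used in the example (a dict with "successful" and "failed" lists whose entries carry a "file_name"). It further assumes, without confirmation from the SDK, that file_name matches the base name of the uploaded path. Both helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def failed_file_names(file_results: Dict[str, List[dict]]) -> List[str]:
    """Return the file_name of every entry in the 'failed' list."""
    return [entry["file_name"] for entry in file_results.get("failed", [])]


def paths_to_retry(file_results: Dict[str, List[dict]], audio_files: List[str]) -> List[str]:
    """Return the original paths whose base name appears among the failed files.

    Assumes file_name equals the base name of the uploaded path.
    """
    failed = set(failed_file_names(file_results))
    return [path for path in audio_files if Path(path).name in failed]
```

The returned list can be passed as file_paths to a fresh job if a retry is appropriate for the reported error.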

5. Asynchronous STTT Batch Example

import asyncio
from pathlib import Path
from sarvamai import AsyncSarvamAI

API_KEY = "YOUR_API_KEY_HERE"
audio_files = ["/path/to/your/audio1.mp3", "/path/to/your/audio2.mp3"]  # Update with your file paths
output_dir = Path("/output")
output_dir.mkdir(exist_ok=True)

async def run_sttt_async_job():
    client = AsyncSarvamAI(api_subscription_key=API_KEY)

    # Create and configure batch STTT job
    job = await client.speech_to_text_translate_job.create_job(
        model="saaras:v2.5",
        with_diarization=True,
        num_speakers=2,
        prompt="Official meeting"
    )

    print(f"Job created: {job._job_id}")

    # Upload and process files
    await job.upload_files(file_paths=audio_files, timeout=120.0)
    await job.start()
    print("Translation started...")

    # Wait for completion
    await job.wait_until_complete(poll_interval=5, timeout=600)

    # Check file-level results
    file_results = await job.get_file_results()

    print(f"\nSuccessful: {len(file_results['successful'])}")
    for f in file_results['successful']:
        print(f"  ✓ {f['file_name']}")

    print(f"\nFailed: {len(file_results['failed'])}")
    for f in file_results['failed']:
        print(f"  ✗ {f['file_name']}: {f['error_message']}")

    # Handle all files failed
    if len(file_results['successful']) == 0:
        print("\nAll files failed.")
        return

    # Download outputs for successful files
    await job.download_outputs(output_dir=str(output_dir))
    print(f"\nDownloaded {len(file_results['successful'])} file(s) to: {output_dir}")

# For Jupyter environments:
import nest_asyncio
nest_asyncio.apply()
await run_sttt_async_job()

# In a regular Python script, top-level await is not available; use instead:
# asyncio.run(run_sttt_async_job())
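One reason to prefer the asynchronous client is that several jobs can be in flight at once. The following is a minimal sketch of that pattern using only asyncio. Here run_one is a hypothetical coroutine you would write yourself (for instance, a variant of run_sttt_async_job that accepts a list of paths), and the batch size and concurrency limit are illustrative values, not documented service limits.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


async def run_batches(
    batches: Sequence[List[str]],
    run_one: Callable[[List[str]], Awaitable[Any]],
    max_concurrent: int = 2,
) -> List[Any]:
    """Run run_one over every batch, with at most max_concurrent running at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(batch: List[str]) -> Any:
        async with semaphore:
            return await run_one(batch)

    # gather preserves the order of the input batches in its result list.
    return await asyncio.gather(*(guarded(batch) for batch in batches))
```

In a script this would be driven with asyncio.run(run_batches(chunked(audio_files, 10), run_one)); in a notebook, await the coroutine directly.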

6. Tips & Best Practices

Audio Quality: Use clear audio for best results.
Diarization: Set with_diarization=True and specify num_speakers for multi-speaker audio.
Job Timeouts: During high load periods, jobs can take longer than 10 minutes to complete. Instead of relying solely on wait_until_complete, consider storing the job ID and periodically querying the status endpoint. Note that the SDK raises timeout errors after 600 seconds (10 minutes).
Polling: Adjust poll_interval and timeout based on expected job duration and file size. For longer jobs, consider increasing the timeout or implementing manual status checking.
Output: Results are saved in the specified output_dir.
API Key Security: Keep your API key confidential.
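The manual status checking suggested above can be sketched as a small polling loop with its own deadline. This is an illustration only: get_status is a hypothetical callable that wraps whichever status query you use, and the state names "Completed" and "Failed" are assumptions, so substitute the values your job actually reports.

```python
import time
from typing import Callable, Sequence


def poll_until_done(
    get_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 1800.0,
    done_states: Sequence[str] = ("Completed", "Failed"),  # assumed state names
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status until it returns a terminal state or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in done_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still in state {status!r} after {timeout} seconds")
        sleep(poll_interval)
```

Because the deadline is a parameter here, a long-running job can be given a longer limit, or the loop can be restarted later from a stored job ID.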

7. Error Handling

You may encounter these errors while using the API:

403 Forbidden (invalid_api_key_error)
- Cause: Invalid API key.
- Solution: Use a valid API key from the Sarvam AI Dashboard.
429 Too Many Requests (insufficient_quota_error)
- Cause: Exceeded API quota.
- Solution: Check your usage, upgrade if needed, or implement exponential backoff when retrying.
500 Internal Server Error (internal_server_error)
- Cause: Issue on our servers.
- Solution: Try again later. If persistent, contact support.
400 Bad Request (invalid_request_error)
- Cause: Incorrect request formatting.
- Solution: Verify your request structure and parameters.
422 Unprocessable Entity (unprocessable_entity_error)
- Cause: Unable to detect the language of the input text.
- Solution: Explicitly pass the source_language_code parameter with a supported language.
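The exponential backoff suggested for 429 responses can be sketched as a generic retry wrapper. RetryableError is a stand-in defined for this example; in practice you would pass, via retry_on, whatever exception type your client raises for rate-limit or server errors. The delay values are illustrative.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Stand-in for the exception your client raises on 429 or 500 responses."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (RetryableError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, doubling the wait after each retryable failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Errors such as 400 or 403 are not worth retrying this way, since repeating the same request will fail the same way.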

8. Additional Resources

For more details, refer to our official documentation. You can also reach us for support on our Discord server:

Documentation: docs.sarvam.ai
Community: Join the Discord Community

9. Final Notes

Keep your API key secure.
Use clear audio for best results.
Check audio quality and supported formats.
Increase timeout for large files or slow networks.
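A quick pre-flight check can catch missing files or unexpected extensions before anything is uploaded. The sketch below uses a hypothetical helper, split_valid_paths; the list of allowed extensions is deliberately left to the caller, since the supported formats should be taken from the official documentation rather than from this example.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def split_valid_paths(
    paths: Iterable[str],
    allowed_suffixes: Iterable[str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Separate paths into uploadable files and (path, reason) rejections."""
    allowed = {suffix.lower() for suffix in allowed_suffixes}
    valid: List[str] = []
    rejected: List[Tuple[str, str]] = []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            rejected.append((raw, "file not found"))
        elif path.suffix.lower() not in allowed:
            rejected.append((raw, f"unsupported extension {path.suffix!r}"))
        else:
            valid.append(raw)
    return valid, rejected
```

For example, valid, rejected = split_valid_paths(audio_files, [".mp3"]) keeps only existing .mp3 files in valid and explains each rejection.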

Keep Building! 🚀