Speech-to-Text REST API
Synchronous Processing
Process short audio files and receive the result in the same response. Best suited to quick transcriptions and testing. Audio can be at most 30 seconds long.
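The 30-second limit can be checked on the client before a file is sent. The following is a minimal sketch, assuming the audio is an uncompressed WAV file that Python's standard `wave` module can read. The helper names `wav_duration_seconds` and `fits_sync_limit` are hypothetical and not part of the API.

```python
import wave

# Maximum duration for synchronous processing, as stated above.
MAX_SYNC_SECONDS = 30.0


def wav_duration_seconds(source) -> float:
    """Return the duration in seconds of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_sync_limit(source, limit: float = MAX_SYNC_SECONDS) -> bool:
    """Return True if the WAV audio is short enough for a synchronous request."""
    return wav_duration_seconds(source) <= limit
```

Other container formats (for example MP3) would need a different way to read the duration; this sketch covers WAV only.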
Saarika: Speech to Text Transcription Model
Saarika is a speech-to-text transcription model that excels at multi-speaker content, mixed-language content, and conference recordings. It handles code-mixed speech automatically and offers enhanced multilingual support, which makes it suitable for a wide range of applications.
Automatic Language Detection: Set language_code to "unknown" to enable automatic language detection. The API will identify the spoken language and return the transcript along with the detected language code.
The input_audio_codec parameter is optional. The API detects the codec automatically, so you usually do not need to pass it. PCM files (pcm_s16le, pcm_l16, pcm_raw) are the exception: for these you must pass input_audio_codec. PCM files are supported only at a 16 kHz sample rate.
Code Examples for Speech to Text Transcription
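The parameter rules described above (automatic language detection through "unknown", and the mandatory codec for PCM input) can be sketched as a small helper that prepares the request fields. Only `language_code`, `input_audio_codec`, and the three PCM codec names come from this page. The function name `build_form_fields` and its `is_pcm` and `sample_rate_hz` arguments are hypothetical. The endpoint URL and authentication are deliberately left out, because they are not stated here; take them from the API Reference.

```python
from typing import Dict, Optional

# Codec values that this page lists as requiring an explicit input_audio_codec.
PCM_CODECS = {"pcm_s16le", "pcm_l16", "pcm_raw"}

# PCM input is supported only at this sample rate, per the note above.
PCM_SAMPLE_RATE_HZ = 16000


def build_form_fields(language_code: str = "unknown",
                      input_audio_codec: Optional[str] = None,
                      is_pcm: bool = False,
                      sample_rate_hz: Optional[int] = None) -> Dict[str, str]:
    """Build request fields, enforcing the PCM rules client-side.

    language_code="unknown" asks the API to detect the language automatically.
    """
    if is_pcm:
        if input_audio_codec not in PCM_CODECS:
            raise ValueError(
                "PCM input requires input_audio_codec to be one of "
                + ", ".join(sorted(PCM_CODECS))
            )
        if sample_rate_hz is not None and sample_rate_hz != PCM_SAMPLE_RATE_HZ:
            raise ValueError("PCM input is supported only at 16 kHz")

    fields = {"language_code": language_code}
    # For non-PCM files the codec is optional, so it is sent only when given.
    if input_audio_codec is not None:
        fields["input_audio_codec"] = input_audio_codec
    return fields
```

Calling `build_form_fields()` with no arguments yields only `language_code="unknown"`, which relies on automatic detection for both language and codec.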
Check out our detailed API Reference to explore Speech To Text Transcription and all available options.
Saaras Model: SOTA Speech to Text Translation Model
Saaras is a domain-aware translation model with enhanced telephony support and intelligent entity preservation. It is designed to handle complex language variations and domain-specific content, making it ideal for call center and telephony applications.
The input_audio_codec parameter is optional. The API detects the codec automatically, so you usually do not need to pass it. PCM files (pcm_s16le, pcm_l16, pcm_raw) are the exception: for these you must pass input_audio_codec. PCM files are supported only at a 16 kHz sample rate.
Code Examples for Speech to Text Translation
Check out our detailed API Reference to explore Speech To Text Translation and all available options.
API Response Format
Speech to Text Transcription Response
Speech to Text Translation Response
Supported source languages: hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN, gu-IN, en-IN
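A client can check a language code against this list before sending a translation request. The sketch below copies the codes from the line above; the constant and function names are hypothetical.

```python
# Source language codes listed above for speech-to-text translation.
SUPPORTED_SOURCE_LANGUAGES = frozenset({
    "hi-IN", "bn-IN", "kn-IN", "ml-IN", "mr-IN", "od-IN",
    "pa-IN", "ta-IN", "te-IN", "gu-IN", "en-IN",
})


def is_supported_source_language(code: str) -> bool:
    """Return True if the code is in the supported source language list."""
    return code in SUPPORTED_SOURCE_LANGUAGES
```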
Error Responses
All errors return a JSON object with an error field containing details about what went wrong.
Error Response Structure
Error Codes Reference
Example Error Response
Error Handling Code Example
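A minimal sketch of client-side error handling, assuming only what this page states: an error response is a JSON object with an `error` field. The inner structure of that field is not described here, so the code treats it as opaque. `SpeechToTextError`, `extract_error`, and `raise_for_error` are hypothetical names, and treating every status code of 400 or above as an error is an assumption.

```python
import json
from typing import Any, Optional


class SpeechToTextError(Exception):
    """Raised when the API returns an error response (hypothetical helper)."""

    def __init__(self, status_code: int, details: Any):
        super().__init__(f"HTTP {status_code}: {details}")
        self.status_code = status_code
        self.details = details


def extract_error(body: str) -> Optional[Any]:
    """Return the value of the top-level 'error' field, or None if absent."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # body is not JSON at all
    if isinstance(payload, dict):
        return payload.get("error")
    return None


def raise_for_error(status_code: int, body: str) -> None:
    """Raise SpeechToTextError for status codes >= 400 (assumed error range)."""
    if status_code < 400:
        return
    details = extract_error(body)
    # Fall back to the raw body when no 'error' field could be parsed.
    raise SpeechToTextError(status_code, details if details is not None else body)
```

With an HTTP client of your choice, pass the response status code and body text to `raise_for_error` before reading the transcript.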
Next Steps
Need help? Contact us on Discord for guidance.