Call Analytics
Given an audio file of a call between two parties and a list of questions, this API analyzes the content and returns the transcript, along with responses to the questions. Each response is supported by reasoning and exact phrases extracted from the transcript.
Headers
Your subscription key, used to authenticate the request.
Body
The audio file to be analyzed. Must be passed as a form input if using multipart/form-data. Supported formats are WAV (.wav) and MP3 (.mp3). The optimal sample rate is 16 kHz. Multi-channel audio will be merged to mono. File size must be less than 10 MB, and audio duration must not exceed 600 seconds (10 minutes).
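The limits above can be checked locally before uploading. The following is a minimal sketch, not part of the API: the function name `check_audio_file` is hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and duration is only measured for WAV files because the Python standard library cannot read MP3 durations.

```python
import os
import wave

# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024
MAX_SECONDS = 600
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}


def check_audio_file(path):
    """Return a list of problems found; an empty list means the local checks passed."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file size is not below 10 MB")
    if ext == ".wav":
        # Duration = number of frames / frames per second.
        with wave.open(path, "rb") as wav:
            seconds = wav.getnframes() / wav.getframerate()
        if seconds > MAX_SECONDS:
            problems.append(f"duration {seconds:.0f}s exceeds {MAX_SECONDS}s")
    return problems
```

A file that is named `.wav` but is not valid WAV data makes `wave.open` raise `wave.Error`, which this sketch deliberately leaves to the caller. The server remains the authority on what it accepts; these checks only catch obvious problems early.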
List of questions to be answered based on the call content. Each question should be a valid JSON object with the following structure: {id: string, text: string, description: string (optional), type: string, properties: object}. The 'type' field must be one of: boolean, enum, short answer, long answer, or number. For 'enum' type questions, include an 'options' list in the properties.
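As an illustration of the question structure described above, here is a hedged sketch that builds and validates question objects. The helper `make_question` and the sample questions are hypothetical, and serializing the list with `json.dumps` before placing it in the form body is an assumption about how the list is transmitted.

```python
import json

# The five question types named in the description above.
ALLOWED_TYPES = {"boolean", "enum", "short answer", "long answer", "number"}


def make_question(qid, text, qtype, description=None, options=None):
    """Build one question object, enforcing the documented constraints."""
    if qtype not in ALLOWED_TYPES:
        raise ValueError(f"unknown question type: {qtype!r}")
    properties = {}
    if qtype == "enum":
        if not options:
            raise ValueError("enum questions need a non-empty options list")
        properties["options"] = list(options)
    question = {"id": qid, "text": text, "type": qtype, "properties": properties}
    if description is not None:
        # description is optional, so it is only included when provided.
        question["description"] = description
    return question


# Illustrative questions; the wording is made up for this example.
questions = [
    make_question("q1", "Was the customer's issue resolved?", "boolean"),
    make_question(
        "q2",
        "What was the customer's overall sentiment?",
        "enum",
        options=["positive", "neutral", "negative"],
    ),
]
questions_json = json.dumps(questions)  # assumed wire format for the form field
```

Validating on the client keeps malformed questions, such as an `enum` without `options`, from reaching the API at all.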
Optional comma-separated string of keywords specific to your domain. These keywords will be preserved as-is in the transcript.
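Since the keywords are passed as a single comma-separated string, a small helper can build it from a list. This is only a sketch: `join_keywords` is a hypothetical name, and joining without spaces after the commas is an assumption, because the description does not say how surrounding whitespace is treated.

```python
def join_keywords(keywords):
    """Join domain keywords into one comma-separated string.

    Blank entries and duplicates are dropped, and a keyword containing a
    comma is rejected because it could not be told apart from two keywords.
    """
    cleaned = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue
        if "," in keyword:
            raise ValueError(f"keyword must not contain a comma: {keyword!r}")
        if keyword not in cleaned:
            cleaned.append(keyword)
    return ",".join(cleaned)
```

For example, `join_keywords([" KYC ", "EMI", "", "KYC"])` produces the string `KYC,EMI`.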
Response
Represents the response from the "call-analytics" API.
This model encapsulates the results of analyzing a call, including the transcript, answers to predefined questions, and metadata about the analysis job.
Attributes:
file_name (str | None): Unique identifier for the analyzed audio file.
transcript (str): Full transcript of the call generated by Sarvam's in-house speech-to-text model.
answers (Optional[List[QNAResponse]]): List of answers to predefined questions, derived from the call analysis. It can be null if no valid answers were generated.
duration_in_seconds (float): Duration of the analyzed call in seconds.
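A client might map the decoded JSON response onto a small typed container. The sketch below assumes the response body is a JSON object whose keys match the attribute names listed above; `CallAnalyticsResult` and `parse_result` are hypothetical names, and because the fields of `QNAResponse` are not described here, each answer is kept as an opaque value.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class CallAnalyticsResult:
    transcript: str
    duration_in_seconds: float
    file_name: Optional[str] = None
    # QNAResponse is not specified in this section, so answers stay untyped.
    answers: Optional[List[Any]] = None


def parse_result(body: dict) -> CallAnalyticsResult:
    """Convert a decoded JSON response body into a CallAnalyticsResult."""
    return CallAnalyticsResult(
        transcript=body["transcript"],
        duration_in_seconds=float(body["duration_in_seconds"]),
        file_name=body.get("file_name"),
        # answers may be null when no valid answers were generated.
        answers=body.get("answers"),
    )
```

Callers should check `result.answers is None` before iterating, since a null value is a documented outcome rather than an error.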