Call Analytics Cookbook
Overview
This cookbook demonstrates a robust, production-ready call analytics pipeline built on Sarvam's SDK. It uses Sarvam's Speech-to-Text Translate Batch API with diarization, parses speaker-wise transcripts, and applies Sarvam's LLM for deep analysis. All outputs are saved as structured files for further review.
Business Value of the Call Analytics Module
- Improve agent effectiveness
- Understand customer sentiment
- Detect operational issues early
- Spot upsell/cross-sell opportunities
- Generate real-time dashboards
Where Is It Useful?
- E-commerce / D2C: Understand refund requests, delivery concerns, or dissatisfaction with product quality.
- Contact Centers / BPOs: Automate call reviews to improve training and ensure compliance at scale.
- Healthcare & Insurance: Analyze patient queries, support delays, and sentiment in sensitive service calls.
Why Diarization and Speaker-wise Parsing?
- Diarization assigns speaker labels, linking each line of text to the speaker who said it. This enables:
- Accurate agent/customer identification
- Speaker-specific sentiment analysis
- Monitoring agent talk-time vs. listening time
- Speaker-wise parsing preserves the chronological flow, enabling deeper insights and more accurate LLM analysis.
You can find sample audio files in the GitHub cookbook.
1. Install the SDK and Dependencies
Before you begin, ensure you have the necessary Python libraries installed. Run the following command in your terminal or notebook:
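For example (sarvamai is Sarvam's official SDK; pydub is an assumption here for the audio-splitting step in section 4 and additionally requires ffmpeg on your system):

```shell
pip install -U sarvamai pydub
```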
2. Authentication
To use the API, you need an API subscription key. Follow these steps to set up your API key:
- Obtain your API key: If you don’t have an API key, sign up on the Sarvam AI Dashboard to get one.
- Replace the placeholder key: In the Full Workflow below, replace “YOUR_SARVAM_AI_API_KEY” with your actual API key.
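To keep the key out of source control, one common pattern is to read it from an environment variable and fall back to the placeholder. The variable name SARVAM_API_KEY below is a convention chosen for this sketch, not something the SDK mandates:

```python
import os

# Prefer an environment variable over hard-coding the key in the notebook.
SARVAM_API_KEY = os.environ.get("SARVAM_API_KEY", "YOUR_SARVAM_AI_API_KEY")
```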
3. Set Up Essential Modules and Output Directory
Set up your imports and create an output directory for all generated files (transcripts, analysis, summaries, etc.):
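A minimal setup cell might look like this; the outputs/ directory name matches the one used throughout this cookbook:

```python
import json
from pathlib import Path

# All generated files (transcripts, analysis, summaries) go here.
OUTPUT_DIR = Path("outputs")
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
```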
4. The Call Analytics Class
This class encapsulates the full workflow: splitting audio, batch transcription with diarization, parsing, analysis, Q&A, and summary generation.
split_audio
Splits a long audio file into smaller chunks if its duration exceeds 1 hour, since the Batch API can process up to 1 hour per file.
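The actual splitting requires an audio library (pydub is one common choice, assumed here), but the chunk-boundary arithmetic underneath is plain Python. A minimal sketch, using the 1-hour (3600 s) per-file limit stated above:

```python
def chunk_boundaries(duration_s: float, max_chunk_s: int = 3600):
    """Return (start, end) pairs, in seconds, covering the audio in chunks
    no longer than max_chunk_s (1 hour by default, per the Batch API limit)."""
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_chunk_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds

print(chunk_boundaries(5400.0))  # -> [(0.0, 3600.0), (3600.0, 5400.0)]
```

With pydub, each `(start, end)` pair would then be exported as `audio[start * 1000:end * 1000]`, since pydub slices by milliseconds.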
4.1 Class definition and initialization
4.1.1: process_audio_file
Creates a transcription job using Sarvam’s STT Batch API, waits for job completion, downloads and parses transcription output, and calls analysis on the parsed conversation.
For longer audio files, make sure to set the timeout parameter in upload_files to a sufficiently high value so the upload can complete successfully.
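A sketch of this flow is below. The client construction follows the SDK's documented pattern, and upload_files with its timeout parameter is the call mentioned above; the job-related method names (create_job, start, wait_until_complete, download_outputs) and parameters are assumptions to verify against your installed SDK version:

```python
def run_batch_transcription(audio_paths, api_key, output_dir="outputs"):
    """Create a diarized STT-Translate batch job, wait for it, and
    download the results. Method names are illustrative."""
    # Imported inside the function so this sketch loads without the SDK installed.
    from sarvamai import SarvamAI

    client = SarvamAI(api_subscription_key=api_key)
    job = client.speech_to_text_translate_job.create_job(with_diarization=True)
    # For long audio files, raise the timeout so the upload can complete.
    job.upload_files(file_paths=audio_paths, timeout=600)
    job.start()
    job.wait_until_complete()
    job.download_outputs(output_dir=output_dir)
```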
4.1.2: _parse_transcriptions
Reads downloaded JSON transcription files, extracts speaker-wise lines, writes a .txt file in a clean conversation format, and calculates total speaking time per speaker.
timing.json: Tracks the total speaking time per speaker in seconds. This helps identify the dominant speaker and supports monitoring agent talk-time vs. listening time.
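The parsing step can be illustrated on a toy payload. The field names below (diarized_transcript, entries, speaker_id, start_time_seconds, end_time_seconds) mirror what a diarized batch output typically contains, but are an assumption here; check them against an actual downloaded JSON file:

```python
from collections import defaultdict

# Illustrative stand-in for one downloaded transcription JSON file.
sample = {
    "diarized_transcript": {
        "entries": [
            {"speaker_id": "SPEAKER_00", "transcript": "Hello, how can I help?",
             "start_time_seconds": 0.0, "end_time_seconds": 2.5},
            {"speaker_id": "SPEAKER_01", "transcript": "I want a refund.",
             "start_time_seconds": 2.5, "end_time_seconds": 4.0},
        ]
    }
}

lines = []
timing = defaultdict(float)  # speaker -> total speaking time in seconds
for entry in sample["diarized_transcript"]["entries"]:
    lines.append(f'{entry["speaker_id"]}: {entry["transcript"]}')
    timing[entry["speaker_id"]] += (
        entry["end_time_seconds"] - entry["start_time_seconds"]
    )

conversation = "\n".join(lines)   # chronological, speaker-labelled transcript
print(dict(timing))               # -> {'SPEAKER_00': 2.5, 'SPEAKER_01': 1.5}
```

In the real pipeline, `conversation` is what gets written to the .txt file and `dict(timing)` to timing.json.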
4.1.3: analyze_transcription
Reads the conversation file and sends it to Sarvam LLM with a detailed analysis prompt to extract structured insights.
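A hypothetical sketch of this step. The prompt text is illustrative, and the chat-completion call shape should be verified against the SDK's chat API:

```python
ANALYSIS_PROMPT = """You are a call-quality analyst. Given the speaker-wise
transcript below, return structured insights: (1) overall customer sentiment,
(2) the main issue raised, (3) agent effectiveness, and (4) any upsell or
cross-sell opportunities.

Transcript:
{transcript}"""


def analyze_transcription(transcript: str, api_key: str) -> str:
    """Send the parsed conversation to Sarvam's LLM; call names are
    illustrative and may differ across SDK versions."""
    from sarvamai import SarvamAI

    client = SarvamAI(api_subscription_key=api_key)
    response = client.chat.completions(
        messages=[
            {"role": "user",
             "content": ANALYSIS_PROMPT.format(transcript=transcript)},
        ],
    )
    return response.choices[0].message.content
```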
4.1.4: answer_question
Answers a user-defined question based on the parsed conversation transcript and saves the answer to a file.
4.1.5: get_summary
Generates a concise summary for each call analysis and saves it to a summary file for easy review.
5. Full Workflow Example
This code block runs the full call analytics workflow: transcribes the audio, analyzes the conversation, answers a specific user question, and generates a concise summary.
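A compact sketch of that workflow. It assumes the class from section 4 is named CallAnalytics (the name is illustrative), and uses a placeholder question; the method names match those described above:

```python
import os


def main() -> None:
    # Hypothetical end-to-end run of the CallAnalytics class from section 4.
    api_key = os.environ.get("SARVAM_API_KEY", "YOUR_SARVAM_AI_API_KEY")
    analytics = CallAnalytics(api_key=api_key, output_dir="outputs")

    parsed = analytics.process_audio_file("Sample_product_refund.mp3")
    analytics.analyze_transcription(parsed)
    analytics.answer_question(parsed, "Why did the customer call?")
    analytics.get_summary(parsed)
```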
6. Sample Output
This is the sample output of the analysis you will get if you upload the file Sample_product_refund.mp3.
7. Additional Resources
For more details, refer to our official documentation and join our community for support:
- Documentation: docs.sarvam.ai
- Community: Join the Discord Community
8. Final Notes
- Keep your API key secure.
- Use clear audio for best results.
- All outputs (transcripts, analysis, summaries) are saved in the outputs/ directory for easy review.
Keep Building!