Build Your First Voice Agent using LiveKit
Overview
This guide shows how to build a real-time voice agent that listens, understands, and responds naturally, using LiveKit for real-time communication and Sarvam AI for speech processing. It is a starting point for voice assistants, customer support bots, and conversational AI applications in Indian languages.
What You’ll Build
A voice agent that can:
- Listen to users speaking (in multiple Indian languages!)
- Understand and process their requests
- Respond back in natural-sounding voices
Quick Overview
- Get API keys (LiveKit, Sarvam, OpenAI)
- Install packages: pip install "livekit-agents[sarvam,openai,silero]" python-dotenv
- Create a .env file with your API keys
- Write ~40 lines of Python code
- Run: python agent.py dev
- Test: python agent.py console
Quick Start
1. Prerequisites
- Python 3.9 or higher
- API keys from:
  - LiveKit Cloud (free account)
  - Sarvam AI (get API key from dashboard)
  - OpenAI (create new secret key)
2. Install Dependencies
pip install "livekit-agents[sarvam,openai,silero]" python-dotenv
3. Create Environment File
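Create a file named .env in your project folder and add your API keys:
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-livekit-api-key
LIVEKIT_API_SECRET=your-livekit-api-secret
SARVAM_API_KEY=your-sarvam-api-key
OPENAI_API_KEY=your-openai-api-key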
Replace the values with your actual API keys.
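Before starting the agent, it can help to confirm that every key is actually set. The sketch below is one way to do that. The variable names in REQUIRED_KEYS are the ones the LiveKit, Sarvam, and OpenAI clients conventionally read from the environment; treat them as assumptions and adjust them if your setup differs. The file name check_env.py is hypothetical.

```python
import os

# Variable names assumed for this sketch; adjust them if your setup differs.
REQUIRED_KEYS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "SARVAM_API_KEY",
    "OPENAI_API_KEY",
)


def missing_keys(env=None):
    """Return the required keys that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    # Load .env first if python-dotenv is installed; otherwise rely on the shell.
    try:
        from dotenv import load_dotenv

        load_dotenv()
    except ImportError:
        pass
    missing = missing_keys()
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Saved as check_env.py next to your .env file, running python check_env.py lists any key that still needs a value.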
4. Write Your Agent
Create agent.py:
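The following is a minimal sketch of what agent.py could look like, not a definitive implementation. It assumes the livekit-agents 1.x API (Agent, AgentSession, WorkerOptions, cli.run_app) and the plugin constructors sarvam.STT, sarvam.TTS, openai.LLM, and silero.VAD.load. The model identifiers saarika:v2.5, bulbul:v2, and gpt-4o, the VoiceAgent class name, and the instruction text are assumptions chosen for illustration; check them against the versions you have installed.

```python
from dotenv import load_dotenv
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import openai, sarvam, silero

load_dotenv()  # read the API keys from .env


class VoiceAgent(Agent):
    """Holds the system prompt; the speech pipeline is set up in entrypoint()."""

    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a friendly voice assistant. "
                "Keep answers short and conversational."
            ),
        )


async def entrypoint(ctx: JobContext) -> None:
    # Join the LiveKit room this job was dispatched for.
    await ctx.connect()

    session = AgentSession(
        vad=silero.VAD.load(),  # detects when the user starts and stops speaking
        stt=sarvam.STT(language="unknown", model="saarika:v2.5"),  # assumed model id
        llm=openai.LLM(model="gpt-4o"),  # assumed model id
        tts=sarvam.TTS(
            target_language_code="en-IN",
            model="bulbul:v2",  # assumed model id
            speaker="anushka",
        ),
    )

    await session.start(agent=VoiceAgent(), room=ctx.room)
    # Have the agent speak first instead of waiting for the user.
    await session.generate_reply(instructions="Greet the user and offer to help.")


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

The entrypoint is kept at module level because the worker runs jobs in separate processes and needs to import it by name.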
5. Run Your Agent
python agent.py dev
6. Test Your Agent
In a new terminal, run:
python agent.py console
That’s it! You’ve built your first voice agent!
Customization Examples
Example 1: Hindi Voice Agent
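For a Hindi agent, the speech-to-text language and the text-to-speech target language are both set to hi-IN. The sketch below keeps the settings as plain dictionaries. The keyword names (language, model, target_language_code, speaker) and the model identifiers are assumptions about the Sarvam plugin for LiveKit, and the instruction text is illustrative.

```python
# Assumed keyword arguments for sarvam.STT and sarvam.TTS.
HINDI_STT = {"language": "hi-IN", "model": "saarika:v2.5"}
HINDI_TTS = {
    "target_language_code": "hi-IN",
    "model": "bulbul:v2",
    "speaker": "anushka",
}
HINDI_INSTRUCTIONS = (
    "You are a helpful assistant. Always reply in Hindi, in short sentences."
)

# In agent.py these would be applied roughly as:
#   stt=sarvam.STT(**HINDI_STT)
#   tts=sarvam.TTS(**HINDI_TTS)
#   Agent(instructions=HINDI_INSTRUCTIONS)
```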
Example 2: Tamil Voice Agent
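A Tamil agent follows the same pattern with ta-IN as the language code. The helper below is a hypothetical convenience (voice_settings is not part of any library) that builds both sets of keyword arguments from one language code. The keyword names and model identifiers are assumptions about the Sarvam plugin.

```python
def voice_settings(language_code, speaker="anushka"):
    """Build assumed STT and TTS keyword arguments for one language."""
    stt = {"language": language_code, "model": "saarika:v2.5"}
    tts = {
        "target_language_code": language_code,
        "model": "bulbul:v2",
        "speaker": speaker,
    }
    return stt, tts


TAMIL_STT, TAMIL_TTS = voice_settings("ta-IN", speaker="vidya")

# In agent.py these would be applied roughly as:
#   stt=sarvam.STT(**TAMIL_STT)
#   tts=sarvam.TTS(**TAMIL_TTS)
```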
Example 3: Multilingual Agent (Auto-detect)
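For auto-detection, only the speech-to-text side uses language="unknown". The sketch below assumes that text-to-speech still needs one concrete target language, so en-IN is used as a placeholder. The keyword names and model identifiers are assumptions about the Sarvam plugin.

```python
# STT detects the spoken language on its own.
MULTILINGUAL_STT = {"language": "unknown", "model": "saarika:v2.5"}

# TTS is assumed to require a concrete output language; pick the one
# most of your users expect to hear.
MULTILINGUAL_TTS = {
    "target_language_code": "en-IN",
    "model": "bulbul:v2",
    "speaker": "anushka",
}

# In agent.py these would be applied roughly as:
#   stt=sarvam.STT(**MULTILINGUAL_STT)
#   tts=sarvam.TTS(**MULTILINGUAL_TTS)
```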
Example 4: Speech-to-English Agent (Saaras)
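To get English text from Indian-language speech, the speech-to-text model is switched from Saarika to Saaras. In the sketch below, the identifier saaras:v2.5 and the keyword names are assumptions; confirm the exact model name in your installed plugin version.

```python
# Saaras translates speech into English text (assumed model id).
SAARAS_STT = {"language": "unknown", "model": "saaras:v2.5"}

# The LLM sees English, so the reply is spoken in English as well.
ENGLISH_TTS = {
    "target_language_code": "en-IN",
    "model": "bulbul:v2",
    "speaker": "anushka",
}

# In agent.py these would be applied roughly as:
#   stt=sarvam.STT(**SAARAS_STT)
#   tts=sarvam.TTS(**ENGLISH_TTS)
```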
Difference: Saarika transcribes speech to text in the same language, while Saaras translates speech directly to English text. Use Saaras when users speak Indian languages but you want to process and respond in English.
Note: Saaras automatically detects the source language (Hindi, Tamil, etc.) and translates spoken content directly to English text, making Indian language speech comprehensible to English-based LLMs.
Available Options
Language Codes
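Language codes follow the pattern of a two-letter language plus -IN. Only en-IN and hi-IN appear by name in this guide; the remaining entries in the sketch below are assumptions about the codes Sarvam accepts and should be checked against the Sarvam API reference. The resolve_language helper is hypothetical.

```python
# Code -> language name. Entries other than en-IN and hi-IN are assumptions.
LANGUAGE_CODES = {
    "en-IN": "English",
    "hi-IN": "Hindi",
    "bn-IN": "Bengali",
    "gu-IN": "Gujarati",
    "kn-IN": "Kannada",
    "ml-IN": "Malayalam",
    "mr-IN": "Marathi",
    "od-IN": "Odia",
    "pa-IN": "Punjabi",
    "ta-IN": "Tamil",
    "te-IN": "Telugu",
}

AUTO_DETECT = "unknown"


def resolve_language(code):
    """Return `code` if it is listed or is the auto-detect value, else raise."""
    if code == AUTO_DETECT or code in LANGUAGE_CODES:
        return code
    raise ValueError(f"Unsupported language code: {code!r}")
```

Validating the code before building the agent turns a typo such as hi-In into an immediate error instead of a failed request later.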
Speaker Voices (Bulbul v2)
Female Voices:
- anushka: Clear and professional (default)
- manisha: Warm and friendly
- vidya: Articulate and precise
- arya: Young and energetic
Male Voices:
- abhilash: Deep and authoritative
- karun: Natural and conversational
- hitesh: Professional and engaging
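The voice list above can be kept as a small lookup table so that a misspelled speaker name fails early. This is a sketch only: the names and descriptions come from the list above, while pick_speaker is a hypothetical helper.

```python
# Bulbul v2 speakers as listed in this guide.
VOICES = {
    "female": {
        "anushka": "Clear and professional",
        "manisha": "Warm and friendly",
        "vidya": "Articulate and precise",
        "arya": "Young and energetic",
    },
    "male": {
        "abhilash": "Deep and authoritative",
        "karun": "Natural and conversational",
        "hitesh": "Professional and engaging",
    },
}

DEFAULT_SPEAKER = "anushka"


def pick_speaker(name=None):
    """Return a valid speaker name, falling back to the default when none is given."""
    if name is None:
        return DEFAULT_SPEAKER
    normalized = name.strip().lower()
    for voices in VOICES.values():
        if normalized in voices:
            return normalized
    raise ValueError(f"Unknown speaker: {name!r}")
```

The result can then be passed as the speaker value when configuring text-to-speech.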
Pro Tips
- Use language="unknown" to automatically detect the language. Great for multilingual scenarios!
- Sarvam’s models understand code-mixing: your agent can naturally handle Hinglish, Tanglish, and other mixed languages.
Troubleshooting
API key errors: Check that all keys are in your .env file and that the file is in the same directory as your script.
Module not found: Run pip install "livekit-agents[sarvam,openai,silero]" python-dotenv again.
Poor transcription: Try language="unknown" for auto-detection, or specify the correct language code (en-IN, hi-IN, etc.).
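For "Module not found" errors, a quick check of which packages Python can actually see often narrows the problem down, for example when pip installed into a different virtual environment. The sketch below uses only the standard library. The module names in REQUIRED_MODULES are the import names assumed for the packages installed in step 2, and missing_modules is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the packages installed in step 2.
REQUIRED_MODULES = (
    "dotenv",
    "livekit.agents",
    "livekit.plugins.sarvam",
    "livekit.plugins.openai",
    "livekit.plugins.silero",
)


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "livekit") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it with the same Python interpreter you use for agent.py so that both see the same environment.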
Need Help?
- Sarvam Support: support@sarvam.ai
- Community: Join the Discord Community
Happy Building!