Tutor Agent using Pipecat

Overview

This guide shows how to build a voice-based tutor agent that teaches, explains concepts, and helps students across subjects, using Pipecat for real-time communication and Sarvam AI for speech processing. It is aimed at EdTech platforms, online tutoring services, and educational applications serving Indian students.

What You’ll Build

A tutor agent that can:

  • Explain concepts in simple, student-friendly language
  • Help students solve problems step by step
  • Answer questions across various subjects
  • Adapt explanations to the student’s level of understanding
  • Communicate in multiple Indian languages

Quick Overview

  1. Get API keys (Sarvam, OpenAI)
  2. Install packages
  3. Create .env file with your API keys
  4. Write the agent code
  5. Run with appropriate transport

Quick Start

1. Prerequisites

  • Python 3.9 or higher
  • API keys for Sarvam AI and OpenAI

2. Install Dependencies

$ pip install "pipecat-ai[daily,openai]" python-dotenv loguru

3. Create Environment File

Create a file named .env in your project folder and add your API keys:

SARVAM_API_KEY=sk_xxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxx

Replace the values with your actual API keys.
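
Before starting the agent, it can help to confirm that both keys are visible to the process. The snippet below is a minimal sketch and not part of Pipecat or Sarvam's SDK: the helper name `missing_keys` is hypothetical, and it only checks that each variable is set to a non-empty value, not that the key is valid.

```python
import os

# Names taken from the .env file in this step.
REQUIRED_KEYS = ("SARVAM_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


if __name__ == "__main__":
    # Call load_dotenv() first if you rely on the .env file from this step.
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys found.")
```

Passing a plain dictionary instead of the real environment makes the check easy to try out without touching your shell.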

4. Write Your Agent

Create tutor_agent.py:

import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
)
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.sarvam.stt import SarvamSTTService
from pipecat.services.sarvam.tts import SarvamTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.daily.transport import DailyParams

load_dotenv(override=True)

async def bot(runner_args: RunnerArguments):
    """Main bot entry point."""

    # Create transport (supports both Daily and WebRTC)
    transport = await create_transport(
        runner_args,
        {
            "daily": lambda: DailyParams(audio_in_enabled=True, audio_out_enabled=True),
            "webrtc": lambda: TransportParams(
                audio_in_enabled=True, audio_out_enabled=True
            ),
        },
    )

    # Initialize AI services
    stt = SarvamSTTService(
        api_key=os.getenv("SARVAM_API_KEY"),
        language="unknown",  # Auto-detect for multilingual students
        model="saaras:v3",
        mode="transcribe"
    )

    tts = SarvamTTSService(
        api_key=os.getenv("SARVAM_API_KEY"),
        target_language_code="en-IN",
        model="bulbul:v3",
        speaker="ishita"  # Clear and articulate voice for teaching
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

    # Set up conversation context with tutor personality
    messages = [
        {
            "role": "system",
            "content": """You are an expert tutor designed to help students understand and excel in their studies.

Your teaching expertise covers multiple subjects:

**Mathematics:**
- Arithmetic, Algebra, Geometry, Trigonometry
- Calculus, Statistics, Probability
- Problem-solving techniques

**Science:**
- Physics: Mechanics, Electricity, Optics, Thermodynamics
- Chemistry: Elements, Reactions, Organic Chemistry
- Biology: Cell Biology, Human Anatomy, Ecology

**Languages:**
- English Grammar and Composition
- Hindi Grammar and Literature
- Reading Comprehension

**Social Studies:**
- History, Geography, Civics
- Economics basics

Teaching approach:
- Start with the basics and build up to complex concepts
- Use real-world examples and analogies to explain abstract concepts
- Break down complex problems into smaller, manageable steps
- Encourage students and praise their efforts
- Ask questions to check understanding
- Adapt your explanations based on the student's level
- Use simple language and avoid overwhelming with jargon
- When solving numerical problems, show each step clearly

Communication style:
- Be patient, encouraging, and supportive
- Speak clearly and at a moderate pace
- Celebrate small victories and correct mistakes gently
- If a student is struggling, try a different explanation approach
- Make learning interesting by connecting it to everyday life

Start by greeting the student warmly and asking what subject or topic they'd like to learn or what problem they need help with.""",
        },
    ]
    context = LLMContext(messages)
    context_aggregator = LLMContextAggregatorPair(context)

    # Build pipeline
    pipeline = Pipeline(
        [
            transport.input(),
            stt,
            context_aggregator.user(),
            llm,
            tts,
            transport.output(),
            context_aggregator.assistant(),
        ]
    )

    task = PipelineTask(pipeline)

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info("Student connected")
        messages.append(
            {"role": "system", "content": "Greet the student warmly and ask what subject or topic they'd like to learn today."}
        )
        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Student disconnected")
        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
    await runner.run(task)

if __name__ == "__main__":
    from pipecat.runner.run import main
    main()

5. Run Your Agent

$ python tutor_agent.py

The agent will create a Daily room and provide you with a URL to join.

6. Test Your Agent

Open the provided Daily room URL in your browser and start speaking. Your tutor will listen and respond!


Customization Examples

Example 1: Hindi Tutor

For Hindi-medium students:

stt = SarvamSTTService(
    api_key=os.getenv("SARVAM_API_KEY"),
    language="hi-IN",  # Hindi
    model="saaras:v3",
    mode="transcribe"
)

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    target_language_code="hi-IN",
    model="bulbul:v3",
    speaker="simran"  # Warm and friendly teacher voice
)

llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

Example 2: Tamil Tutor

stt = SarvamSTTService(
    api_key=os.getenv("SARVAM_API_KEY"),
    language="ta-IN",
    model="saaras:v3",
    mode="transcribe"
)

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    target_language_code="ta-IN",
    model="bulbul:v3",
    speaker="ishita"
)

llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

Example 3: Multilingual Tutor (Auto-detect)

For diverse student populations:

stt = SarvamSTTService(
    api_key=os.getenv("SARVAM_API_KEY"),
    language="unknown",  # Auto-detects language
    model="saaras:v3",
    mode="transcribe"
)

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    target_language_code="en-IN",
    model="bulbul:v3",
    speaker="ishita"
)

llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

Example 4: Speech-to-English Tutor (Saaras)

When students speak in regional languages but you want English processing:

# Student speaks Hindi/Tamil/etc. → Saaras converts to English → LLM processes

stt = SarvamSTTService(
    api_key=os.getenv("SARVAM_API_KEY"),
    model="saaras:v3",  # Speech-to-English translation
    mode="translate"
)

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    target_language_code="en-IN",
    model="bulbul:v3",
    speaker="ishita"
)

llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
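
The four examples differ only in a handful of keyword arguments, so one way to avoid copy-pasting them is to build those arguments in a single place. The sketch below is an illustration only: `sarvam_settings` is a hypothetical helper, not a Pipecat or Sarvam function, and it returns plain dictionaries that mirror the arguments used in the examples of this section.

```python
def sarvam_settings(language="unknown", speaker="ishita", translate_to_english=False):
    """Build STT and TTS keyword arguments mirroring Examples 1 to 4 (hypothetical helper)."""
    stt_kwargs = {"model": "saaras:v3"}
    if translate_to_english:
        # Example 4: speech is translated to English, so no language is passed.
        stt_kwargs["mode"] = "translate"
        tts_language = "en-IN"
    else:
        stt_kwargs["mode"] = "transcribe"
        stt_kwargs["language"] = language
        # Example 3 pairs auto-detect ("unknown") with en-IN speech output.
        tts_language = "en-IN" if language == "unknown" else language
    tts_kwargs = {
        "target_language_code": tts_language,
        "model": "bulbul:v3",
        "speaker": speaker,
    }
    return stt_kwargs, tts_kwargs


# Reproduces the Hindi tutor settings from Example 1.
stt_kwargs, tts_kwargs = sarvam_settings("hi-IN", speaker="simran")
```

The dictionaries can then be unpacked into the services, for example `SarvamSTTService(api_key=os.getenv("SARVAM_API_KEY"), **stt_kwargs)`.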

Available Options

Language Codes

Language          Code
English (India)   en-IN
Hindi             hi-IN
Bengali           bn-IN
Tamil             ta-IN
Telugu            te-IN
Gujarati          gu-IN
Kannada           kn-IN
Malayalam         ml-IN
Marathi           mr-IN
Punjabi           pa-IN
Odia              od-IN
Auto-detect       unknown
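
If your application lets students pick a language by name, a small lookup keeps the codes in one place. This is a sketch under the assumption that the list in this section is the full set you want to accept; `LANGUAGE_CODES` and `language_code` are hypothetical names, not part of any SDK.

```python
# Language names and codes copied from the list in this section.
LANGUAGE_CODES = {
    "English (India)": "en-IN",
    "Hindi": "hi-IN",
    "Bengali": "bn-IN",
    "Tamil": "ta-IN",
    "Telugu": "te-IN",
    "Gujarati": "gu-IN",
    "Kannada": "kn-IN",
    "Malayalam": "ml-IN",
    "Marathi": "mr-IN",
    "Punjabi": "pa-IN",
    "Odia": "od-IN",
    "Auto-detect": "unknown",
}


def language_code(name):
    """Look up a code by language name, ignoring case and surrounding spaces."""
    wanted = name.strip().lower()
    for language, code in LANGUAGE_CODES.items():
        if language.lower() == wanted:
            return code
    raise ValueError(f"Unsupported language: {name!r}")
```

Raising an error for unknown names surfaces typos early instead of sending an invalid code to the speech service.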

Speaker Voices (Bulbul v3)

Male (23): Shubh (default), Aditya, Rahul, Rohan, Amit, Dev, Ratan, Varun, Manan, Sumit, Kabir, Aayan, Ashutosh, Advait, Anand, Tarun, Sunny, Mani, Gokul, Vijay, Mohit, Rehan, Soham

Female (16): Ritu, Priya, Neha, Pooja, Simran, Kavya, Ishita, Shreya, Roopa, Amelia, Sophia, Tanya, Shruti, Suhani, Kavitha, Rupali

TTS Additional Parameters

Customize the voice for a better teaching experience:

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    target_language_code="en-IN",
    model="bulbul:v3",
    speaker="ishita",
    pace=0.9,  # Slightly slower for better understanding
    # Supported rates: 8000, 16000, 22050, 24000 Hz (default).
    # The v3 REST API also supports 32000, 44100, 48000 Hz.
    speech_sample_rate=24000
)
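
Because only certain sample rates are accepted, a quick check before constructing the service can surface a bad value early. The sketch below uses only the rates listed in this section; `check_sample_rate` and the two constants are hypothetical names, and the `rest_api` flag is an assumption about how you might distinguish the extra rates listed for the v3 REST API.

```python
# Rates listed in this section; 24000 Hz is the stated default.
BASE_SAMPLE_RATES = (8000, 16000, 22050, 24000)
# Additional rates listed for the v3 REST API only.
REST_ONLY_SAMPLE_RATES = (32000, 44100, 48000)


def check_sample_rate(rate, rest_api=False):
    """Return the rate unchanged if it is listed as supported, else raise ValueError."""
    allowed = BASE_SAMPLE_RATES + (REST_ONLY_SAMPLE_RATES if rest_api else ())
    if rate not in allowed:
        raise ValueError(f"Unsupported sample rate {rate}; expected one of {allowed}")
    return rate
```

For example, `speech_sample_rate=check_sample_rate(24000)` passes the value through, while a rate that is not listed raises an error before any request is made.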

Understanding the Pipeline

Pipecat uses a pipeline architecture where data flows through a series of processors:

Student Audio → STT → Context Aggregator → LLM → TTS → Audio Output
  1. Transport Input: Receives audio from the student
  2. STT (Speech-to-Text): Converts audio to text using Sarvam’s Saaras v3 (transcription via mode="transcribe", or translation to English via mode="translate")
  3. Context Aggregator (User): Adds student’s question to conversation context
  4. LLM: Generates educational response using OpenAI
  5. TTS (Text-to-Speech): Converts response to audio using Sarvam’s Bulbul
  6. Transport Output: Sends audio back to the student
  7. Context Aggregator (Assistant): Saves tutor’s response to context
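
To make the flow concrete, here is a toy model of the same seven steps in plain Python. It is deliberately not Pipecat's API: every function is a stand-in (the "STT" just decodes bytes, the "LLM" echoes the question), and real Pipecat processors exchange frames asynchronously rather than calling each other in a loop. The point is only to show how each stage hands its output to the next and how the shared context grows.

```python
def fake_stt(audio):
    """Stand-in for speech-to-text: treat the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def make_user_aggregator(context):
    """Return a stage that appends the student's text to the shared context."""
    def add_user_turn(text):
        context.append({"role": "user", "content": text})
        return context
    return add_user_turn


def fake_llm(context):
    """Stand-in for the LLM: echo the most recent user message."""
    return "You asked: " + context[-1]["content"]


def fake_tts(text):
    """Stand-in for text-to-speech: encode the reply as bytes."""
    return text.encode("utf-8")


def make_assistant_aggregator(context):
    """Return a stage that records the tutor's reply and passes the audio on."""
    def add_assistant_turn(audio):
        context.append({"role": "assistant", "content": audio.decode("utf-8")})
        return audio
    return add_assistant_turn


def run_pipeline(stages, data):
    """Feed data through each stage in order, like frames through processors."""
    for stage in stages:
        data = stage(data)
    return data


context = [{"role": "system", "content": "You are a tutor."}]
stages = [
    fake_stt,
    make_user_aggregator(context),
    fake_llm,
    fake_tts,
    make_assistant_aggregator(context),
]
reply_audio = run_pipeline(stages, b"What is a prime number?")
```

After the run, `context` holds the system, user, and assistant messages in order, which is what lets the tutor refer back to earlier turns.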

Pro Tips

  • Use language="unknown" to support students who code-mix (Hinglish, Tanglish, etc.)
  • Use a clear, articulate voice like ishita for teaching
  • Set a slightly slower pace (0.9) for complex explanations
  • Use gpt-4o for better reasoning on complex problems
  • Encourage students to ask follow-up questions

Troubleshooting

API key errors: Check that all keys are in your .env file and the file is in the same directory as your script.

Module not found: Run the installation command from step 2 again, making sure you use the same Python environment that runs the script.
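
To see which package is actually missing, a short check with the standard library can help. This is a minimal sketch: `find_missing_modules` is a hypothetical helper, and the default names are the import names of the packages this guide installs (`pipecat` for pipecat-ai, `dotenv` for python-dotenv, and `loguru`).

```python
import importlib.util


def find_missing_modules(names=("pipecat", "dotenv", "loguru")):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it with the same interpreter you use for the agent; a module reported as missing there usually means the install went into a different environment.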

Poor transcription: Try language="unknown" for auto-detection, or specify the correct language code.

Connection issues: Ensure you have a stable internet connection and the transport is properly configured.


Happy Building!