Meta Prompt Guide
What is a Meta-Prompt?
A meta-prompt is a detailed instruction or template given to an AI, telling it how to think, act, or respond in specific scenarios. It sets the rules and context for the AI to follow consistently throughout the conversation.
Why Use a Meta-Prompt?
- Consistency: Ensures the AI understands your goals and behaves in a reliable way.
- Clarity: Provides clear instructions, making the AI’s responses more accurate and relevant.
- Efficiency: Saves time by reducing the need to explain the same context repeatedly.
- Customization: Adjusts the AI to fit specific tasks or workflows based on your needs.
Meta-prompts are especially helpful when working on complex tasks or integrating APIs, as they align the AI’s responses with your requirements.
Meta-Prompt Usage Guide
Follow these steps to effectively use the meta-prompt with your favorite AI assistant (e.g., ChatGPT, Gemini, or similar tools):
Step 1: Load the Meta-Prompt
Copy and paste the meta-prompt into the AI assistant’s input field. This will provide the AI with the necessary context to help you use Sarvam’s APIs effectively.
Step 2: Provide Context for Your Use Case
In the next message, let the AI know that the meta-prompt should be taken as context for assisting you in building projects with Sarvam’s APIs. You can use the following example:
“Hey, you have to take the above meta-prompt as your context and help me build things using Sarvam’s API. I will provide the details in further prompts.”
Step 3: Share Your Specific Requirement
In subsequent messages, provide the specific details of the project or task you want to build. For example, if you want to create a translator app, you can say:
“I want to build a translator app that can translate English to Kannada. Please help me implement this using Sarvam’s API.”
The AI will then walk you through an implementation: you input any English text, and the app returns the translated text in Kannada. Ensure you replace "your-api-key" in any generated code with your actual Sarvam API subscription key.
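For illustration, the assistant’s answer to the translator-app request might resemble the sketch below. It assumes Sarvam’s Python SDK (pip install sarvamai); the SarvamAI constructor argument name and the translated_text attribute on the response are assumptions, not confirmed API details.

```python
import os

def translate_to_kannada(client, text: str) -> str:
    """Translate English text to Kannada via the Translate API."""
    response = client.text.translate(
        input=text,
        source_language_code="en-IN",
        target_language_code="kn-IN",
    )
    return response.translated_text  # attribute name assumed

# Live call runs only when a real subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI  # assumed package/client names
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    print(translate_to_kannada(client, "Hello, how are you?"))
```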
By following this guide, you can seamlessly use the meta-prompt to leverage Sarvam’s APIs for building various projects.
Sarvam AI Meta Prompt
Note: This meta-prompt is designed for Large Language Models (LLMs) like ChatGPT or Gemini, not for human users. It provides context and instructions that guide the AI in assisting with tasks using Sarvam’s API effectively.
Overview of all Sarvam AI APIs (via SDK):
- Translate Text: Use client.text.translate() to translate text between 23 supported languages (en-IN and 22 Indic languages), supporting both formal and code-mixed translation styles.
- Identify Language: Use client.text.identify_language() to detect the language of input text from 11 supported languages (en-IN and 10 Indic languages).
- Transliterate: Use client.text.transliterate() to convert text from one script to another (e.g., Devanagari to Latin script) for 11 supported languages.
- Speech to Text: Use client.speech_to_text.transcribe() to convert spoken language into written text, with multiple output modes (transcribe, verbatim, translit, codemix) and support for 12 languages.
- Speech to Text Translate: Use client.speech_to_text.translate() to convert spoken language into translated text, automatically detecting the language and outputting English, for 11 supported languages.
- Text to Speech: Use client.text_to_speech.convert() to convert text into spoken words with advanced voice options for 11 supported languages (10 Indian + English), supporting multiple models (bulbul:v2, bulbul:v3).
- Chat Completion: Use client.chat.completions() to generate responses from Sarvam’s LLM (sarvam-m) for conversational or generative AI tasks, with reasoning capabilities and wiki grounding.
Sarvam AI’s API Overview:
- Language Code Options: Language support varies by API:
- Translation: 23 languages - en-IN (English) and 22 Indic languages: hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN, gu-IN, as-IN, brx-IN, doi-IN, kok-IN, ks-IN, mai-IN, mni-IN, ne-IN, sa-IN, sat-IN, sd-IN, ur-IN
- Speech-to-Text: 12 languages, including hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN, en-IN, gu-IN (plus ‘unknown’ for auto-detection)
- Text-to-Speech, Transliterate, Language Identification: 11 languages (10 Indian + English) - bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN
Translate API:
- Module: client.text.translate()
- Purpose: Translates text from a source language to a target language with additional customization options.
- Method: SDK function call
- Authorization: Uses API key stored in the SARVAM_API_KEY environment variable via the SarvamAI SDK.
Request Parameters:
- input (required): The text to be translated. Must be a valid string.
- source_language_code (required): Language code of the source text. Supported: en-IN.
- target_language_code (required): Language code of the target text. Supported: en-IN, hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN, gu-IN, as-IN, brx-IN, doi-IN, kok-IN, ks-IN, mai-IN, mni-IN, ne-IN, sa-IN, sat-IN, sd-IN, ur-IN.
- speaker_gender (optional, default: Female): Specifies the gender of the speaker for code-mixed translation models. Options: Male, Female.
- mode (optional, default: formal): Defines the translation style. Options: formal, code-mixed.
- model (optional, default: mayura:v1): Translation model to be used. Options: mayura:v1, sarvam-translate:v1.
Response: Returns a JSON object containing the translated text.

Example Code:
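A minimal sketch of a Translate call built from the parameter table above. The SarvamAI client constructor argument and the translated_text attribute on the response are assumptions about the Python SDK:

```python
import os

# Target codes from the parameter table above.
SUPPORTED_TARGETS = {
    "en-IN", "hi-IN", "bn-IN", "kn-IN", "ml-IN", "mr-IN", "od-IN", "pa-IN",
    "ta-IN", "te-IN", "gu-IN", "as-IN", "brx-IN", "doi-IN", "kok-IN", "ks-IN",
    "mai-IN", "mni-IN", "ne-IN", "sa-IN", "sat-IN", "sd-IN", "ur-IN",
}

def build_translate_kwargs(text, target, mode="formal", model="mayura:v1"):
    """Validate inputs and assemble keyword arguments for client.text.translate()."""
    if not isinstance(text, str) or not text:
        raise ValueError("input must be a non-empty string")
    if target not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target}")
    return {
        "input": text,
        "source_language_code": "en-IN",
        "target_language_code": target,
        "mode": mode,
        "model": model,
    }

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    response = client.text.translate(**build_translate_kwargs("How are you?", "hi-IN"))
    print(response.translated_text)
```

Validating language codes locally before the call follows the integration guidelines later in this document.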
Supported Languages:
- Translation API (23 languages): English (en-IN), Hindi (hi-IN), Bengali (bn-IN), Kannada (kn-IN), Malayalam (ml-IN), Marathi (mr-IN), Odia (od-IN), Punjabi (pa-IN), Tamil (ta-IN), Telugu (te-IN), Gujarati (gu-IN), Assamese (as-IN), Bodo (brx-IN), Dogri (doi-IN), Konkani (kok-IN), Kashmiri (ks-IN), Maithili (mai-IN), Manipuri (mni-IN), Nepali (ne-IN), Sanskrit (sa-IN), Santali (sat-IN), Sindhi (sd-IN), Urdu (ur-IN).
Speech to Text:
- Module: client.speech_to_text.transcribe()
- Purpose: Convert speech (audio file) into text in the specified language.
- Behavior: Converts spoken language from an audio file to written text in languages like Hindi and others.
- Method: SDK function call
- Authorization: Uses API key stored in the SARVAM_API_KEY environment variable via the SarvamAI SDK.
Request Body Schema:
- language_code: Specifies the language of the speech input (e.g., "hi-IN" for Hindi).
- model: Specifies the model version for speech-to-text conversion. Recommended: "saaras:v3" (advanced, with multiple modes).
- mode: Output format mode (only for saaras:v3). Options: transcribe (default), translate, verbatim, translit, codemix.
- with_diarization: Boolean flag for speaker identification (beta feature).
- with_timestamps: Boolean flag indicating if timestamps should be included in the output (True or False).
- file: The audio file to transcribe. Supported formats: .wav, .mp3. Works best at 16kHz; multiple channels will be merged.
- STT (Streaming): The input audio must be sent as a base64-encoded string (PCM/WAV data encoded in base64).
Example Code:
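A hedged sketch of a transcription call built from the schema above; passing an open file handle and reading the transcript attribute on the response are assumptions about the SDK:

```python
import os

def validate_audio_path(path: str) -> str:
    """Reject unsupported audio formats before calling the API (.wav/.mp3 only)."""
    if not path.lower().endswith((".wav", ".mp3")):
        raise ValueError(f"unsupported audio format: {path}")
    return path

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    with open(validate_audio_path("speech.wav"), "rb") as audio:
        response = client.speech_to_text.transcribe(
            file=audio,
            model="saaras:v3",      # advanced model with multiple output modes
            language_code="hi-IN",  # or "unknown" for auto-detection
        )
    print(response.transcript)  # attribute name assumed
```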
Supported File Formats:
- .wav (recommended at 16kHz)
- .mp3 (recommended at 16kHz)
- The API will merge multiple audio channels.

Example Response: The response will contain the converted text in JSON format, typically without timestamps unless specified.
Speech to Text Translate API:
- Module: client.speech_to_text.translate()
- Purpose: Combines speech recognition and translation: detects the spoken language and returns the transcript along with the BCP-47 code of the most predominant language.
- Best for: Detecting the language from spoken input and returning the transcript in English and the corresponding BCP-47 language code.
- Method: SDK function call
- Authorization: Uses API key stored in the SARVAM_API_KEY environment variable via the SarvamAI SDK.
Request Body Schema:
- file: The path to the speech input (audio file) in which the language needs to be detected.
- model: Specifies the model version for speech-to-text and translation (e.g., "saaras:v3" with mode="translate").

Example Code:
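A minimal sketch based on the schema above. The transcript and language_code attribute names on the response are assumptions about the SDK:

```python
import os

def summarize(transcript, language_code):
    """Format the detected BCP-47 code and English transcript for display."""
    return f"[{language_code or 'undetected'}] {transcript}"

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    with open("speech.wav", "rb") as audio:
        response = client.speech_to_text.translate(file=audio, model="saaras:v3")
    print(summarize(response.transcript, response.language_code))
```

The helper falls back to "undetected" because, as noted in the response description, the language code is null when no language is detected.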
Response:
- The API returns the BCP-47 language code of the language spoken in the input (e.g., hi-IN for Hindi, en-US for English).
- transcript: The transcribed and translated text in English.
- If multiple languages are detected, the code of the most predominant language is returned.
- If no language is detected, the response will be null.
Example Response:
- Supported Language Codes: The returned codes follow the BCP-47 standard for various Indic and English languages, such as hi-IN, en-US, pa-IN, ta-IN, etc.
Text to Speech API:
- Purpose: Convert written text into spoken words using a specified voice and various customization options.
- Best for: Generating speech from text with configurable attributes like pace, ideal for creating custom audio outputs in multiple languages.
- Method: SDK-based function call
- Authorization: API key required via the SARVAM_API_KEY environment variable.
Parameters:
- text: String to be converted to speech (max 1500 chars for bulbul:v2, max 2500 chars for bulbul:v3-beta).
- target_language_code: The language code for the output language (supports 11 languages - 10 Indian + English: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN).
- model: Model version to use. Options: "bulbul:v2" (default), "bulbul:v3-beta" (advanced).
- speaker: Voice selection (model-specific):
  - bulbul:v2: "anushka", "manisha", "vidya", "arya", "abhilash", "karun", "hitesh"
  - bulbul:v3-beta: "aditya", "ritu", "priya", "neha", "rahul", "pooja", and many more
- pitch: Number controlling pitch (-0.75 to 0.75). Only for bulbul:v2. Default: 0.
- pace: Speed control. bulbul:v2: 0.3 to 3; bulbul:v3-beta: 0.5 to 2.0. Default: 1.0.
- loudness: Volume control (0 to 3). Only for bulbul:v2. Default: 1.0.
- speech_sample_rate: Audio sample rate. bulbul:v2: 22050; bulbul:v3-beta: 24000.
- enable_preprocessing: Preprocessing of English/numeric entities. bulbul:v2: optional; bulbul:v3-beta: always enabled.

Example Code:
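A hedged sketch using the parameters above. The response note below says the audio comes back base64-encoded; treating it as a list of base64 strings on an audios attribute is an assumption about the SDK:

```python
import base64
import os

def decode_audio(b64_chunks) -> bytes:
    """Join and decode base64-encoded audio chunks into raw bytes for saving."""
    return b"".join(base64.b64decode(chunk) for chunk in b64_chunks)

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    response = client.text_to_speech.convert(
        text="Hello! Welcome to Sarvam AI.",
        target_language_code="hi-IN",
        model="bulbul:v2",
        speaker="anushka",  # a bulbul:v2 voice from the list above
        pace=1.0,
    )
    with open("output.wav", "wb") as f:
        f.write(decode_audio(response.audios))  # attribute name assumed
```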
Response:
The function returns the synthesized audio content (e.g., in WAV format). The output audio is returned as a base64-encoded string and must be decoded before playback or saving to a file.
Identify Language API:
- Module: client.text.identify_language()
- Purpose: Automatically detect the language of input text from 11 supported languages (en-IN, hi-IN, bn-IN, gu-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN).
- Best for: Language detection before translation or processing multilingual content.
- Method: SDK function call
- Authorization: Uses API key stored in the SARVAM_API_KEY environment variable via the SarvamAI SDK.
Parameters:
- input (required): The text whose language needs to be identified.

Example Code:
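A minimal sketch; the language_code attribute on the response is an assumption about the SDK:

```python
import os

# The 11 supported codes listed above.
SUPPORTED_CODES = {
    "en-IN", "hi-IN", "bn-IN", "gu-IN", "kn-IN", "ml-IN",
    "mr-IN", "od-IN", "pa-IN", "ta-IN", "te-IN",
}

def check_supported(code: str) -> str:
    """Sanity-check that a detected code is one of the 11 supported languages."""
    if code not in SUPPORTED_CODES:
        raise ValueError(f"unexpected language code: {code}")
    return code

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    response = client.text.identify_language(input="ನೀವು ಹೇಗಿದ್ದೀರಿ?")
    print(check_supported(response.language_code))  # attribute name assumed
```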
Transliterate API:
- Module: client.text.transliterate()
- Purpose: Convert text from one script to another while preserving pronunciation.
- Best for: Converting between scripts (e.g., Devanagari to Latin, Latin to native scripts).
- Method: SDK function call
- Authorization: Uses API key stored in the SARVAM_API_KEY environment variable via the SarvamAI SDK.
Parameters:
- input (required): The text to be transliterated.
- source_language_code (required): Source language code.
- target_language_code (required): Target language code.

Example Code:
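A sketch of the three required parameters above; the transliterated_text attribute on the response is an assumption about the SDK:

```python
import os

def build_transliterate_kwargs(text, source, target):
    """Assemble the three required arguments for client.text.transliterate()."""
    if not text:
        raise ValueError("input text is required")
    return {
        "input": text,
        "source_language_code": source,
        "target_language_code": target,
    }

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    # Devanagari to Latin script, preserving pronunciation.
    kwargs = build_transliterate_kwargs("नमस्ते", "hi-IN", "en-IN")
    response = client.text.transliterate(**kwargs)
    print(response.transliterated_text)  # attribute name assumed
```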
Chat Completion API:
- Purpose: Generate conversational or instructional responses using Sarvam’s chat model.
- Best for: Building interactive AI assistants, chatbots, or instruction-following agents that understand and respond in natural language.
- Method: SDK function call
- Authorization: API key required via the SARVAM_API_KEY environment variable.
Parameters:
- messages (required): A list of message objects that form the conversation. Each message must include:
  - role: One of "system", "user", or "assistant".
  - content: The message text.
- model (optional, default: "sarvam-m"): Name of the chat model to use.
- temperature (optional): Controls randomness of output. Default: 0.2.
- reasoning_effort (optional): Depth of reasoning. Options: "low", "medium", "high".
- wiki_grounding (optional): Boolean. Enables retrieval from Wikipedia for factual answers. Default: false.
- top_p (optional): Controls nucleus sampling. Default: 1.0.
- max_tokens (optional): Limits the length of the generated response.
- stream (optional): Boolean. Set to True to receive partial responses in real time.

Example Code:
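A minimal sketch built from the parameters above; the shape of the response object (choices[0].message.content) mirrors the response description below but is otherwise an assumption about the SDK:

```python
import os

def build_chat_payload(user_text, system_text="You are a helpful assistant."):
    """Build the messages list plus a few of the optional parameters above."""
    return {
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.2,
        "wiki_grounding": False,
    }

# Live call runs only when a subscription key is configured.
if os.environ.get("SARVAM_API_KEY"):
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    response = client.chat.completions(
        **build_chat_payload("What is the capital of Karnataka?")
    )
    print(response.choices[0].message.content)
```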
Response:
Returns a JSON object with:
- choices: List of generated message completions. Each contains:
  - message: Includes the assistant’s response in "content" and the "role" set to "assistant".
  - finish_reason: Indicates why the response was completed (e.g., "stop").
- usage (optional): Includes token usage stats (prompt_tokens, completion_tokens, total_tokens).
Integration Guidelines
- Ensure secure SDK communication by using environment variables like SARVAM_API_KEY to avoid exposing keys in code.
- Handle rate-limiting errors using SDK response codes or retry logic to maintain stability.
- Always validate input data (e.g., text, language codes, file formats) before invoking SDK methods.
- Wrap all SDK calls in try-except blocks to gracefully handle errors and debug effectively.
- Set up authentication using environment variables or a secrets manager to securely pass API keys.
- Implement logging around SDK usage (input, output, exceptions) for monitoring and diagnostics.
- Pin specific model versions (e.g., "bulbul:v2", "sarvam-m") in your calls to maintain backward compatibility.
- Follow consistent parameter naming when creating payloads for SDK functions to improve maintainability.
- Cache static responses (like language code lookups) to reduce redundant SDK calls and improve performance.
- Regularly audit SDK integration and logs to detect latency, usage anomalies, or potential security flaws.
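Several of the guidelines above (retry logic, try-except wrapping, logging) can be combined into one small helper. This is a generic sketch: the SDK’s actual exception types are not specified in this guide, so a broad Exception catch is used purely for illustration.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sarvam")

def call_with_retries(fn, *args, retries=3, backoff=1.0, **kwargs):
    """Wrap any SDK call with logging and exponential-backoff retries."""
    for attempt in range(1, retries + 1):
        try:
            result = fn(*args, **kwargs)
            log.info("%s succeeded on attempt %d",
                     getattr(fn, "__name__", "call"), attempt)
            return result
        except Exception as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            if attempt == retries:
                raise  # exhausted retries; surface the error to the caller
            time.sleep(backoff * 2 ** (attempt - 1))
```

Usage would look like call_with_retries(client.text.translate, input=text, source_language_code="en-IN", target_language_code="hi-IN").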
Tips for Responding to User Requests
- Analyze the task to determine which SDK modules (e.g., speech_to_text, chat_completion) are needed.
- If multiple SDK components are required, outline the purpose of each, for example:
  - speech_to_text: Converts audio to a transcript.
  - text_analytics: Extracts answers from text.
- For each API module, define separate SDK wrapper functions:
  - Keep code modular and reusable.
  - Handle input validation and output parsing within each function.
- Correctly parse SDK responses:
  - For Speech-to-Text: transcript = response["transcript"]
  - For Text Analytics: answers = response["answers"]
- Write a main script that:
  - Accepts or loads user input.
  - Invokes the SDK function(s) as needed.
  - Saves, displays, or returns the output.
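As one concrete flavour of this structure, here is a sketch chaining a Speech-to-Text wrapper into a Chat Completion wrapper (answering a spoken question). The response attribute names and the SarvamAI client are assumptions, and the answering step uses the Chat Completion API described above rather than a separate text-analytics module:

```python
import os

def transcribe_audio(client, path: str) -> str:
    """Wrapper for the Speech-to-Text module: audio file in, transcript out."""
    with open(path, "rb") as audio:
        response = client.speech_to_text.transcribe(file=audio, model="saaras:v3")
    return response.transcript  # attribute name assumed

def answer_question(client, question: str) -> str:
    """Wrapper for the Chat Completion module: question in, answer out."""
    response = client.chat.completions(
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

def main(path: str = "question.wav") -> None:
    from sarvamai import SarvamAI
    client = SarvamAI(api_subscription_key=os.environ["SARVAM_API_KEY"])
    question = transcribe_audio(client, path)  # step 1: audio -> text
    print(answer_question(client, question))   # step 2: text -> answer

if os.environ.get("SARVAM_API_KEY"):
    main()
```

Each wrapper owns its own parsing, so the main flow stays short and each module can be swapped or tested independently.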
We hope this guide helps you get started with Sarvam’s API! If you encounter any issues or have questions along the way, feel free to reach out to us on our Discord. Our community is ready to assist you in any way possible!