Transliterate API: A Hands-on Guide
Overview
This tutorial demonstrates how to use the Transliteration API to convert text from one script to another while preserving pronunciation. It supports multiple Indic languages and offers customizable numeral formatting.
1. Installation
Before you begin, ensure you have the necessary Python libraries installed. Run the following commands to install the required packages:
Import Required Libraries
First, let’s import all the necessary libraries.
2. Authentication
To use the API, you need an API subscription key. Follow these steps to set up your API key:
- Obtain your API key: If you don’t have an API key, sign up on the Sarvam AI Dashboard to get one.
- Replace the placeholder key: In the code below, replace “YOUR_SARVAM_AI_API_KEY” with your actual API key.
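One way to keep the key out of source code is to read it from an environment variable. The sketch below is only an illustration: the variable name `SARVAM_API_KEY` and the helpers `load_api_key` and `build_headers` are hypothetical, and the `api-subscription-key` header name is an assumption that should be checked against the official documentation.

```python
import os

# Placeholder value taken from the tutorial text; it must never be sent as-is.
PLACEHOLDER = "YOUR_SARVAM_AI_API_KEY"


def load_api_key(env_var: str = "SARVAM_API_KEY") -> str:
    """Read the key from an environment variable (hypothetical variable name)."""
    key = os.environ.get(env_var, PLACEHOLDER)
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set {env_var} to your actual API key before calling the API.")
    return key


def build_headers(api_key: str) -> dict:
    """Build request headers; the header name is an assumption, not confirmed here."""
    return {
        "api-subscription-key": api_key,
        "Content-Type": "application/json",
    }
```

Failing early on the placeholder value avoids a round trip that would only come back as an invalid-key error.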
3. Understanding the Parameters
🔹 The API takes several key parameters:
✔ input – The text to be transliterated.
✔ source_language_code – Language of the input text.
✔ target_language_code – Desired transliteration output language.
✔ numerals_format – Choose between international (0-9) or native (०-९) numbers.
✔ spoken_form – Whether to convert text into a natural spoken format.
✔ spoken_form_numerals_language – Choose whether numbers should be spoken in English or the native language.
🚫 Note: Transliteration between Indic languages (e.g., Hindi → Bengali) is not supported.
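The parameters above can be gathered into a single request body. The following is a minimal sketch, not the official client: `build_payload` is a hypothetical helper, the `False` default for `spoken_form` is an assumption, and the Indic-to-Indic check assumes that `en-IN` is the only non-Indic language code in use.

```python
from typing import Optional


def build_payload(
    text: str,
    source_language_code: str,
    target_language_code: str,
    numerals_format: str = "international",
    spoken_form: bool = False,  # assumed default
    spoken_form_numerals_language: Optional[str] = None,
) -> dict:
    """Assemble the request body from the six parameters listed above."""
    # Hindi -> Bengali and similar pairs are not supported; same-language
    # requests (e.g. hi-IN -> hi-IN) and pairs involving en-IN are allowed.
    if source_language_code != target_language_code and "en-IN" not in (
        source_language_code,
        target_language_code,
    ):
        raise ValueError("Transliteration between two different Indic languages is not supported.")
    if numerals_format not in ("international", "native"):
        raise ValueError("numerals_format must be 'international' or 'native'.")

    payload = {
        "input": text,
        "source_language_code": source_language_code,
        "target_language_code": target_language_code,
        "numerals_format": numerals_format,
        "spoken_form": spoken_form,
    }
    # Only send the optional field when the caller set it explicitly.
    if spoken_form_numerals_language is not None:
        payload["spoken_form_numerals_language"] = spoken_form_numerals_language
    return payload
```

For example, `build_payload("मैं ऑफिस जा रहा हूँ", "hi-IN", "en-IN")` produces a romanization request, while passing `"hi-IN"` and `"bn-IN"` raises a `ValueError` before any network call is made.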
4. Basic Usage
4.1: Read the Document
We have two sample documents under the data folder.
4.2: Split the text into chunks of at most 1000 characters
Since the API has a restriction of 1000 characters per request, we need to split the text accordingly.
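A simple way to stay under the limit is to cut the text at the last space before the 1000-character mark, so that words are not split in the middle. The helper below is a sketch; `chunk_text` is a hypothetical name, and breaking on spaces (rather than sentence boundaries) is a design choice made here for brevity.

```python
def chunk_text(text: str, max_len: int = 1000) -> list:
    """Split text into chunks of at most max_len characters, preferring spaces."""
    chunks = []
    while len(text) > max_len:
        # Look for the last space inside the first max_len characters.
        split_at = text.rfind(" ", 0, max_len)
        if split_at <= 0:
            # No usable space: fall back to a hard cut at the limit.
            split_at = max_len
        piece = text[:split_at].strip()
        if piece:
            chunks.append(piece)
        text = text[split_at:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be sent as the `input` of a separate request, and the results joined in order.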
4.3: Setting up the API Endpoint
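As a rough sketch of this step, the snippet below prepares a POST request using only the Python standard library. The endpoint URL, the `api-subscription-key` header name, and the JSON response shape are assumptions to verify against the official documentation; `build_request` and `transliterate` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint; confirm the exact URL in the official documentation.
API_URL = "https://api.sarvam.ai/transliterate"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for the given payload."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "api-subscription-key": api_key,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transliterate(payload: dict, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes it possible to inspect the headers and body without spending quota.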
5. Experimenting with Different Options
We currently have three different transliteration models:
5.1 Romanization (Indic → Latin Script)
- Converts Indic scripts to Roman script (English alphabet).
- Example: मैं ऑफिस जा रहा हूँ → main office ja raha hun
- Parameters:
  source_language_code = "hi-IN"
  target_language_code = "en-IN"
Romanized Text: Main office ja raha hun
5.2 Conversion to Indic Scripts
- Converts text into an Indic script from various sources:
- Code-mixed text
  - Example: मैं office जा रहा हूँ → मैं ऑफिस जा रहा हूँ
  - Parameters:
    source_language_code = "hi-IN"
    target_language_code = "hi-IN"
- Romanized text
  - Example: main office ja raha hun → मैं ऑफिस जा रहा हूँ
  - Parameters:
    source_language_code = "hi-IN"
    target_language_code = "hi-IN"
- English text
  - Example: I am going to office → आइ ऍम गोइंग टू ऑफिस
  - Parameters:
    source_language_code = "en-IN"
    target_language_code = "hi-IN"
Transliterated Text: मैं ऑफिस जा रहा हूँ
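The three input types in this section differ only in their language codes, which can be captured in a small lookup table. This is an illustrative sketch for Hindi only; the keys `code_mixed`, `romanized`, and `english` and the helper `indic_payload` are hypothetical names, not part of the API.

```python
# (source_language_code, target_language_code) for each input type, as listed above.
INDIC_CONVERSION_CODES = {
    "code_mixed": ("hi-IN", "hi-IN"),
    "romanized": ("hi-IN", "hi-IN"),
    "english": ("en-IN", "hi-IN"),
}


def indic_payload(text: str, input_kind: str) -> dict:
    """Build a request body for converting text into Devanagari script."""
    if input_kind not in INDIC_CONVERSION_CODES:
        raise ValueError(f"Unknown input kind: {input_kind!r}")
    source, target = INDIC_CONVERSION_CODES[input_kind]
    return {
        "input": text,
        "source_language_code": source,
        "target_language_code": target,
    }
```

Note that romanized Hindi is still sent with `hi-IN` as the source language, because the language is Hindi even though the script is Latin.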
5.3 Spoken Indic Form
- Converts written text into a more natural spoken form.
- Example: मुझे कल 9:30am को appointment है → मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है
Spoken Text: मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है
6. Advanced Features
- numerals_format – Choose between international (0-9) or native (०-९) numbers.
- spoken_form_numerals_language – Choose whether numbers should be spoken in English or the native language.
Numerals Format
numerals_format is an optional parameter with two options:
- international (default): Uses regular numerals (0-9).
- native: Uses language-specific native numerals.
Example:
- If international format is selected → मेरा phone number है: 9840950950.
- If native format is selected → मेरा phone number है: ९८४०९५०९५०.
Native Numerals Text: मुझे कल सुबह साढ़े नौ बजे अपॉइंटमेंट है
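To make the difference between the two numeral formats concrete, the digit mapping for Hindi can be reproduced locally. This is only an illustration of what native numerals look like in Devanagari, not how the API implements the option; `to_native_numerals` is a hypothetical helper.

```python
# Map ASCII digits 0-9 to the Devanagari digits ०-९.
DEVANAGARI_DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")


def to_native_numerals(text: str) -> str:
    """Replace international digits with Devanagari digits, leaving other text as-is."""
    return text.translate(DEVANAGARI_DIGITS)


print(to_native_numerals("मेरा phone number है: 9840950950"))
# prints: मेरा phone number है: ९८४०९५०९५०
```

Other Indic scripts have their own digit sets, so a table like this would be needed per script.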
Spoken Form Numerals Language
spoken_form_numerals_language is an optional parameter with two options. It only works when spoken_form is true:
- english: Numbers in the text will be spoken in English.
- native (default): Numbers in the text will be spoken in the native language.
Example:
Input: "मेरे पास ₹200 है"
- If english format is selected → "मेरे पास टू हन्डर्ड रूपीस है".
- If native format is selected → "मेरे पास दो सौ रुपये है".
Spoken Form Numerals Language Text: मुझे कल नाइन थर्टी ए एम को अपॉइंटमेंट है
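Because spoken_form_numerals_language has an effect only when spoken_form is true, it can help to build both fields together. The helper below is a sketch under that reading of the rule; `spoken_options` is a hypothetical name, and omitting the field when spoken_form is false is a choice made here rather than documented API behavior.

```python
def spoken_options(spoken_form: bool, numerals_language: str = "native") -> dict:
    """Return the spoken-form fields to merge into a request body."""
    if numerals_language not in ("english", "native"):
        raise ValueError("spoken_form_numerals_language must be 'english' or 'native'.")
    if not spoken_form:
        # The numerals language is only meaningful together with spoken_form=True.
        return {"spoken_form": False}
    return {
        "spoken_form": True,
        "spoken_form_numerals_language": numerals_language,
    }


# Example: merge the options into a request body for the input shown above.
payload = {
    "input": "मेरे पास ₹200 है",
    "source_language_code": "hi-IN",
    "target_language_code": "hi-IN",
    **spoken_options(True, "english"),
}
```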
7. Error Handling
You may encounter these errors while using the API:
- 403 Forbidden (invalid_api_key_error)
  - Cause: Invalid API key.
  - Solution: Use a valid API key from the Sarvam AI Dashboard.
- 429 Too Many Requests (insufficient_quota_error)
  - Cause: Exceeded API quota.
  - Solution: Check your usage, upgrade if needed, or implement exponential backoff when retrying.
- 500 Internal Server Error (internal_server_error)
  - Cause: Issue on our servers.
  - Solution: Try again later. If persistent, contact support.
- 400 Bad Request (invalid_request_error)
  - Cause: Incorrect request formatting.
  - Solution: Verify your request structure and parameters.
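The exponential backoff suggested for 429 responses can be sketched as a small retry wrapper. This is one possible approach, not an official client: `call_with_backoff` is a hypothetical helper, `send` is any function you supply that returns a `(status_code, body)` pair, and treating 500 as retryable alongside 429 is an assumption based on the "try again later" advice above.

```python
import time

# Status codes worth retrying; 400 and 403 need a fixed request or key instead.
RETRYABLE_STATUSES = {429, 500}


def call_with_backoff(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it succeeds, doubling the wait after each retryable failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1.")
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        is_last_attempt = attempt == max_attempts - 1
        if status in RETRYABLE_STATUSES and not is_last_attempt:
            # Wait 1s, 2s, 4s, ... (for base_delay=1.0) before the next attempt.
            sleep(base_delay * 2 ** attempt)
            continue
        raise RuntimeError(f"Request failed with HTTP {status}: {body}")
```

The `sleep` argument is injectable so the wrapper can be exercised in tests without real delays.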
8. Additional Resources
For more details, refer to our official documentation. We are also available to support you on our Discord server:
- Documentation: docs.sarvam.ai
- Community: Join the Discord Community
9. Final Notes
- Keep your API key secure.
- Keep each request within the 1000-character limit.
- Explore advanced features like numerals_format and spoken_form.
Keep Building! 🚀