Overview

This guide demonstrates how to convert text into speech using the Sarvam AI Text-to-Speech (TTS) API. The resulting audio can be played directly or saved as a .wav file.

1. Prerequisites

Before running this, ensure you have:

Python 3.7 or higher
Python packages: sarvamai

Install the required package using pip:

$ pip install sarvamai

2. Import Required Libraries

from sarvamai import SarvamAI
from sarvamai.play import play, save

3. Set Up Your API Key

To use the TTS Bulbul API:

Sign up at Sarvam AI Dashboard to get your API key.
Replace the placeholder key in the code.

SARVAM_API_KEY = "YOUR_SARVAM_API_KEY"
client = SarvamAI(api_subscription_key=SARVAM_API_KEY)
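
Because the key should stay out of source control, one option is to read it from an environment variable rather than hard-coding it. The sketch below is only an illustration: the variable name `SARVAM_API_KEY` and the helper `load_api_key` are assumptions made here, not part of the SDK.

```python
import os


def load_api_key(env=None, name="SARVAM_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # Reject an empty value and the placeholder string used in this guide.
    if not key or key == "YOUR_SARVAM_API_KEY":
        raise RuntimeError(f"Set the {name} environment variable to your Sarvam AI API key")
    return key


# Usage with the client from this guide (requires the sarvamai package):
# client = SarvamAI(api_subscription_key=load_api_key())
```

Set the variable in your shell before running the script, for example with `export SARVAM_API_KEY=...` on Linux or macOS.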

4. Example Text Input

text = """
Netaji Subhash Marg से Dayanand Road की तरफ, south की तरफ़ जाने से शुरू करें।
Dayanand Road पर पहुँचने के बाद, बाएँ मुड़ जाएँ। 350 meters तक सीधा चलते रहें।
आपको बायें तरफ़, United Bank of India ATM दिखेगा।
Dayanand School के दाएँ तरफ़ से गुजरने के बाद, बाएँ मुड़ें।
120 meters के बाद, Ghata Masjid Road पर, right turn करें।
280 meters तक चलते रहें।
Mahatma Gandhi Marg पे रहें और, 2.9 kilometers तक Old Delhi की तरफ जाएँ।
फिर, HC Sen Marg पर continue करें, और Paranthe Wali Gali तक drive करें।
"""

5. API Parameters

Parameter	Description
`target_language_code`	Language of the input text (e.g., `hi-IN`)
`speaker`	Voice used: Female - `anushka`, `manisha`, `vidya`, `arya`; Male - `abhilash`, `karun`, `hitesh`
`pitch`	Pitch adjustment: -0.75 to 0.75 (default: 0.0)
`pace`	Speed control: 0.5 to 2.0 (default: 1.0)
`loudness`	Volume: 0.3 to 3.0 (default: 1.0)
`speech_sample_rate`	Output sample rate: 8000, 16000, 22050, or 24000 Hz
`enable_preprocessing`	Normalize English/numeric entities (default: false)
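
The ranges in the table can be checked locally before a request is sent. The following is a minimal sketch, assuming the limits listed above; `build_tts_options` is a hypothetical helper, not part of the SDK, and it leaves `speech_sample_rate` out of the result unless a value is given because the table does not state a default.

```python
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000}


def build_tts_options(pitch=0.0, pace=1.0, loudness=1.0,
                      speech_sample_rate=None, enable_preprocessing=False):
    """Validate values against the ranges in the table and return them as a dict."""
    if not -0.75 <= pitch <= 0.75:
        raise ValueError("pitch must be between -0.75 and 0.75")
    if not 0.5 <= pace <= 2.0:
        raise ValueError("pace must be between 0.5 and 2.0")
    if not 0.3 <= loudness <= 3.0:
        raise ValueError("loudness must be between 0.3 and 3.0")

    options = {
        "pitch": pitch,
        "pace": pace,
        "loudness": loudness,
        "enable_preprocessing": enable_preprocessing,
    }
    # Only include the sample rate when the caller chose one explicitly.
    if speech_sample_rate is not None:
        if speech_sample_rate not in ALLOWED_SAMPLE_RATES:
            raise ValueError("speech_sample_rate must be 8000, 16000, 22050, or 24000")
        options["speech_sample_rate"] = speech_sample_rate
    return options


# Assumed usage: pass the dict as keyword arguments alongside text,
# target_language_code, and speaker, e.g.
# client.text_to_speech.convert(..., **build_tts_options(pace=1.2))
```

Validating up front gives a clear local error message instead of a rejected request; the server remains the authority on what it accepts.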

6. Convert Text to Speech

response = client.text_to_speech.convert(
    text=text,  # the sample text defined in step 4; replace with your own
    target_language_code="hi-IN",
    speaker="anushka",
    enable_preprocessing=True,
)

7. Play or Save Audio

To play the output:

play(response)

To save the output:

save(response, "output.wav")
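
If you need the raw audio bytes rather than a file written by `save`, the audio can be decoded by hand. This is a rough sketch under an assumption not stated in this guide: that the response exposes an `audios` list of base64-encoded WAV strings. The helpers `decode_audio` and `save_first_clip` are hypothetical names used only for illustration.

```python
import base64


def decode_audio(audios):
    """Decode a list of base64 strings into a list of raw byte strings."""
    return [base64.b64decode(chunk) for chunk in audios]


def save_first_clip(audios, path):
    """Write the first decoded clip to `path` and return the number of bytes written."""
    clips = decode_audio(audios)
    if not clips:
        raise ValueError("no audio clips to save")
    with open(path, "wb") as f:
        f.write(clips[0])
    return len(clips[0])


# Assumed usage, if the response object has an `audios` attribute:
# save_first_clip(response.audios, "output.wav")
```

For most cases the `save` helper shown above is simpler; decoding manually is mainly useful when the bytes go somewhere other than a local file.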

8. Output

Running the code above with save() writes an output.wav file containing the synthesized speech.
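
To confirm that the file was written as expected, its header can be inspected with Python's standard `wave` module. The sketch below assumes the output is uncompressed PCM WAV, which is the only kind `wave` reads; `describe_wav` is a hypothetical helper.

```python
import wave


def describe_wav(path):
    """Return the channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


# Usage after saving the audio:
# print(describe_wav("output.wav"))
```

The reported sample rate should match the `speech_sample_rate` you requested, if you set one.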

9. Conclusion

This guide showed how to use Sarvam AI’s TTS API to convert Hindi text into lifelike speech. Customize the text, language, voice, and parameters to suit your application.

10. Additional Resources

Documentation: docs.sarvam.ai
Community Support: Join our Discord

🛡️ Note: Keep your API key safe and avoid committing it to public repositories.

🚀 Keep Building!