Text to Speech
This model converts text into spoken audio. The output is a WAV file encoded as a base64 string.
Headers
Your subscription key
Body
The text to be spoken. Each text is currently limited to 500 characters, and you can send up to 3 texts in a single call. The text can also be code-mixed, combining English and Indic languages.
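The limits above can be checked on the client before a call is made. The following is a minimal sketch, not part of the API: the function name `validate_texts` and the rejection of an empty list are assumptions made for illustration, and `len()` counts Unicode code points, which may differ from how the service counts characters.

```python
MAX_TEXTS = 3    # documented limit: up to 3 texts per call
MAX_CHARS = 500  # documented limit: 500 characters per text


def validate_texts(texts):
    """Return a copy of `texts` if it respects the documented limits, else raise ValueError.

    Hypothetical helper: rejecting an empty list is an assumption,
    not something the reference states.
    """
    if not texts:
        raise ValueError("at least one text is required (assumption)")
    if len(texts) > MAX_TEXTS:
        raise ValueError(f"got {len(texts)} texts, the limit is {MAX_TEXTS}")
    for index, text in enumerate(texts):
        if len(text) > MAX_CHARS:
            raise ValueError(
                f"text {index} has {len(text)} characters, the limit is {MAX_CHARS}"
            )
    return list(texts)
```

Longer passages would need to be split into chunks of at most 500 characters and sent in batches of up to 3.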
The language of the text, in BCP-47 format
hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, od-IN, pa-IN, ta-IN, te-IN, en-IN, gu-IN
The speaker or voice to use
meera, pavithra, maitreyi, arvind, amol, amartya
Controls the pitch of the audio. The suitable range is -0.75 to 0.75.
-1 < x < 1
Controls the speed of the audio. Lower values produce slower output. The suitable range is 0.5 to 2.
0.3 < x < 3
Controls the loudness of the audio. Lower values produce quieter output. The suitable range is 0.3 to 3.0.
0 < x < 3
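Each of the three controls above has a strict accepted range and a narrower "suitable" range. A minimal sketch of a client-side check that separates the two is shown below; the dictionary keys `pitch`, `speed`, and `loudness` are labels chosen for this example and are not confirmed field names, and `check_control` is a hypothetical helper.

```python
# Accepted ranges as listed (both bounds exclusive).
ACCEPTED = {
    "pitch": (-1.0, 1.0),
    "speed": (0.3, 3.0),
    "loudness": (0.0, 3.0),
}

# "Suitable" ranges quoted in the descriptions (treated here as inclusive).
SUITABLE = {
    "pitch": (-0.75, 0.75),
    "speed": (0.5, 2.0),
    "loudness": (0.3, 3.0),
}


def check_control(name, value):
    """Raise ValueError if `value` is outside the accepted range.

    Returns True when the value is also inside the suitable range,
    and False when it is accepted but outside the suitable range.
    """
    low, high = ACCEPTED[name]
    if not (low < value < high):
        raise ValueError(f"{name}={value} must satisfy {low} < x < {high}")
    suitable_low, suitable_high = SUITABLE[name]
    return suitable_low <= value <= suitable_high
```

A caller could treat a `False` result as a warning rather than an error, since the value is still within the accepted range.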
The sample rate of the output audio, in Hz. Supported values are 8000, 16000, and 22050. If not provided, 22050 is used as the default.
8000, 16000, 22050
Controls whether English words and numeric entities in the text are normalized.
Model to be used for converting text inputs to speech
bulbul:v1
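The body parameters above could be assembled and validated along the following lines. This is only a sketch: this section does not list the JSON field names, so every key in the returned dictionary (`inputs`, `language`, `speaker`, `sample_rate`, `model`) is a placeholder to be replaced with the names from the actual API reference, and `build_body` is a hypothetical helper. Only the allowed values and the default sample rate are taken from the text.

```python
import json

# Allowed values as listed in this reference.
LANGUAGES = {
    "hi-IN", "bn-IN", "kn-IN", "ml-IN", "mr-IN", "od-IN",
    "pa-IN", "ta-IN", "te-IN", "en-IN", "gu-IN",
}
SPEAKERS = {"meera", "pavithra", "maitreyi", "arvind", "amol", "amartya"}
SAMPLE_RATES = {8000, 16000, 22050}


def build_body(texts, language, speaker, sample_rate=22050, model="bulbul:v1"):
    """Build a request body dict after checking the enumerated values.

    All dictionary keys below are placeholders (assumptions), not
    confirmed field names.
    """
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if speaker not in SPEAKERS:
        raise ValueError(f"unsupported speaker: {speaker}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    return {
        "inputs": list(texts),       # placeholder key
        "language": language,        # placeholder key
        "speaker": speaker,          # placeholder key
        "sample_rate": sample_rate,  # placeholder key
        "model": model,              # placeholder key
    }


body = build_body(["Hello, how are you?"], "en-IN", "meera")
print(json.dumps(body, indent=2))
```

Validating the enumerated values locally gives a clearer error message than waiting for the service to reject the request.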
Response
wave (.wav) file output, base64 encoded