POST /text-to-speech

Headers

api-subscription-key
string
default:

Body

application/json
inputs
string[]
required

The texts to be spoken. The current limit is 500 characters per text, and a single call can carry up to 3 texts. The text may be code-mixed, combining English and Indic languages.
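The limits on `inputs` can be checked on the client before a call is made. The following is a minimal sketch, not part of the API: the helper name `validate_inputs` is hypothetical, and it assumes that the 500-character limit applies to each text separately, that length is counted in Python characters, and that an empty list is rejected.

```python
MAX_TEXTS = 3    # "as many as 3 texts in a single call"
MAX_CHARS = 500  # assumption: the limit applies to each text individually


def validate_inputs(inputs: list[str]) -> list[str]:
    """Return inputs unchanged if they satisfy the documented limits."""
    if not inputs:
        # Assumption: a required field must hold at least one text.
        raise ValueError("inputs must contain at least one text")
    if len(inputs) > MAX_TEXTS:
        raise ValueError(f"at most {MAX_TEXTS} texts per call, got {len(inputs)}")
    for i, text in enumerate(inputs):
        if len(text) > MAX_CHARS:
            raise ValueError(
                f"inputs[{i}] has {len(text)} characters, limit is {MAX_CHARS}"
            )
    return inputs
```

Longer passages would need to be split into chunks of at most 500 characters and sent three at a time.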

target_language_code
enum<string>
required

The language of the text, in BCP-47 format.

Available options:
hi-IN,
bn-IN,
kn-IN,
ml-IN,
mr-IN,
od-IN,
pa-IN,
ta-IN,
te-IN,
en-IN,
gu-IN
speaker
enum<string> | null
default: meera

The speaker (voice) to use.

Available options:
meera,
pavithra,
maitreyi,
arvind,
amol,
amartya
pitch
number | null

Controls the pitch of the audio. The suitable range is -0.75 to 0.75.

Required range: -1 < x < 1
pace
number | null

Controls the speed of the audio. Lower values produce slower output. The suitable range is 0.5 to 2.

Required range: 0.3 < x < 3
loudness
number | null
default: 1

Controls the loudness of the audio. Lower values produce a fainter sound. The suitable range is 0.3 to 3.0.

Required range: 0 < x < 3
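The required ranges for `pitch`, `pace` and `loudness` are all exclusive at both ends, which is easy to get wrong at the boundaries. The sketch below shows one way to check them client-side. The helper name `check_voice_params` is hypothetical, and dropping `None` values assumes that omitting a nullable field is equivalent to sending null.

```python
# Exclusive bounds copied from the "Required range" lines above.
REQUIRED_RANGES = {
    "pitch": (-1.0, 1.0),
    "pace": (0.3, 3.0),
    "loudness": (0.0, 3.0),
}


def check_voice_params(**params) -> dict:
    """Drop None values and reject values outside the required ranges."""
    checked = {}
    for name, value in params.items():
        if value is None:
            continue  # assumption: an omitted field behaves like null
        low, high = REQUIRED_RANGES[name]
        if not (low < value < high):
            raise ValueError(f"{name}={value} must satisfy {low} < x < {high}")
        checked[name] = value
    return checked
```

For example, `check_voice_params(pitch=0.5, pace=None, loudness=1)` keeps `pitch` and `loudness`, while `pitch=1` is rejected because the upper bound is exclusive.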
speech_sample_rate
enum<integer> | null
default: 22050

The sample rate of the output audio, in Hz. Supported values are 8000, 16000 and 22050. If not provided, 22050 is used.

Available options:
8000,
16000,
22050
enable_preprocessing
boolean
default: false

Controls whether English words and numeric entities in the text are normalized.

model
enum<string>
default: bulbul:v1

The model used to convert the text inputs to speech.

Available options:
bulbul:v1
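Putting the header and body fields together, a request might be assembled as in the sketch below. It only builds the request and does not send it. `BASE_URL` is a placeholder, since this page gives only the path and not the host, and the function name `build_tts_request` is hypothetical. Leaving out optional fields that are `None` assumes the server then applies its documented defaults.

```python
import json
import urllib.request

# Placeholder host: this page gives only the path, so substitute the real base URL.
BASE_URL = "https://api.example.com"


def build_tts_request(
    api_key: str,
    inputs: list[str],
    target_language_code: str,
    **optional,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the text-to-speech endpoint."""
    body = {"inputs": inputs, "target_language_code": target_language_code}
    # Optional fields (speaker, pitch, pace, ...) are included only when set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{BASE_URL}/text-to-speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "api-subscription-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A call such as `build_tts_request(key, ["Hello"], "en-IN", speaker="meera", speech_sample_rate=8000)` returns a request object that can be passed to `urllib.request.urlopen` once the real host is filled in.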

Response

200 - application/json
audios
string[]
required

The output audio as WAV (.wav) files, base64 encoded.
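Because each entry in `audios` is a base64 string, it has to be decoded before it can be played or stored. A minimal sketch follows. The function name `save_audios` and the `tts_<index>.wav` file naming are illustrative choices, not something the API prescribes.

```python
import base64
from pathlib import Path


def save_audios(response_json: dict, out_dir: str = ".") -> list[Path]:
    """Decode each base64 string in "audios" and write it as a .wav file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, encoded in enumerate(response_json["audios"]):
        path = out / f"tts_{i}.wav"  # file naming is arbitrary
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

The decoded bytes are a complete WAV file, so the results can be opened with the standard `wave` module or any audio player.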