
# How to set the audio format for output using `output_audio_codec`

> Choose the audio format for TTS streaming output.

The `output_audio_codec` parameter defines the **audio format** for the streamed speech output. It must be set in the `config` message before sending any text.

If it is not specified, the audio is streamed as base64-encoded MPEG audio by default.

The codec you choose affects:

* **Audio quality**
* **File size**
* **Playback compatibility**
* **Latency**
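
To make the file size trade-off concrete, the sketch below estimates the payload size of uncompressed audio for a given duration. The helper name `pcm_size_bytes` and the sample rates are illustrative assumptions, not values stated on this page; the bytes-per-sample figures follow from the formats themselves (16-bit linear PCM uses 2 bytes per sample, 8-bit A-law and μ-law use 1).

```python
def pcm_size_bytes(
    duration_s: float,
    sample_rate_hz: int,
    bytes_per_sample: int,
    channels: int = 1,
) -> int:
    """Return the size of uncompressed audio data, excluding any container header."""
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


# Assumed sample rates, for illustration only.
linear16_size = pcm_size_bytes(10, 22050, 2)  # 16-bit linear PCM, mono
mulaw_size = pcm_size_bytes(10, 8000, 1)      # 8-bit mu-law at a telephony rate, mono

print(linear16_size)  # 441000
print(mulaw_size)     # 80000
```

Lossy codecs such as `mp3`, `aac`, and `opus` produce smaller output than these figures for the same duration, at the cost of some fidelity.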

### Supported Audio Codecs

| Codec      | Description                                                      |
| ---------- | ---------------------------------------------------------------- |
| `mp3`      | MPEG Layer-3 – lossy, widely supported, good compression         |
| `aac`      | Advanced Audio Coding – lossy, good compression, high quality    |
| `alaw`     | 8-bit logarithmic PCM (G.711 A-law) – used in telephony          |
| `flac`     | Lossless compression – high-fidelity audio                       |
| `linear16` | Uncompressed 16-bit linear PCM – large size, no compression loss |
| `mulaw`    | 8-bit logarithmic PCM (G.711 μ-law) – used in telephony          |
| `opus`     | Lossy codec optimized for speech and low-latency streaming       |
| `wav`      | Uncompressed PCM in a WAV container – large files                |
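
Because the codec is passed as a plain string, it can help to validate it and derive a matching file name before opening a connection. The following is a minimal sketch: the helper `output_filename` and the extension choices are hypothetical conventions and not part of the Sarvam SDK; only the codec names come from the table above.

```python
# Codec names are taken from the table above. The file extensions are
# assumed conventions, not values defined by the API.
CODEC_EXTENSIONS = {
    "mp3": "mp3",
    "aac": "aac",
    "alaw": "alaw",
    "flac": "flac",
    "linear16": "pcm",
    "mulaw": "ulaw",
    "opus": "opus",
    "wav": "wav",
}


def output_filename(stem: str, codec: str) -> str:
    """Validate a codec name and build a file name with a matching extension."""
    key = codec.strip().lower()
    if key not in CODEC_EXTENSIONS:
        raise ValueError(
            f"Unsupported codec {codec!r}; expected one of {sorted(CODEC_EXTENSIONS)}"
        )
    return f"{stem}.{CODEC_EXTENSIONS[key]}"


print(output_filename("output", "aac"))  # output.aac
```

Note that `alaw`, `mulaw`, and `linear16` are raw sample formats with no header, so a player generally needs to be told the sample rate and channel count to play them.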

***

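### Example Code

The first example requests AAC output through the REST client. The second streams AAC over a WebSocket connection and writes the decoded chunks to a file.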

```python
from sarvamai import SarvamAI
from sarvamai.play import save

# Initialize the REST client
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Generate speech over REST, requesting AAC output
audio = client.text_to_speech.convert(
    text="Welcome to Sarvam AI!",
    model="bulbul:v3",
    target_language_code="en-IN",
    output_audio_codec="aac",
)
# Use a file extension that matches the requested codec
save(audio, "output1.aac")
```

```python
import asyncio
import base64

from sarvamai import AsyncSarvamAI, AudioOutput

async def tts_stream():
    client = AsyncSarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

    async with client.text_to_speech_streaming.connect(model="bulbul:v3") as ws:
        # Set the codec in the config message before sending any text
        await ws.configure(
            target_language_code="hi-IN",
            speaker="shubh",
            output_audio_codec="aac",
        )
        print("Sent configuration")

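        # Hindi sample text: "India's culture is one of the oldest and richest in the
        # world. It is a wonderful confluence of diversity, tolerance, and traditions,
        # which includes various religions, languages, festivals, music, dance,
        # architecture, and lifestyles."
        text = (
            "भारत की संस्कृति विश्व की सबसे प्राचीन और समृद्ध संस्कृतियों में से एक है। "
            "यह विविधता, सहिष्णुता और परंपराओं का अद्भुत संगम है, "
            "जिसमें विभिन्न धर्म, भाषाएं, त्योहार, संगीत, नृत्य, वास्तुकला और जीवनशैली शामिल हैं।"
        )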
        await ws.convert(text)
        print("Sent text message")

        await ws.flush()
        print("Flushed buffer")

        # Decode each base64-encoded chunk and append the raw AAC bytes to the file
        chunk_count = 0
        with open("output.aac", "wb") as f:
            async for message in ws:
                if isinstance(message, AudioOutput):
                    chunk_count += 1
                    audio_chunk = base64.b64decode(message.data.audio)
                    f.write(audio_chunk)
                    f.flush()

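        print(f"All {chunk_count} chunks saved to output.aac")
        print("Audio generation complete")

        # Defensive cleanup through a private attribute; leaving the `async with`
        # block is expected to close the connection as well.
        if hasattr(ws, "_websocket") and not ws._websocket.closed:
            await ws._websocket.close()
            print("WebSocket connection closed.")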


if __name__ == "__main__":
    asyncio.run(tts_stream())

# --- Notebook/Colab usage ---
# await tts_stream()

```
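
After saving a file, a quick sanity check is to look at its first bytes and confirm they match the codec that was requested. The sketch below is a hedged illustration, not part of the Sarvam SDK: `sniff_audio_format` is a hypothetical helper that recognizes a few well-known signatures (RIFF/WAVE, `fLaC`, `OggS`, an ID3 tag, and the MPEG audio and ADTS frame sync patterns). Whether the service wraps a given codec in a particular container is an assumption to verify against real output, and headerless formats such as `alaw`, `mulaw`, and `linear16` cannot be identified reliably this way.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess an audio format from the first bytes of a file (at least 12 recommended)."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"  # Ogg container, commonly used to carry Opus
    if header[:3] == b"ID3":
        return "mp3"  # ID3v2 tag placed before the MPEG audio frames
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        layer_bits = (header[1] >> 1) & 0b11
        # ADTS (AAC) uses a 12-bit sync word and always sets the layer bits to 00.
        if (header[1] & 0xF0) == 0xF0 and layer_bits == 0:
            return "aac"
        # MPEG audio frames (such as MP3) use a non-zero layer field.
        if layer_bits != 0:
            return "mp3"
    return "unknown"


print(sniff_audio_format(bytes([0xFF, 0xF1, 0x50, 0x80])))  # aac
print(sniff_audio_format(bytes([0xFF, 0xFB, 0x90, 0x00])))  # mp3
```

To check the file written by the streaming example, read its first 12 bytes in binary mode and pass them to `sniff_audio_format`; a result other than `aac` suggests the codec setting did not take effect or the output uses a different framing.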