How to control response diversity with top_p

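The `top_p` parameter controls how much of the probability mass the model considers when selecting the next token. This is called nucleus sampling: the model samples only from the smallest set of most likely tokens whose cumulative probability reaches `top_p`.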

Range: 0 to 1
Default: 1.0

Lower `top_p` → model chooses from a smaller set of highly likely tokens → more focused
Higher `top_p` → model chooses from a broader set of tokens → more diverse
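To make the mechanism concrete, here is a minimal sketch of nucleus filtering in plain Python. It is not Sarvam's implementation: the function name `nucleus_filter` and the token probabilities are made up for illustration, and how a real model handles ties or edge values such as `top_p = 0` may differ.

```python
def nucleus_filter(probs, top_p):
    """Return the renormalized nucleus: the smallest set of most likely
    tokens whose cumulative probability is at least top_p."""
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1")
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus now covers at least top_p of the mass
    # Renormalize so the kept probabilities sum to 1 before sampling.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token probabilities (illustrative only).
toy_probs = {"Paris": 0.5, "Lyon": 0.25, "Nice": 0.125, "Lille": 0.0625, "Metz": 0.0625}

for top_p in (0.3, 0.8, 1.0):
    print(top_p, list(nucleus_filter(toy_probs, top_p)))
```

With this toy distribution, `top_p = 0.3` keeps only `Paris`, `top_p = 0.8` keeps three tokens, and `top_p = 1.0` keeps all five.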

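When to use:

| `top_p` value | Behavior |
| --- | --- |
| `0.1` | Very focused; samples only from the most likely tokens that together cover 10% of the probability mass |
| `0.3` | Controlled diversity |
| `0.5` | Balanced creativity and accuracy |
| `0.8 - 1.0` | Very creative, open-ended responses |
| `1.0` (default) | Full probability distribution used; no tokens are filtered out |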
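How many tokens a given `top_p` actually keeps depends on how confident the model is at that step, not on a fixed share of the vocabulary. The sketch below counts the surviving tokens for two made-up 8-token distributions, one peaked and one flat. The helper `nucleus_size` and both distributions are hypothetical and only illustrate the idea.

```python
def nucleus_size(probs, top_p):
    """Count how many of the most likely tokens are needed to reach top_p."""
    cumulative = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += p
        if cumulative >= top_p:
            return count
    return len(probs)  # rounding can leave the total just under top_p


# Hypothetical distributions over an 8-token vocabulary.
peaked = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.0078125]
flat = [0.125] * 8

for top_p in (0.1, 0.3, 0.5, 0.8, 1.0):
    print(
        f"top_p={top_p}: peaked keeps {nucleus_size(peaked, top_p)}, "
        f"flat keeps {nucleus_size(flat, top_p)}"
    )
```

When the model is confident (peaked), even `top_p = 0.5` leaves a single candidate. When it is uncertain (flat), the same setting leaves four.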

# Install the SarvamAI SDK (the leading "!" is notebook syntax; from a terminal, run: pip install -Uqq sarvamai)
!pip install -Uqq sarvamai
from sarvamai import SarvamAI

# Initialize the SarvamAI client with your API key
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Example 1: Using the default top_p (1.0): full probability space (diverse response)
response = client.chat.completions(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
    # top_p is not specified → uses default 1.0
)

# Example 2: Using top_p = 0.3: more focused, controlled response
response = client.chat.completions(
    messages=[
        {"role": "system", "content": "You are a creative storyteller."},
        {"role": "user", "content": "Tell me a story about a magical tiger."}
    ],
    top_p=0.3
)

# Print the assistant's reply
print(response.choices[0].message.content)
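To see the effect of `top_p` side by side, one option is to send the same prompt at several settings and compare the replies. The helper below is a hypothetical convenience wrapper, not part of the SDK. It assumes only the `client.chat.completions(messages=..., top_p=...)` call and the `response.choices[0].message.content` access shown above.

```python
def compare_top_p(client, prompt, top_p_values):
    """Send the same prompt once per top_p value and collect the replies.

    `client` is expected to expose chat.completions(messages=..., top_p=...)
    as used in the examples above; this helper itself is illustrative.
    """
    replies = {}
    for top_p in top_p_values:
        response = client.chat.completions(
            messages=[{"role": "user", "content": prompt}],
            top_p=top_p,
        )
        # Store the assistant's text keyed by the top_p that produced it.
        replies[top_p] = response.choices[0].message.content
    return replies
```

For example, `compare_top_p(client, "Tell me a story about a magical tiger.", [0.1, 0.5, 1.0])` returns a dictionary mapping each `top_p` value to its reply. Because sampling is random, repeated runs can produce different text, especially at higher values.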