How to control where the model stops using `stop`

The stop parameter lets you define one or more strings that tell the model to stop generating further tokens when it encounters them.

The matched stop sequence is not included in the returned text.
`stop` is a hard stop: the model generates nothing past the stop string.

You can use `stop` to:
- Format structured outputs
- Avoid responses that run too long
- Segment multi-part answers

How it works:

You can pass:
- A single string
- A list of up to 4 strings

Generation stops as soon as any one of the stop strings is matched.
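
The matching behaviour described above can be mimicked locally in plain Python. This is only a sketch of the semantics (cut at the earliest match and drop the stop string itself). `truncate_at_stop` is a hypothetical helper and not part of the SarvamAI SDK, and the real service applies the stop during generation rather than afterwards.

```python
from typing import List, Union


def truncate_at_stop(text: str, stop: Union[str, List[str]]) -> str:
    """Return text cut at the earliest occurrence of any stop string.

    Hypothetical helper for illustration only; not part of the SarvamAI SDK.
    """
    stops = [stop] if isinstance(stop, str) else list(stop)
    # Collect the position of every stop string that occurs in the text.
    positions = [text.find(s) for s in stops if s and s in text]
    if not positions:
        return text  # no stop string matched, nothing to cut
    return text[:min(positions)]  # the stop string itself is excluded


print(truncate_at_stop("Paris is the capital of France.### Extra text", "###"))
print(truncate_at_stop("A black hole is dense.\nQ: What next?###", ["###", "\nQ:"]))
```

With a list, the cut happens at whichever stop string appears first in the text, regardless of its position in the list.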

Parameter details:

Parameter	Type	Limits
`stop`	String or List of Strings	Max 4 items
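
The limits in the table can be checked on the client side before a request is sent. The sketch below assumes only what the table states (a string or a list of strings, at most 4 items). `normalize_stop` and `MAX_STOP_ITEMS` are hypothetical names introduced for illustration, and how the API itself reports a violation is not covered here.

```python
from typing import List, Optional, Union

MAX_STOP_ITEMS = 4  # limit taken from the parameter table


def normalize_stop(stop: Union[str, List[str], None]) -> Optional[List[str]]:
    """Validate a stop value and return it as a list (hypothetical helper)."""
    if stop is None:
        return None
    items = [stop] if isinstance(stop, str) else list(stop)
    if not all(isinstance(s, str) for s in items):
        raise TypeError("every stop sequence must be a string")
    if len(items) > MAX_STOP_ITEMS:
        raise ValueError(
            f"at most {MAX_STOP_ITEMS} stop sequences are allowed, got {len(items)}"
        )
    return items


print(normalize_stop("###"))
print(normalize_stop(["###", "\nUser:"]))
```

Failing early like this keeps an oversized list from reaching the service at all.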

When to use `stop`:

Scenario	Example `stop`
Building chat with system markers	`"###"`
Multi-turn Q&A	`"\nQ:"`
Ending output before next prompt	`"User:"`
Preventing overly long answers	`["###", "\nUser:", "End"]`
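
To make the "Multi-turn Q&A" row concrete, here is a small sketch of why `"\nQ:"` is a natural stop for prompts laid out as Q:/A: pairs. `build_qa_prompt` is a hypothetical helper and the continuation string is made up to stand in for model output. No API call is made.

```python
from typing import List, Tuple

QA_STOP = "\nQ:"  # stop value from the "Multi-turn Q&A" row


def build_qa_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Format worked examples plus a new question as Q:/A: pairs (hypothetical helper)."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # the model would continue from here
    return "\n".join(lines)


prompt = build_qa_prompt([("What is 2 + 2?", "4")], "What is the capital of France?")
print(prompt)

# A made-up continuation, standing in for what a model might write with no stop set.
raw_continuation = " Paris\nQ: What is the capital of Spain?\nA: Madrid"
# Cutting at QA_STOP keeps only the answer to the question that was asked.
answer = raw_continuation.split(QA_STOP, 1)[0].strip()
print(answer)
```

Without the stop, a model that has picked up the pattern may go on to invent further questions and answers. With it, generation ends where the next question would have begun.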

# Install the SarvamAI SDK (the leading "!" runs pip from a notebook cell)
!pip install -Uqq sarvamai
from sarvamai import SarvamAI

# Initialize the SarvamAI client with your own API key
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Example 1: Using a single stop string
response = client.chat.completions(
    messages=[
        {"role": "system", "content": "You are a helpful assistant. End your answers with ###."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    stop="###"  # Stop when "###" is reached
)

# Print the assistant's reply.
print(response.choices[0].message.content)

# Example 2: Using a list of stop strings
response = client.chat.completions(
    messages=[
        {"role": "system", "content": "You are an expert answering user questions."},
        {"role": "user", "content": "Explain what a black hole is."}
    ],
    stop=["###", "\nUser:", "\nQ:"]  # Multiple stop sequences
)

# Print the assistant's reply.
print(response.choices[0].message.content)
