How to control where the model stops using stop

The stop parameter lets you define one or more strings that tell the model to stop generating further tokens when it encounters them.

  • The stop sequence(s) will not appear in the returned text.
  • stop is a hard stop: generation ends at the first match, and the model produces nothing past the stop string.
  • You can use stop to:
    • Format structured outputs
    • Avoid responses that run too long
    • Segment multi-part answers
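
To make the behaviour described above concrete, the following is a minimal local sketch of the truncation semantics: the output is cut just before the earliest stop string, and the stop string itself is dropped. The helper name truncate_at_stop is hypothetical and not part of the SDK; the real matching happens server-side during generation, so this only illustrates the expected shape of the result.

```python
def truncate_at_stop(text, stop):
    """Cut text just before the earliest occurrence of any stop string.

    Hypothetical helper for illustration only; the API applies stop
    sequences during generation, not as a post-processing step.
    """
    # Accept either a single string or a list of strings.
    stops = [stop] if isinstance(stop, str) else list(stop)
    cut = len(text)
    for s in stops:
        if not s:
            continue  # ignore empty strings, which would match at index 0
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest match across all stops
    return text[:cut]


raw = "Paris is the capital of France.### Extra text"
print(truncate_at_stop(raw, "###"))
```

With a list of stop strings, whichever one appears first in the text decides where the cut happens.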

How it works:

  • You can pass:
    • A single string
    • Or a list of up to 4 strings
  • The model will stop generating as soon as any stop string is matched.

Parameter details:

Parameter   Type                        Limits
stop        String or List of Strings   Max 4 items
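
Since the limit is four items, it can help to check the value on the client side before sending a request. The sketch below is one possible way to do that; normalize_stop and MAX_STOP_SEQUENCES are hypothetical names, and rejecting empty strings is an assumption of this sketch rather than a documented rule.

```python
MAX_STOP_SEQUENCES = 4  # limit taken from the parameter table above


def normalize_stop(stop):
    """Return the stop value as a validated list of strings.

    Hypothetical helper; not part of the sarvamai SDK.
    """
    # A single string is treated as a one-item list.
    stops = [stop] if isinstance(stop, str) else list(stop)
    if not stops:
        raise ValueError("stop must contain at least one string")
    if len(stops) > MAX_STOP_SEQUENCES:
        raise ValueError(f"stop accepts at most {MAX_STOP_SEQUENCES} strings")
    for s in stops:
        if not isinstance(s, str):
            raise TypeError("every stop sequence must be a string")
        if not s:
            # Assumption of this sketch: an empty stop string is never useful.
            raise ValueError("stop sequences must be non-empty")
    return stops


print(normalize_stop("###"))
print(normalize_stop(["###", "\nUser:", "\nQ:"]))
```

The returned list can be passed as the stop argument in the same way as the list shown in the examples on this page.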

When to use stop:

Scenario                            Example stop
Building chat with system markers   "###"
Multi-turn Q&A                      "\nQ:"
Ending output before next prompt    "User:"
Preventing overly long answers      ["###", "\nUser:", "End"]
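
One way to keep these choices consistent across a codebase is to store them as named presets and build the request arguments from them. This is only a sketch: STOP_PRESETS, its keys, and build_chat_kwargs are hypothetical names, while the model, messages, and stop parameters are the ones used in the examples on this page.

```python
# Hypothetical preset names mapped to the example stop values from the table.
STOP_PRESETS = {
    "system_markers": "###",
    "multi_turn_qa": "\nQ:",
    "before_next_prompt": "User:",
    "limit_length": ["###", "\nUser:", "End"],
}


def build_chat_kwargs(scenario, messages, model="sarvam-105b"):
    """Assemble keyword arguments for a chat completion call."""
    return {
        "model": model,
        "messages": messages,
        "stop": STOP_PRESETS[scenario],  # raises KeyError for unknown scenarios
    }


kwargs = build_chat_kwargs(
    "multi_turn_qa",
    [{"role": "user", "content": "Q: What is a black hole?\nA:"}],
)
print(kwargs["stop"] == "\nQ:")
```

The resulting dictionary is meant to be unpacked into the call, for example client.chat.completions(**kwargs).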

First, install the SDK:

$ pip install -Uqq sarvamai

Then use the following Python code:

from sarvamai import SarvamAI

# Initialize the SarvamAI client with your API key
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Example 1: Using a single stop string
response = client.chat.completions(
    model="sarvam-105b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant. End your answers with ###."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    stop="###"  # Stop when "###" is reached
)

# Print the assistant's reply
print(response.choices[0].message.content)

from sarvamai import SarvamAI

# Initialize the SarvamAI client with your API key
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Example 2: Using a list of stop strings
response = client.chat.completions(
    model="sarvam-105b",
    messages=[
        {"role": "system", "content": "You are an expert answering user questions."},
        {"role": "user", "content": "Explain what a black hole is."}
    ],
    stop=["###", "\nUser:", "\nQ:"]  # Multiple stop sequences
)

# Print the assistant's reply
print(response.choices[0].message.content)