Sarvam Parse

Given a PDF, this API extracts structured data from the document. It returns a base64-encoded XML string containing the extracted data.

Request

This endpoint expects a multipart form containing a file.

pdf (file, required)

Upload the PDF file you want to parse. Send it as a form input in a multipart/form-data request. Note: Sarvam Parse currently supports only English PDFs.

page_number (string, optional, defaults to 1)

The page number you want to extract data from. This is a one-based index (meaning, the first page is 1).

sarvam_mode (enum, optional)

The mode of parsing to use:

small: Use this mode for economical and fast parsing
large: Use this mode for highest precision parsing

Allowed values: small, large

prompt_caching (enum, optional)

Whether to cache the prompt for the parse request. This is useful when running multiple requests to the parsing endpoint.

Allowed values:

Response

Successful Response

output (string, optional)

The base64-encoded HTML string corresponding to the parsed page. The output is an empty string if parsing fails.
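Because the `output` field arrives base64-encoded, a client has to decode it before it can read the markup. The helper below is a minimal sketch of that step. The function name `decode_output` is hypothetical, and it assumes the decoded bytes are UTF-8 text, which this reference does not state.

```python
import base64


def decode_output(output: str) -> str:
    """Decode the base64 `output` field into markup text.

    Assumption: the decoded bytes are UTF-8; the reference does not say.
    """
    if not output:
        # The reference says an empty string means parsing failed.
        raise ValueError("empty output: parsing failed for this page")
    return base64.b64decode(output).decode("utf-8")
```

Raising on an empty string is one possible design choice; a caller that prefers to skip failed pages could return `None` instead.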

curl -X POST https://api.sarvam.ai/parse/parsepdf \
  -H "api-subscription-key: <apiSubscriptionKey>" \
  -H "Content-Type: multipart/form-data" \
  -F pdf=@<file1>
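The curl call above could be written in Python roughly as follows. This is a sketch under several assumptions: it uses the third-party `requests` library, the helper names `build_form_fields` and `parse_pdf` are hypothetical, and it assumes the response body is JSON. The `prompt_caching` field is left out because its allowed values are not listed in this reference.

```python
from typing import Optional

PARSE_URL = "https://api.sarvam.ai/parse/parsepdf"  # endpoint from the curl example
SARVAM_MODES = {"small", "large"}  # the two modes listed in this reference


def build_form_fields(page_number: int = 1, sarvam_mode: Optional[str] = None) -> dict:
    """Return the non-file form fields for a parse request (hypothetical helper)."""
    if page_number < 1:
        raise ValueError("page_number is one-based; the first page is 1")
    fields = {"page_number": str(page_number)}  # documented as a string field
    if sarvam_mode is not None:
        if sarvam_mode not in SARVAM_MODES:
            raise ValueError(f"sarvam_mode must be one of {sorted(SARVAM_MODES)}")
        fields["sarvam_mode"] = sarvam_mode
    return fields


def parse_pdf(pdf_path: str, api_key: str, **options) -> dict:
    """POST one PDF page to the endpoint and return the response body.

    Assumption: the body is JSON; the reference only names an `output` field.
    """
    import requests  # third-party; imported lazily so the helper above works without it

    with open(pdf_path, "rb") as pdf_file:
        response = requests.post(
            PARSE_URL,
            headers={"api-subscription-key": api_key},
            files={"pdf": pdf_file},  # requests sets the multipart boundary itself
            data=build_form_fields(**options),
            timeout=60,
        )
    response.raise_for_status()
    return response.json()
```

A call might look like `parse_pdf("report.pdf", api_key, page_number=2, sarvam_mode="large")`, where `report.pdf` and `api_key` are placeholders.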
