The `ask` method lets you ask questions about your ingested documents and receive answers grounded in your content. The SDK supports conversational memory, so follow-up questions keep their context.
Method Overview
Sync Method
`client.sources.ask()`
Async Method
`await client.sources.ask()` (using `AsyncGraphor`)
Method Signature
Parameters
| Parameter | Type | Description | Required |
|---|---|---|---|
| `question` | `str` | The question to ask about your documents | Yes |
| `conversation_id` | `str` | Conversation identifier to maintain memory context across questions | No |
| `reset` | `bool` | When `True`, starts a new conversation and ignores previous history | No |
| `file_ids` | `list[str]` | Restrict search to specific documents by file ID (preferred) | No |
| `file_names` | `list[str]` | Restrict search to specific documents by file name (deprecated, use `file_ids`) | No |
| `output_schema` | `dict` | JSON Schema to request structured output (see below) | No |
| `thinking_level` | `str` | Controls model and thinking configuration: `"fast"`, `"balanced"`, `"accurate"` (default) | No |
| `include_citation_images` | `bool` | When `True`, populates `image_base64` (base64 PNG of the cited page) inside each `citations` entry. Default `False`. See When to use include_citation_images. | No |
| `include_citation_markup` | `bool` | When `True`, the `answer` field keeps the raw structured citation markup `[N](file_id\|pX\|sY\|eZ\|fNAME)` instead of stripping it down to plain `[N]` markers. Default `False`. The markup is an implementation detail; prefer parsing `citations`. Has no effect when `output_schema` is set. | No |
| `timeout` | `float` | Request timeout in seconds | No |
Thinking Level
The `thinking_level` parameter controls the model and thinking configuration used for answering questions:
| Value | Description |
|---|---|
| `"fast"` | Uses a faster model without extended thinking. Best for simple questions where speed is prioritized. |
| `"balanced"` | Uses a more capable model with low thinking. Good balance between quality and speed. |
| `"accurate"` | Default. Uses a more capable model with high thinking. Best for complex questions requiring deep reasoning. |
Response Object
The method returns a `SourceAskResponse` object:
| Property | Type | Description |
|---|---|---|
| `answer` | `str` | The answer to your question, with inline `[N]` markers pointing to entries in `citations`. When `output_schema` is provided, this will be a short status message. |
| `conversation_id` | `str \| None` | Conversation identifier for follow-up questions |
| `structured_output` | `dict \| None` | Structured output validated against the requested `output_schema`. Present only when `output_schema` is provided. |
| `raw_json` | `str \| None` | Raw JSON text produced by the model before validation. Present only when `output_schema` is provided. |
| `citations` | `list[Citation] \| None` | Structured references resolving each `[N]` marker in `answer`. May be empty/`None` when the agent did not ground its answer (e.g. small-talk follow-ups). See Citations. |
| `usage` | `Usage \| None` | Token usage breakdown for the request |
| `elapsed_s` | `float \| None` | Wall-clock time in seconds |
Citations
Each entry in `citations` corresponds to one `[N]` marker that appears in the answer text. Use `index` to map a marker to its citation.
| Field | Type | Description |
|---|---|---|
| `index` | `int` | The 1-based citation number that appears as `[N]` in `answer` |
| `file_id` | `str` | Unique identifier of the source file |
| `file_name` | `str` | Display name of the source file |
| `page_number` | `int` | 1-based page number where the cited content appears |
| `section_number` | `int \| None` | Optional section number within the page |
| `element_id` | `str \| None` | Optional element identifier (e.g. specific paragraph or table) |
| `text_preview` | `str \| None` | Short text excerpt around the cited content |
| `image_base64` | `str \| None` | Base64-encoded PNG of the cited page. Populated only when the call used `include_citation_images=True`. May be `None` if the source is not visualizable (e.g. plain text). |
When to use include_citation_images
The include_citation_images flag is convenient for quick prototyping or one-off requests where you want both the answer and the visual previews in a single round-trip.
For real applications, especially when answers commonly cite many pages, leave the flag as `False` (the default) and lazy-load the screenshots on demand via the dedicated `get_page_screenshot` method. Reasons:
- Payload size: each base64 PNG is typically 100-400 KB. Five citations can push the JSON response above a megabyte.
- Latency: rendering screenshots adds seconds to the response. Without the flag, the answer comes back as soon as the model finishes.
- Cache locality: the screenshot endpoint is keyed by `(file_id, page_number)` and emits a `Cache-Control` hint. Lazy-loading lets browsers and CDNs cache the bytes; inlining base64 prevents that.
Use `include_citation_images=True` only when you control both ends and know the answer will cite at most 1-2 pages. Otherwise, ship the answer with structured citations and fetch images on hover/click.
Code Examples
Basic Question
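A minimal sketch of a basic call, assuming `client` is an already-constructed SDK client. The helper name `ask_question` is hypothetical; `sources.ask`, `question`, and `answer` come from the tables above.

```python
def ask_question(client, question: str) -> str:
    """Ask a single question and return the answer text."""
    # `client` is assumed to be an already-configured SDK client.
    response = client.sources.ask(question=question)
    # `answer` may contain inline [N] markers that point into `citations`.
    return response.answer
```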
Conversation with Memory
Use `conversation_id` to maintain context across multiple questions:
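The sketch below shows one way to chain a follow-up onto an earlier question by reusing the returned `conversation_id`. The helper name `ask_with_follow_up` is hypothetical, and `client` is assumed to be an existing SDK client.

```python
def ask_with_follow_up(client, first: str, follow_up: str) -> tuple:
    """Ask two related questions in the same conversation."""
    first_response = client.sources.ask(question=first)
    # Reuse the returned conversation_id so the follow-up sees the earlier exchange.
    second_response = client.sources.ask(
        question=follow_up,
        conversation_id=first_response.conversation_id,
    )
    return first_response.answer, second_response.answer
```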
Reset Conversation
Start fresh by using the `reset` parameter:
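A small sketch of starting over with `reset=True`. The helper name `ask_fresh` is hypothetical; whether you also need to pass the old `conversation_id` alongside `reset` is not stated above, so this version omits it.

```python
def ask_fresh(client, question: str) -> tuple:
    """Start a new conversation and return (answer, new conversation_id)."""
    # reset=True asks the service to ignore any previous history.
    response = client.sources.ask(question=question, reset=True)
    # Keep the returned ID if you want follow-ups in the new conversation.
    return response.answer, response.conversation_id
```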
Filter by Specific Documents
Restrict the search to specific files using `file_ids` (preferred):
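As an illustration, the hypothetical helper below scopes a question to a list of file IDs. The empty-list check is a local design choice, not documented SDK behavior.

```python
def ask_about_files(client, question: str, file_ids: list) -> str:
    """Ask a question restricted to the given file IDs."""
    if not file_ids:
        # Guard locally: an empty filter is almost certainly a caller bug.
        raise ValueError("file_ids must contain at least one ID")
    response = client.sources.ask(question=question, file_ids=list(file_ids))
    return response.answer
```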
Working with Citations
Every grounded answer comes back with a structured `citations` array. The default flow is to fetch screenshots on demand: only render images for the citations the user actually inspects.
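A sketch of resolving the `[N]` markers in `answer` against `citations` by `index`. The function name `resolve_markers` is hypothetical; it only reads the `index`, `file_name`, and `page_number` fields described in the Citations table and does not fetch any screenshots.

```python
import re

# Matches plain [N] markers such as [1] or [12].
_MARKER = re.compile(r"\[(\d+)\]")

def resolve_markers(answer: str, citations) -> list:
    """Return (marker, file_name, page_number) in order of first appearance."""
    by_index = {c.index: c for c in (citations or [])}
    resolved, seen = [], set()
    for match in _MARKER.finditer(answer):
        number = int(match.group(1))
        citation = by_index.get(number)
        # Skip repeated markers and markers with no matching citation entry.
        if citation is None or number in seen:
            continue
        seen.add(number)
        resolved.append((number, citation.file_name, citation.page_number))
    return resolved
```

The resolved `(file_id, page_number)` pairs are what a UI would later hand to the screenshot endpoint when the user opens a citation.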
Inlining citation images in the response
When you really do want the screenshots in the same round-trip (for example, a one-off batch job that won't be re-rendered), set `include_citation_images=True`. Avoid this in interactive UIs and any flow where the answer typically cites many pages.
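For a batch job like that, the inlined images can be written straight to disk. This sketch assumes `image_base64` holds bare base64 (no `data:` URI prefix); the helper name and the `citation_<index>.png` file naming are hypothetical.

```python
import base64
from pathlib import Path

def save_citation_images(citations, out_dir) -> list:
    """Decode each citation's image_base64 into a PNG file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for citation in citations or []:
        # None means the flag was off or the source is not visualizable.
        if citation.image_base64 is None:
            continue
        path = out / f"citation_{citation.index}.png"
        path.write_bytes(base64.b64decode(citation.image_base64))
        saved.append(path)
    return saved
```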
Using Thinking Level
Control the model's reasoning depth with `thinking_level`:
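One way to pass `thinking_level` while catching typos early. The local validation and the helper name `ask_with_level` are assumptions for illustration; the three allowed values come from the Thinking Level table.

```python
THINKING_LEVELS = ("fast", "balanced", "accurate")

def ask_with_level(client, question: str, thinking_level: str = "accurate") -> str:
    """Ask a question with an explicit thinking level."""
    if thinking_level not in THINKING_LEVELS:
        # Fail before making a network call with an unknown value.
        raise ValueError(f"thinking_level must be one of {THINKING_LEVELS}")
    response = client.sources.ask(question=question, thinking_level=thinking_level)
    return response.answer
```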
Structured Output with JSON Schema
Request structured data by providing an `output_schema`:
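A minimal sketch of a structured request. The schema fields (`invoice_number`, `total`), the question text, and the helper name are hypothetical; the schema sticks to the features listed under Output Schema Guidelines.

```python
# Hypothetical schema: an object with one string and one nullable number.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": ["number", "null"]},
    },
}

def extract_invoice(client, file_ids: list):
    """Return the validated structured output, or None if it is absent."""
    response = client.sources.ask(
        question="Extract the invoice number and the total amount.",
        output_schema=INVOICE_SCHEMA,
        file_ids=file_ids,
    )
    # With output_schema set, `answer` is only a short status message.
    return response.structured_output
```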
Extract Array of Items
Extract multiple items with a schema:
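A sketch of an array extraction. Because `structured_output` is typed as a dict, the array is wrapped in an object here; the `action_items` field, its item shape, and the helper name are hypothetical.

```python
# Hypothetical schema: an object holding an array of action-item objects.
ACTION_ITEMS_SCHEMA = {
    "type": "object",
    "properties": {
        "action_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "owner": {"type": ["string", "null"]},
                },
            },
        }
    },
}

def extract_action_items(client, question: str = "List every action item with its owner.") -> list:
    """Return the extracted items, or an empty list when nothing came back."""
    response = client.sources.ask(question=question, output_schema=ACTION_ITEMS_SCHEMA)
    output = response.structured_output or {}
    return output.get("action_items", [])
```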
Async Usage
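A minimal async sketch, assuming `client` is an `AsyncGraphor` instance whose `sources.ask` is awaitable, as described in the Method Overview. The helper name `ask_async` is hypothetical.

```python
import asyncio

async def ask_async(client, question: str) -> str:
    """Await a single question on an async client and return the answer."""
    response = await client.sources.ask(question=question)
    return response.answer

# Usage (with a real async client): asyncio.run(ask_async(client, "..."))
```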
Error Handling
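One possible retry pattern with exponential backoff. The exception classes are passed in rather than imported, because their import path is not shown here; in practice `retryable` would likely hold types such as `RateLimitError`, `APITimeoutError`, and `APIConnectionError` from the Error Reference. The helper name, attempt count, and delays are assumptions.

```python
import time

def ask_with_retry(client, question: str, retryable: tuple,
                   attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call ask(), retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return client.sources.ask(question=question)
        except retryable:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x ... the base delay before the next try.
            sleep(base_delay * 2 ** attempt)
```

Errors such as `BadRequestError` or `AuthenticationError` are usually not worth retrying, so leave them out of `retryable`.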
Advanced Examples
Chatbot Class
Build a conversational chatbot:
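A sketch of a small wrapper that tracks `conversation_id` between turns. The class name `DocumentChatbot` and its attributes are hypothetical; it only uses the `question`, `conversation_id`, `reset`, and `file_ids` parameters documented above.

```python
class DocumentChatbot:
    """Keeps conversation state across calls to client.sources.ask()."""

    def __init__(self, client, file_ids=None):
        self.client = client
        self.file_ids = list(file_ids) if file_ids else None
        self.conversation_id = None
        self._reset_pending = False

    def ask(self, question: str) -> str:
        kwargs = {"question": question}
        if self.conversation_id is not None:
            kwargs["conversation_id"] = self.conversation_id
        if self._reset_pending:
            # Ask the service to start a new conversation on this turn.
            kwargs["reset"] = True
        if self.file_ids:
            kwargs["file_ids"] = self.file_ids
        response = self.client.sources.ask(**kwargs)
        # Remember the ID so the next turn continues this conversation.
        self.conversation_id = response.conversation_id
        self._reset_pending = False
        return response.answer

    def reset(self) -> None:
        """Forget the current conversation; the next ask() starts fresh."""
        self.conversation_id = None
        self._reset_pending = True
```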
Multi-Document Q&A
Ask questions across multiple documents:
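One way to compare documents is to ask the same question once per file. This is a sketch; the helper name `ask_each_document` is hypothetical, and passing several IDs in a single `file_ids` list is the alternative when you want one combined answer.

```python
def ask_each_document(client, question: str, file_ids: list) -> dict:
    """Ask the same question against each file and map file_id -> answer."""
    answers = {}
    for file_id in file_ids:
        # Scope each call to a single document so answers are not mixed.
        response = client.sources.ask(question=question, file_ids=[file_id])
        answers[file_id] = response.answer
    return answers
```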
Structured Data Extraction Pipeline
Extract structured data from multiple documents:
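A sketch of a per-file extraction loop that collects results and failures separately. The function name `run_extraction` is hypothetical, and the broad `except Exception` is a placeholder that should be narrowed to the SDK error types in the Error Reference.

```python
def run_extraction(client, file_ids: list, question: str, schema: dict) -> tuple:
    """Extract structured data per file; return (results, failures) dicts."""
    results, failures = {}, {}
    for file_id in file_ids:
        try:
            response = client.sources.ask(
                question=question,
                file_ids=[file_id],
                output_schema=schema,
            )
        except Exception as exc:  # placeholder: narrow to SDK errors in real code
            failures[file_id] = str(exc)
            continue
        if response.structured_output is None:
            failures[file_id] = "no structured output returned"
        else:
            results[file_id] = response.structured_output
    return results, failures
```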
Parallel Questions
Ask multiple questions in parallel:
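A sketch of parallel questions using `asyncio.gather` with a concurrency cap, assuming an async client such as `AsyncGraphor`. The helper name `ask_many` and the default limit of 3 are assumptions; pick a limit that respects your rate limits.

```python
import asyncio

async def ask_many(client, questions: list, max_concurrency: int = 3) -> list:
    """Ask several questions concurrently; answers keep the input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def ask_one(question: str) -> str:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            response = await client.sources.ask(question=question)
            return response.answer

    return await asyncio.gather(*(ask_one(q) for q in questions))
```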
Interactive Q&A Session
Build an interactive command-line Q&A:
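A sketch of a command-line loop. The function name `run_session`, the prompts, and the `quit`/`exit` commands are hypothetical; `read` and `write` are injectable so the loop can be tested without a terminal.

```python
def run_session(client, read=input, write=print) -> None:
    """Read questions until the user types 'quit' or 'exit'."""
    conversation_id = None
    while True:
        question = read("You: ").strip()
        if question.lower() in {"quit", "exit"}:
            break
        if not question:
            continue  # ignore empty lines
        kwargs = {"question": question}
        if conversation_id:
            # Continue the same conversation for follow-up questions.
            kwargs["conversation_id"] = conversation_id
        response = client.sources.ask(**kwargs)
        conversation_id = response.conversation_id
        write(f"Assistant: {response.answer}")
```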
Output Schema Guidelines
When using `output_schema`, follow these guidelines:
Supported Schema Features
- Basic types: `string`, `number`, `integer`, `boolean`, `null`
- Objects with `properties`
- Arrays with `items`
- Union with `null` only: `["string", "null"]`
Unsupported Features
- `oneOf`, `anyOf`, `allOf`
- `$ref` references
- Complex unions beyond `null`
Schema Examples
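The guidelines above can be checked locally before sending a request. The sketch below is a hypothetical pre-flight helper, not part of the SDK; it walks a schema dict and reports the unsupported keywords and multi-type unions. It is deliberately simple and would also flag a property that happens to be named `oneOf` or `$ref`.

```python
UNSUPPORTED_KEYS = {"oneOf", "anyOf", "allOf", "$ref"}

def find_schema_problems(schema, path: str = "$") -> list:
    """Return human-readable problems found in an output_schema candidate."""
    problems = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key in UNSUPPORTED_KEYS:
                problems.append(f"{path}: unsupported keyword {key!r}")
            if key == "type" and isinstance(value, list):
                # Only "one type plus null" unions are listed as supported.
                non_null = [t for t in value if t != "null"]
                if len(non_null) > 1:
                    problems.append(f"{path}: union {value!r} goes beyond a single type plus null")
            problems.extend(find_schema_problems(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for i, item in enumerate(schema):
            problems.extend(find_schema_problems(item, f"{path}[{i}]"))
    return problems
```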
Error Reference
| Error Type | Status Code | Description |
|---|---|---|
| `BadRequestError` | 400 | Invalid parameters or request format |
| `AuthenticationError` | 401 | Invalid or missing API key |
| `NotFoundError` | 404 | Specified file not found |
| `UnprocessableEntityError` | 422 | Invalid `output_schema` or structured output validation failed |
| `RateLimitError` | 429 | Too many requests, please retry after waiting |
| `InternalServerError` | ≥500 | Server-side error |
| `APIConnectionError` | N/A | Network connectivity issues |
| `APITimeoutError` | N/A | Request timed out |
Best Practices
- Use conversation memory: Pass `conversation_id` for follow-up questions to maintain context
- Be specific: Clear, specific questions get better answers
- Scope when needed: Use `file_ids` (or the deprecated `file_names`) to focus on specific documents for faster, more accurate responses
- Use structured output for integration: Provide `output_schema` to get JSON you can reliably parse in code
- Reset when changing topics: Set `reset=True` when switching to unrelated questions
- Lazy-load citation images: Keep `include_citation_images=False` (the default) and call `get_page_screenshot` on demand. Inlining base64 only makes sense for low-citation, low-frequency requests; for typical chat UIs it bloats the payload by hundreds of KB per cited page.
- Parse `citations`, not the markup: Use the structured `citations` array. The inline `[N](file_id|pX|...)` markup is hidden by default and is an implementation detail that may change.
- Handle errors gracefully: Implement proper error handling for production applications
Next Steps
Get Page Screenshot
Lazy-load citation page previews on demand
Document Chat Guide
Learn best practices for chatting with your documents
Extract API
Extract structured data from documents
Upload Sources
Upload documents to chat with
Prebuilt RAG
Retrieve relevant chunks from your documents

