The get_elements method (same name as the API endpoint) returns the parsed elements of a source. Each item is a BuildStatusElement with element_id, element_type, text, markdown, html, optional img_base64, position, page_number, bounding_box, page_layout, and more. Identify the source by its file_id, which you can obtain from list sources or get build status.
Method overview
client.sources.get_elements()
Method signature
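The call shape can be sketched as a structural type built only from the Parameters table in this section. The class name SupportsGetElements, the keyword-only style, the None defaults, and the Any return type are all assumptions made for illustration; they are not taken from the SDK source.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsGetElements(Protocol):
    """Hypothetical structural type for client.sources; not part of the SDK."""

    def get_elements(
        self,
        *,
        file_id: str,                                    # required
        page: Optional[int] = None,                      # 1-based, use with page_size
        page_size: Optional[int] = None,                 # 1-100
        suppress_img_base64: Optional[bool] = None,      # omit img_base64 when true
        type: Optional[str] = None,                      # e.g. "Title", "Table"
        page_numbers: Optional[list[int]] = None,        # restrict to these pages
        elements_to_remove: Optional[list[str]] = None,  # element types to exclude
        timeout: Optional[float] = None,                 # seconds
    ) -> Any:
        ...
```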
Parameters
| Parameter | Type | Description | Required |
|---|---|---|---|
| file_id | str | Unique identifier of the source | Yes |
| page | int \| None | 1-based page number (use with page_size) | No |
| page_size | int \| None | Elements per page (1–100) | No |
| suppress_img_base64 | bool | When true, omit img_base64 from each element | No |
| type | str \| None | Filter by element type (e.g. Title, NarrativeText, Table) | No |
| page_numbers | list[int] \| None | Restrict to specific page numbers | No |
| elements_to_remove | list[str] \| None | Element types to exclude | No |
| timeout | float | Request timeout in seconds | No |
Filter parameters
All filter parameters are passed at the top level (not as a nested object).

| Parameter | Python | TypeScript | Description |
|---|---|---|---|
| Element type filter | type | type | Filter by element type (e.g. Title, NarrativeText, Table) |
| Page number filter | page_numbers | page_numbers | Restrict to specific page numbers |
| Exclude types | elements_to_remove | elementsToRemove | Element types to exclude |
Response
Paginated response with BuildStatusElement items (same shape as elements in Get build status):

| Field | Type | Description |
|---|---|---|
| items | list | Elements in the current page (or all if no pagination) |
| total | int | Total elements matching filters |
| page | int \| null | Current page (1-based) or null |
| page_size | int \| null | Elements per page or null |
| total_pages | int \| null | Total pages or null |
BuildStatusElement (each item)
| Field | Type | Description |
|---|---|---|
| element_id | str \| null | Unique identifier for the element |
| element_type | str \| null | e.g. Title, NarrativeText, Table, Image |
| text | str | Plain text content |
| markdown | str \| null | Markdown when available |
| html | str \| null | HTML when available |
| img_base64 | str \| null | Base64 image (omitted if suppress_img_base64=true) |
| position | int \| null | Order within the document |
| page_number | int \| null | Page number (1-based) |
| bounding_box | object \| null | Bounding box (left, top, width, height) |
| page_layout | object \| null | Page dimensions |
| page_annotation | str \| null | Page-level annotation |
| page_keywords | array \| null | Keywords for the page |
| page_topics | array \| null | Topics for the page |
| metadata | object | Additional metadata |
Element Types
| Type | Description |
|---|---|
| Title | Document and section titles |
| NarrativeText | Main body paragraphs and content |
| ListItem | Items in bullet points or numbered lists |
| Table | Complete data tables |
| TableRow | Individual rows within tables |
| Image | Picture or graphic elements |
| Header | Header content at top of pages |
| Footer | Footer content at bottom of pages |
| Formula | Mathematical formulas and equations |
| CompositeElement | Elements containing multiple types |
| FigureCaption | Text describing images or figures |
| PageBreak | Indicators of page separation |
| Address | Physical address information |
| EmailAddress | Email contact information |
| PageNumber | Page numbering elements |
| CodeSnippet | Programming code segments |
| FormKeysValues | Key-value pairs in forms |
| Link | Hyperlinks and references |
| UncategorizedText | Text that doesn't fit other categories |
Code Examples
Basic usage
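A minimal sketch of a basic call, assuming `client` is an already constructed SDK client (its construction is not covered in this section) and that the response and its items expose the fields from the tables above as attributes. The helper name print_elements is hypothetical.

```python
def print_elements(client, file_id: str) -> int:
    """Fetch the elements of one source and print a one-line summary of each."""
    response = client.sources.get_elements(file_id=file_id)
    for element in response.items:
        # text is always a string; element_type and page_number may be None.
        preview = element.text[:60]
        print(f"[{element.element_type}] p.{element.page_number}: {preview}")
    # total counts every element that matched, not only those on this page.
    return response.total
```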
Filter by element type
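Filtering by type could look like the sketch below. It passes the documented `type` parameter at the top level; `client` is assumed to be an existing SDK client and get_titles is a hypothetical helper name.

```python
def get_titles(client, file_id: str) -> list[str]:
    """Return the text of every Title element in a source."""
    # Without page/page_size, items holds all matching elements.
    response = client.sources.get_elements(file_id=file_id, type="Title")
    return [element.text for element in response.items]
```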
Filter by page numbers
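As an illustration of the `page_numbers` filter, the following sketch groups element text by page. The helper name get_text_by_page is hypothetical, and `client` is assumed to be an existing SDK client.

```python
from typing import Optional


def get_text_by_page(client, file_id: str, pages: list[int]) -> dict[Optional[int], list[str]]:
    """Fetch elements from the given pages and group their text by page number."""
    response = client.sources.get_elements(file_id=file_id, page_numbers=pages)
    by_page: dict[Optional[int], list[str]] = {}
    for element in response.items:
        # page_number may be None; such elements are grouped under the None key.
        by_page.setdefault(element.page_number, []).append(element.text)
    return by_page
```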
Exclude element types
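Excluding types might be used to drop page furniture before further processing. In this sketch the PAGE_FURNITURE list and the helper name are illustrative choices; the type names come from the Element Types table, and `client` is assumed to be an existing SDK client.

```python
# Element types that rarely carry body content (an illustrative choice).
PAGE_FURNITURE = ["Header", "Footer", "PageNumber", "PageBreak"]


def get_body_elements(client, file_id: str) -> list:
    """Return all elements except headers, footers, page numbers, and page breaks."""
    response = client.sources.get_elements(
        file_id=file_id,
        elements_to_remove=PAGE_FURNITURE,
    )
    return list(response.items)
```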
Combine filters
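Because all filters are top-level parameters, they can be combined in one call. The sketch below requests tables from selected pages, without image payloads, one page of results at a time. The helper name get_tables_on_pages and the default page size are assumptions.

```python
def get_tables_on_pages(client, file_id: str, pages: list[int], page_size: int = 50) -> list:
    """Return the first page of Table elements found on the given document pages."""
    response = client.sources.get_elements(
        file_id=file_id,
        type="Table",              # only tables
        page_numbers=pages,        # only these document pages
        suppress_img_base64=True,  # skip base64 image payloads
        page=1,                    # first page of results
        page_size=page_size,
    )
    return list(response.items)
```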
Async usage
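A hedged sketch of async usage. It assumes the SDK offers an async client whose sources.get_elements is awaitable and accepts the same parameters; that client is passed in as `async_client` and its class name is not shown because it is not stated in this section. The helper name is hypothetical.

```python
import asyncio


async def fetch_element_count(async_client, file_id: str) -> int:
    """Return how many elements a source has, fetching as little data as possible."""
    response = await async_client.sources.get_elements(
        file_id=file_id,
        page=1,
        page_size=1,  # total reflects all matches regardless of page size
        suppress_img_base64=True,
    )
    return response.total


# Usage, given an async client instance:
#     count = asyncio.run(fetch_element_count(async_client, "your-file-id"))
```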
Paginate through all elements
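Pagination can be wrapped in a generator that follows `total_pages` from the response. This is a sketch under the assumption that response fields are readable as attributes; iter_all_elements is a hypothetical helper, and any extra keyword arguments are forwarded as filters.

```python
from typing import Any, Iterator


def iter_all_elements(client, file_id: str, page_size: int = 50, **filters: Any) -> Iterator[Any]:
    """Yield every element of a source, requesting one page at a time."""
    page = 1
    while True:
        response = client.sources.get_elements(
            file_id=file_id, page=page, page_size=page_size, **filters
        )
        yield from response.items
        # Stop on an empty page, a missing total_pages, or the last page.
        if not response.items or not response.total_pages or page >= response.total_pages:
            break
        page += 1
```

Callers can then write `for element in iter_all_elements(client, file_id, type="Table"): ...` without handling page numbers themselves.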
Error handling
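One way to handle transient failures is a small retry wrapper. Since this section does not state where the SDK's exception classes are imported from, the sketch takes them as a `retry_on` argument; a caller would pass classes such as RateLimitError, APIConnectionError, and APITimeoutError from the Error Reference. The helper name, attempt count, and backoff values are assumptions.

```python
import time
from typing import Any, Tuple, Type


def get_elements_with_retry(
    client,
    file_id: str,
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    **kwargs: Any,
):
    """Call get_elements, retrying the given exception types with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return client.sources.get_elements(file_id=file_id, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Errors that are not listed in `retry_on`, such as a 404 for an unknown file_id, propagate immediately because retrying them would not help.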
Advanced Examples
Document Structure Analyzer
Analyze the structure of a document:
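A structure analysis could be sketched as a pure function over elements that were already fetched (for example `response.items`). The function name and the keys of the returned dictionary are illustrative, not part of the SDK.

```python
from collections import Counter
from typing import Any, Iterable


def analyze_structure(elements: Iterable[Any]) -> dict:
    """Summarize element counts by type, pages covered, and total text length."""
    elements = list(elements)  # allow generators; we iterate more than once
    type_counts = Counter(el.element_type or "Unknown" for el in elements)
    pages = {el.page_number for el in elements if el.page_number is not None}
    return {
        "total_elements": len(elements),
        "type_counts": dict(type_counts),
        "page_count": len(pages),
        "total_characters": sum(len(el.text) for el in elements),
    }
```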
Extract Tables
Extract all tables from a document:
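Table extraction might look like the following sketch. It prefers the `html` field, then `markdown`, then plain `text`, which is one reasonable ordering rather than a documented rule. The helper name and output keys are hypothetical.

```python
def extract_tables(client, file_id: str) -> list[dict]:
    """Return one record per Table element with its richest available content."""
    response = client.sources.get_elements(
        file_id=file_id, type="Table", suppress_img_base64=True
    )
    tables = []
    for element in response.items:
        tables.append({
            "element_id": element.element_id,
            "page_number": element.page_number,
            # html and markdown may be None, so fall back to plain text.
            "content": element.html or element.markdown or element.text,
        })
    return tables
```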
Build Document Outline
Create a document outline from titles:
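An outline can be sketched by fetching Title elements and ordering them by `position`. The element fields carry no heading-level information, so this produces a flat list. The helper name and the output format are assumptions.

```python
def build_outline(client, file_id: str) -> list[str]:
    """Return title lines in document order, each prefixed with its page number."""
    response = client.sources.get_elements(file_id=file_id, type="Title")
    # position may be None; sort such titles after the positioned ones.
    titles = sorted(
        response.items,
        key=lambda el: (el.position is None, el.position or 0),
    )
    return [f"p.{el.page_number}: {el.text.strip()}" for el in titles]
```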
Search Content in Elements
Search for specific content within document elements:
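A simple client-side search over fetched elements could be sketched as a substring match on the `text` field. This runs locally on elements you already hold; it is not a server-side search feature. The function name is hypothetical.

```python
from typing import Any, Iterable


def search_elements(elements: Iterable[Any], query: str, case_sensitive: bool = False) -> list:
    """Return the elements whose text contains the query string."""
    needle = query if case_sensitive else query.lower()
    matches = []
    for element in elements:
        haystack = element.text if case_sensitive else element.text.lower()
        if needle in haystack:
            matches.append(element)
    return matches
```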
Async Batch Processing
Process multiple documents concurrently:
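Concurrent processing can be sketched with asyncio.gather and a semaphore that caps in-flight requests. As with the other async sketches, this assumes an async client with an awaitable sources.get_elements; the helper name and the default concurrency limit are illustrative.

```python
import asyncio


async def count_elements_for_sources(
    async_client, file_ids: list[str], max_concurrency: int = 5
) -> dict[str, int]:
    """Fetch the element total of several sources concurrently."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def count_one(file_id: str) -> tuple[str, int]:
        async with semaphore:  # at most max_concurrency requests at a time
            response = await async_client.sources.get_elements(
                file_id=file_id, page=1, page_size=1, suppress_img_base64=True
            )
            return file_id, response.total

    results = await asyncio.gather(*(count_one(fid) for fid in file_ids))
    return dict(results)
```

Capping concurrency keeps a large batch from triggering the 429 rate limit listed in the Error Reference.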
Document Comparator
Compare element structure between documents:
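A structural comparison could be sketched by counting element types in each document and reporting only the types whose counts differ. The function name and the shape of the result are assumptions; inputs are element lists that were already fetched.

```python
from collections import Counter
from typing import Any, Iterable


def compare_structure(elements_a: Iterable[Any], elements_b: Iterable[Any]) -> dict[str, tuple[int, int]]:
    """Map each element type whose count differs to (count in A, count in B)."""
    counts_a = Counter(el.element_type or "Unknown" for el in elements_a)
    counts_b = Counter(el.element_type or "Unknown" for el in elements_b)
    return {
        element_type: (counts_a[element_type], counts_b[element_type])
        for element_type in sorted(set(counts_a) | set(counts_b))
        if counts_a[element_type] != counts_b[element_type]
    }
```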
Error Reference
| Error Type | Status Code | Description |
|---|---|---|
| BadRequestError | 400 | Invalid request payload or parameters |
| AuthenticationError | 401 | Invalid or missing API key |
| NotFoundError | 404 | Source not found for the given file_id |
| RateLimitError | 429 | Too many requests, please retry after waiting |
| InternalServerError | ≥500 | Server-side error processing request |
| APIConnectionError | N/A | Network connectivity issues |
| APITimeoutError | N/A | Request timed out |
Best Practices
Performance Optimization
- Use appropriate page sizes: Start with 20-50 elements per page for optimal performance
- Filter server-side: Use filter parameters to reduce data transfer
- Cache results: Store element data locally for repeated access
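The caching advice above can be sketched as a small in-memory wrapper keyed by file_id and filters. ElementCache is a hypothetical class, not an SDK feature, and it never expires entries, so it suits sources that are not being reprocessed.

```python
from typing import Any


class ElementCache:
    """Cache get_elements responses in memory, keyed by file_id and filters."""

    def __init__(self, client) -> None:
        self._client = client
        self._cache: dict = {}

    def get(self, file_id: str, **filters: Any):
        # repr() makes list-valued filters (e.g. page_numbers) usable in a dict key.
        key = (file_id, tuple(sorted((name, repr(value)) for name, value in filters.items())))
        if key not in self._cache:
            self._cache[key] = self._client.sources.get_elements(file_id=file_id, **filters)
        return self._cache[key]

    def clear(self) -> None:
        """Drop all cached responses, for example after reprocessing a source."""
        self._cache.clear()
```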
Data Processing
- Element type awareness: Different element types need different processing
- Use HTML field: The text_as_html field preserves formatting
- Handle None metadata: Always check if metadata exists before accessing
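Type-aware processing and defensive metadata access could be sketched as below. The sketch reads the `html` and `markdown` fields listed in the BuildStatusElement table, and it assumes `metadata` behaves like a dictionary when present. Both helper names and the choice of which types to skip are illustrative.

```python
from typing import Any

# Types treated as page furniture in this sketch (an illustrative choice).
SKIPPED_TYPES = ("Header", "Footer", "PageNumber", "PageBreak")


def element_content(element) -> str:
    """Pick the most useful representation of an element based on its type."""
    if element.element_type in SKIPPED_TYPES:
        return ""
    if element.element_type == "Table":
        # Prefer formats that keep the table structure.
        return element.html or element.markdown or element.text
    return element.markdown or element.text


def metadata_value(element, key: str, default: Any = None) -> Any:
    """Read one metadata key, tolerating missing or None metadata."""
    metadata = getattr(element, "metadata", None) or {}
    return metadata.get(key, default)
```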
Memory Management
- Stream large documents: Process in chunks rather than loading all at once
- Clear processed data: Remove unnecessary fields when not needed
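Chunked processing can be sketched as a loop that keeps only a running aggregate instead of the full element list. The word count here is just an example workload, and the helper name and default page size are assumptions.

```python
def count_words_streaming(client, file_id: str, page_size: int = 25) -> int:
    """Count words across a source while holding only one page in memory."""
    total_words = 0
    page = 1
    while True:
        response = client.sources.get_elements(
            file_id=file_id,
            page=page,
            page_size=page_size,
            suppress_img_base64=True,  # drop the largest field up front
        )
        total_words += sum(len(element.text.split()) for element in response.items)
        # Stop on an empty page, a missing total_pages, or the last page.
        if not response.items or not response.total_pages or page >= response.total_pages:
            return total_words
        page += 1
```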
Troubleshooting
Slow response times
Causes: Large page sizes, complex filters, or server load
Solutions:
- Reduce page_size to 25-50 elements
- Use specific filters to reduce result set
- Implement request timeouts
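The suggestions above might be combined in a single call, as in this sketch. It uses only parameters from the Parameters table; the helper name and the default values are assumptions to tune for your workload.

```python
def get_first_page_quickly(client, file_id: str, page_size: int = 25, timeout: float = 30.0):
    """Request a small first page with a timeout and without image payloads."""
    return client.sources.get_elements(
        file_id=file_id,
        page=1,
        page_size=page_size,       # smaller pages respond faster
        suppress_img_base64=True,  # less data to transfer
        timeout=timeout,           # seconds before the request is abandoned
    )
```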
Empty results
Causes: Source not processed, incorrect file_id, or overly restrictive filters
Solutions:
- Verify source is processed (status Completed) with client.sources.list()
- Use file_id from list or get build status
- Remove or relax filter criteria
Missing expected elements
Causes: Processing method limitations, file format issues, or filter conflicts
Solutions:
- Try a different partition method using client.sources.reprocess()
- Check if elements are categorized under different types
- Remove elements_to_remove filter temporarily
Memory issues with large documents
Causes: Processing too many elements at once
Solutions:
- Reduce page_size and process incrementally
- Filter out unnecessary element types
- Use streaming processing patterns
Next steps
After retrieving elements:
- Get build status: Poll build status and get elements for a build
- List sources: List all sources and their file_ids
- Upload: Ingest files, URLs, GitHub, or YouTube
- Reprocess source: Re-process a source with a different partition method
- Delete source: Remove a source by file_id

