The Python SDK supports Python 3.9+ and offers both synchronous and asynchronous clients.
- Data Ingestion (Sources): Ingest, poll status, list, get elements, and manage documents by file_id
- Document Chat: Ask questions about your documents with conversational memory
- Data Extraction: Extract structured data using JSON Schema
- Prebuilt RAG: Retrieve relevant document chunks for custom RAG pipelines
Python SDK Repository
View the Python SDK source code, report issues, and contribute.
TypeScript SDK Repository
View the TypeScript SDK source code, report issues, and contribute.
Installation
Install the Graphor SDK from PyPI:
Python 3.9 or higher is required.
Data Ingestion (Sources)
The Sources methods cover the full ingestion lifecycle:
Ingest Source
Ingest documents from files, URLs, GitHub, and YouTube (returns build_id; poll for file_id)
Reprocess Source
Reprocess an existing source with a different partition method
List Sources
Retrieve all sources with status and metadata
List Source Elements
Retrieve structured elements/partitions from processed sources
Delete Source
Permanently remove sources from your project
Document Chat
Once your data is ingested, use the Chat method to ask questions:
Chat with Documents
Ask natural language questions about your documents with conversational memory and structured outputs
Data Extraction
Extract specific structured data from your documents using schemas:
Extract Structured Data
Extract structured information from documents using JSON Schema and natural language instructions
Prebuilt RAG
Build custom RAG pipelines with semantic document retrieval:
Retrieve Document Chunks
Retrieve relevant document chunks using semantic search for custom LLM integration
What “Data Ingestion” includes
- Ingest: Create a new source (file, URL, GitHub, YouTube); returns build_id; poll get build status until ready, then use file_id
- Reprocess: Reprocess an existing source with a different partition method (optional)
- List: Monitor status and metadata; optionally filter by file_ids
- Get elements: Retrieve structured elements/partitions by file_id after processing
- Delete: Remove a source by file_id
Authentication
All SDK methods require authentication using API tokens. You can provide your API key in two ways:
Environment Variable (Recommended)
Set the GRAPHOR_API_KEY environment variable:
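As a minimal sketch of the environment-variable approach, the helper below reads GRAPHOR_API_KEY at startup and fails early with a clear message when it is missing. The helper name `load_api_key` is hypothetical and not part of the SDK; only the variable name comes from this guide.

```python
import os


def load_api_key(env_var: str = "GRAPHOR_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    # Hypothetical helper for illustration; not part of the Graphor SDK.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before creating a client")
    return key
```

Checking the key once at startup surfaces a configuration problem immediately rather than as an authentication error on the first request.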
Direct Initialization
Learn how to generate and manage API tokens in the API Tokens guide.
Token Security
- Never expose tokens in client-side code or public repositories
- Use environment variables to store tokens securely
- Rotate tokens regularly for enhanced security
- Use different tokens for different environments (dev/staging/prod)
Async Usage
Simply import AsyncGraphor instead of Graphor and use await with each API call:
Available Methods
Sources
| Method | Description |
|---|---|
| client.sources.ingest_file() | Ingest a local file (returns build_id) |
| client.sources.ingest_url() | Ingest from a web URL |
| client.sources.ingest_github() | Ingest from GitHub |
| client.sources.ingest_youtube() | Ingest from YouTube |
| client.sources.get_build_status() | Poll build status; returns file_id when ready |
| client.sources.reprocess() | Reprocess a source by file_id (returns build_id) |
| client.sources.list() | List all sources (optional file_ids filter) |
| client.sources.get_elements() | Get parsed elements by file_id |
| client.sources.delete() | Delete a source by file_id |
Chat & Extraction
| Method | Description |
|---|---|
| client.sources.ask() | Ask questions about your documents |
| client.sources.extract() | Extract structured data using JSON Schema |
| client.sources.retrieve_chunks() | Retrieve relevant chunks for custom RAG |
Complete Workflow Example
Here’s the full “happy path”: ingest → get_build_status (poll) → list → get_elements → chat/extract/rag; optionally reprocess by file_id.
1. Ingest a source
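Ingestion returns a build_id, and the file_id only becomes available once the build finishes, so callers typically poll. The sketch below shows one way to structure that loop. It takes a `fetch_status` callable instead of calling the SDK directly, because the exact arguments and response shape of client.sources.get_build_status() are not shown in this guide. The names `wait_for_file_id` and `fetch_status`, and the assumption that a status is a dict with an optional "file_id" entry, are all hypothetical.

```python
import time
from typing import Callable, Optional


def wait_for_file_id(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it reports a file_id or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        file_id: Optional[str] = status.get("file_id")  # assumed response field
        if file_id:
            return file_id
        if time.monotonic() >= deadline:
            raise TimeoutError("build did not produce a file_id in time")
        sleep(interval)
```

In practice `fetch_status` would wrap a call to client.sources.get_build_status() and convert its result to this shape. A production loop should probably also stop as soon as the build reports a failure instead of waiting for the timeout.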
2. Reprocess (optional)
3. List sources
4. Get elements (after processing)
5. Ask questions (Chat)
6. Extract data
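Extraction is driven by a JSON Schema. As an illustration only, the schema below describes a hypothetical invoice. The field names are invented for the example, and the way the schema is passed to client.sources.extract() is deliberately left out because its parameters are not listed in this guide.

```python
import json

# Hypothetical schema for an invoice; the field names are illustrative only.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "format": "date"},
        "total_amount": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "unit_price": {"type": "number"},
                },
                "required": ["description", "quantity", "unit_price"],
            },
        },
    },
    "required": ["invoice_number", "total_amount"],
}

# Serialize once to confirm the schema round-trips as valid JSON.
schema_json = json.dumps(invoice_schema, indent=2)
```

Keeping the "required" list short and the property types explicit tends to make extraction results easier to validate downstream.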
7. Retrieve chunks (Prebuilt RAG)
Integration Patterns
Complete SDK Client Wrapper
Async Integration
Error Handling
The SDK provides typed exceptions for different error scenarios:
Error Types
| Status Code | Error Type | Description |
|---|---|---|
| 400 | BadRequestError | Invalid parameters or malformed request |
| 401 | AuthenticationError | Invalid or missing API key |
| 403 | PermissionDeniedError | Access denied to resource |
| 404 | NotFoundError | Resource doesn’t exist |
| 422 | UnprocessableEntityError | Validation error |
| 429 | RateLimitError | Too many requests |
| ≥500 | InternalServerError | Server-side error |
| N/A | APIConnectionError | Network connectivity issues |
| N/A | APITimeoutError | Request timed out |
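The table can also be read as a simple lookup from HTTP status to error type, which is handy for logging or for writing tests against mocked responses. The sketch below encodes that lookup using only the names listed in the table; `error_name_for_status` is a hypothetical helper and not SDK code.

```python
from typing import Optional

# Status codes and error names copied from the table above.
_STATUS_TO_ERROR = {
    400: "BadRequestError",
    401: "AuthenticationError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    422: "UnprocessableEntityError",
    429: "RateLimitError",
}


def error_name_for_status(status_code: int) -> Optional[str]:
    """Return the error type the table lists for an HTTP status, if any."""
    if status_code >= 500:
        return "InternalServerError"
    return _STATUS_TO_ERROR.get(status_code)
```

APIConnectionError and APITimeoutError have no status code, so they fall outside this mapping.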
Configuration
Retries
Certain errors are automatically retried 2 times by default with exponential backoff:
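To make the retry behaviour concrete, here is a generic sketch of retrying with exponential backoff. It is not the SDK's implementation: the base delay, the doubling factor, the cap, the choice of retryable exceptions, and the `retry_with_backoff` helper itself are assumptions chosen for illustration. Only the default of 2 retries comes from the text above.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 2,
    base_delay: float = 0.5,  # assumed value
    max_delay: float = 8.0,   # assumed value
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a retryable error wait base_delay * 2**attempt, then retry."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise RuntimeError("unreachable")
```

With the defaults, a call that keeps failing is attempted three times in total, with waits of 0.5 and 1.0 seconds in between.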
Timeouts
By default, requests time out after 1 minute:
Using aiohttp for Better Concurrency (Python only)
For high-concurrency async operations in Python, use the aiohttp client:
Rate Limits and Best Practices
Performance Guidelines
- Batch Operations: Process multiple files sequentially or with controlled concurrency
- Async Processing: Use AsyncGraphor (Python) or Promise.all (TypeScript) for concurrent operations
- Retry Logic: The SDK handles retries automatically; configure max_retries/maxRetries as needed
- Timeout Handling: Increase timeouts for large documents or complex processing
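"Controlled concurrency" from the guidelines above can be sketched with a semaphore that caps how many operations run at once. The helper below is generic asyncio code, not part of the SDK; `run_bounded`, `worker`, and the default limit of 4 are assumptions for illustration. In practice `worker` would be a coroutine that awaits an AsyncGraphor call.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    limit: int = 4,  # assumed default; tune to your rate limits
) -> List[R]:
    """Apply worker to every item with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: T) -> R:
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the input items.
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```

A bounded fan-out like this keeps throughput high while making it less likely that a large batch trips the rate limiter.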
Best Practices
Common Use Cases
Document Processing Pipeline
Q&A System
Custom RAG with Your LLM
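In a custom RAG setup, the retrieved chunks are typically folded into a prompt for your own LLM. The sketch below shows one way to do that step. It assumes the chunk texts have already been pulled out as plain strings, since the shape of what client.sources.retrieve_chunks() returns is not shown in this guide; `build_prompt` and its character budget are hypothetical.

```python
from typing import Sequence


def build_prompt(question: str, chunks: Sequence[str], max_chars: int = 4000) -> str:
    """Assemble a grounded prompt from retrieved chunk texts, within a size budget."""
    context_parts = []
    used = 0
    for index, text in enumerate(chunks, start=1):
        entry = f"[{index}] {text.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Numbering the chunks lets the model cite which passage supported its answer, and the budget keeps the prompt within your model's context window.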
Support and Resources
Getting Help
Contact Support
Direct support for technical questions and issues
API Tokens Guide
Learn how to generate and manage authentication tokens
Data Ingestion Guide
Best practices for document upload and processing
REST API Reference
Full REST API documentation for advanced use cases
Next Steps
Ready to start building with the Graphor SDK? Choose your path:
For Beginners
Ingest Sources
Ingest documents from files, URLs, GitHub, and YouTube; poll for file_id
Chat with Documents
Ask natural language questions about your documents
API Tokens
Set up authentication for API access
For Advanced Users
Data Extraction
Extract structured data using JSON Schema
Prebuilt RAG
Build custom RAG pipelines with semantic search
Reprocess Source
Reprocess sources with different partition methods
List Elements
Access structured document elements and metadata

