- Data Ingestion (Sources API): Ingest data (async), poll build status, then list sources and retrieve elements.
- Document Chat (Chat API): Ask questions about your ingested documents using natural language.
- Data Extraction (Extract API): Extract specific structured data from documents using custom schemas.
Data Ingestion (Sources API)
The Sources API covers the full ingestion lifecycle. Ingestion is asynchronous: ingest endpoints return a build_id; you poll Get build status until the job completes, then use the returned file_id for list, elements, chat, and extraction.
Upload & ingest
Ingest files, URLs, GitHub, or YouTube (async). Poll build status for completion.
Get build status
Poll status and optional parsed elements for an async ingestion or reprocess
Reprocess source
Re-process an existing source with a different partition method (async)
List sources
List all sources with status and metadata (optional filter by file_id)
Get elements
Retrieve parsed elements (chunks) of a source (same format as build status elements)
Delete source
Permanently remove a source from your project
Document Chat (Chat API)
Once your data is ingested, use the Chat API to ask questions:
Chat with Documents
Ask natural language questions about your documents with conversational memory and structured outputs
Data Extraction (Extract API)
Extract specific structured data from your documents using schemas:
Extract Structured Data
Extract structured information from documents using custom schemas and natural language instructions
What “Data Ingestion” includes
- Ingest: create a new source via ingest-file, ingest-url, ingest-github, or ingest-youtube (async; returns a build_id)
- Build status: poll GET /builds/{build_id} until the job completes; then use file_id for other calls
- Reprocess: re-run the pipeline on an existing source with a different partition method (async; returns a build_id)
- List: list all sources (optionally filter by file_ids) and monitor status
- Get elements: retrieve parsed elements (chunks) for a source by file_id (GET with query params)
- Delete: remove a source by file_id (required)
Authentication
All API endpoints require authentication using API tokens. Include your token in the Authorization header. Learn how to generate and manage API tokens in the API Tokens guide.
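A minimal sketch of attaching the token to each request. The Bearer scheme and the GRAPHORLM_API_TOKEN environment variable name are assumptions; adapt them to your setup:

```python
import os
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def auth_headers(token: str = "") -> dict:
    # Fall back to an environment variable; the variable name
    # GRAPHORLM_API_TOKEN and the Bearer scheme are assumptions.
    token = token or os.environ.get("GRAPHORLM_API_TOKEN", "")
    return {"Authorization": f"Bearer {token}"}

def authed_request(path: str, method: str = "GET", data=None) -> urllib.request.Request:
    # Build a request against the documented base URL with the auth header set.
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    for key, value in auth_headers().items():
        req.add_header(key, value)
    return req
```

Keeping header construction in one helper makes token rotation a one-line change.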
Token Security
- Never expose tokens in client-side code or public repositories
- Use environment variables to store tokens securely
- Rotate tokens regularly for enhanced security
- Use different tokens for different environments (dev/staging/prod)
URL Structure
All endpoints use the base URL https://sources.graphorlm.com.
Sources (Data Ingestion)
- GET /builds/{build_id} — Poll build status (and optional elements)
- POST /ingest-file — Upload a file (async)
- POST /ingest-url — Ingest a web page (async)
- POST /ingest-github — Ingest a GitHub repo (async)
- POST /ingest-youtube — Ingest a YouTube video (async)
- POST /reprocess — Re-process an existing source (async)
- GET / — List all sources (optional ?file_ids=...)
- GET /get-elements — Get parsed elements of a source (file_id required)
- DELETE /delete — Delete a source (JSON body: file_id required)
Chat & Extraction
- POST /ask-sources — Ask questions about documents (optional file_ids / file_names)
- POST /run-extraction — Extract structured data (file_ids, user_instruction, output_schema)
- POST /prebuilt-rag — Retrieve chunks (RAG) without generating an answer
Response Formats
All API responses follow consistent JSON structures with appropriate HTTP status codes:
Success Response Pattern
Many endpoints return resource-specific JSON. Async ingestion endpoints (ingest-file, ingest-url, ingest-github, ingest-youtube, reprocess) return immediately with a build_id; poll GET /builds/{build_id} until the job completes. Other endpoints return their own shapes (e.g. paginated items plus a total, or an answer plus a conversation_id).
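As an illustration, an async ingest response might look like the dict below. Only build_id is documented above; the status field and its value are assumptions:

```python
# Illustrative shape only: build_id is documented, the other field
# names and status values shown here are assumptions.
ingest_response = {
    "build_id": "bld_123",
    "status": "processing",
}

def poll_path(response: dict) -> str:
    # Every async ingestion response carries a build_id; derive the
    # GET /builds/{build_id} path to poll from it.
    return f"/builds/{response['build_id']}"
```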
Error Response Pattern
Common Status Codes
| Code | Meaning | Usage |
|---|---|---|
| 200 | OK | Successful GET, POST, PATCH operations |
| 400 | Bad Request | Invalid parameters or malformed requests |
| 401 | Unauthorized | Invalid or missing API token |
| 404 | Not Found | Resource doesn’t exist |
| 413 | Payload Too Large | File size exceeds limits |
| 500 | Internal Server Error | Server-side processing errors |
Complete Workflow Example
Here’s the full “happy path”: ingest (async) → poll build status → list / get elements → chat / extract. Use the file_id returned once the build completes for all subsequent calls.
1. Ingest a file (async)
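A sketch of the upload, assuming POST /ingest-file accepts a multipart form with a file field (the field name and multipart encoding are assumptions):

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://sources.graphorlm.com"

def multipart_body(field: str, filename: str, payload: bytes, boundary: str) -> bytes:
    # Encode a single file as a multipart/form-data body.
    return (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + payload + f"\r\n--{boundary}--\r\n".encode()

def ingest_file(path: str, token: str) -> dict:
    # POST /ingest-file returns immediately; poll the build_id it hands back.
    boundary = uuid.uuid4().hex
    with open(path, "rb") as f:
        body = multipart_body("file", os.path.basename(path), f.read(), boundary)
    req = urllib.request.Request(
        BASE_URL + "/ingest-file",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```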
2. Poll build status until complete
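A polling loop for GET /builds/{build_id}. The terminal status values "completed" and "failed" are assumptions; adjust them to the actual API values:

```python
import json
import time
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def is_terminal(status: dict) -> bool:
    # Terminal status values are assumptions.
    return status.get("status") in ("completed", "failed")

def wait_for_build(build_id: str, token: str, interval: float = 2.0,
                   timeout: float = 300.0) -> dict:
    # Poll GET /builds/{build_id} until the job reaches a terminal state.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            f"{BASE_URL}/builds/{build_id}",
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(req) as resp:
            status = json.load(resp)
        if is_terminal(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"build {build_id} did not finish within {timeout}s")
```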
3. List sources (optional)
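Listing sources with the optional file_ids filter; the comma-separated encoding of multiple ids in one query parameter is an assumption:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def sources_query(file_ids=None) -> str:
    # Build the optional ?file_ids=... filter (comma-joining is an assumption).
    if not file_ids:
        return ""
    return "?" + urllib.parse.urlencode({"file_ids": ",".join(file_ids)})

def list_sources(token: str, file_ids=None) -> dict:
    # GET / lists all sources with status and metadata.
    req = urllib.request.Request(
        BASE_URL + "/" + sources_query(file_ids),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```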
4. Get parsed elements
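Fetching parsed elements by file_id, passed as a query parameter per the endpoint description above:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def elements_path(file_id: str) -> str:
    # file_id goes in the query string, per "GET with query params".
    return "/get-elements?" + urllib.parse.urlencode({"file_id": file_id})

def get_elements(file_id: str, token: str) -> dict:
    # Elements come back in the same format as build-status elements.
    req = urllib.request.Request(
        BASE_URL + elements_path(file_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```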
5. Ask questions (Chat)
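Asking a question via POST /ask-sources. The question field name is an assumption; file_ids is the documented optional filter:

```python
import json
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def ask_payload(question: str, file_ids=None) -> dict:
    # 'question' is an assumed field name; file_ids is documented as optional.
    payload = {"question": question}
    if file_ids:
        payload["file_ids"] = file_ids
    return payload

def ask_sources(question: str, token: str, file_ids=None) -> dict:
    req = urllib.request.Request(
        BASE_URL + "/ask-sources",
        data=json.dumps(ask_payload(question, file_ids)).encode(),
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```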
6. Extract structured data
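Calling POST /run-extraction with the three request fields documented above (file_ids, user_instruction, output_schema); the schema format itself is not specified here, so the example schema is illustrative:

```python
import json
import urllib.request

BASE_URL = "https://sources.graphorlm.com"

def extraction_payload(file_ids, user_instruction, output_schema) -> dict:
    # These three field names are documented for POST /run-extraction.
    return {
        "file_ids": file_ids,
        "user_instruction": user_instruction,
        "output_schema": output_schema,
    }

def run_extraction(file_ids, user_instruction, output_schema, token) -> dict:
    req = urllib.request.Request(
        BASE_URL + "/run-extraction",
        data=json.dumps(extraction_payload(file_ids, user_instruction, output_schema)).encode(),
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```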
Integration Patterns
Minimal Sources (Ingestion) Client
Python Integration
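A minimal client, sketched with the standard library only. Request body field names not documented above (e.g. the url field for ingest-url) are assumptions:

```python
import json
import urllib.request

class SourcesClient:
    # A sketch, not a definitive implementation: JSON request bodies
    # and the ingest-url 'url' field name are assumptions.
    def __init__(self, token: str, base_url: str = "https://sources.graphorlm.com"):
        self.token = token
        self.base_url = base_url.rstrip("/")

    def _request(self, method: str, path: str, payload=None) -> dict:
        data = json.dumps(payload).encode() if payload is not None else None
        req = urllib.request.Request(
            self.base_url + path, data=data, method=method,
            headers={"Authorization": f"Bearer {self.token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def ingest_url(self, url: str) -> dict:
        # Async: returns a build_id to poll.
        return self._request("POST", "/ingest-url", {"url": url})

    def build_status(self, build_id: str) -> dict:
        return self._request("GET", f"/builds/{build_id}")

    def list_sources(self) -> dict:
        return self._request("GET", "/")

    def delete(self, file_id: str) -> dict:
        # DELETE /delete takes file_id in the JSON body, per the endpoint list.
        return self._request("DELETE", "/delete", {"file_id": file_id})
```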
Rate Limits and Best Practices
Performance Guidelines
- Batch Operations: Group multiple related requests when possible
- Asynchronous Processing: Use async/await for multiple concurrent requests
- Retry Logic: Implement exponential backoff for transient failures
- Caching: Cache frequently accessed data like flow configurations
Error Handling Best Practices
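One way to follow the retry guidance above: exponential backoff for transient failures (5xx and network errors), immediate failure for 4xx, since retrying cannot fix a bad request or an invalid token:

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, base_delay: float = 1.0) -> float:
    # Exponential backoff: 1s, 2s, 4s, 8s, ...
    return base_delay * (2 ** attempt)

def request_with_retry(req, max_attempts: int = 4, base_delay: float = 1.0):
    # Retry 5xx and network errors; raise 4xx immediately.
    for attempt in range(max_attempts):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as exc:
            if exc.code < 500 or attempt == max_attempts - 1:
                raise
        except urllib.error.URLError:
            if attempt == max_attempts - 1:
                raise
        time.sleep(backoff_delay(attempt, base_delay))
```

Adding random jitter to the delay is a common refinement when many clients retry in parallel.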
Testing and Development
API Testing Tools
You can test Graphor API endpoints using:
- cURL: Command-line testing and scripting
- Postman: Interactive API testing and documentation
- Bruno/Insomnia: Alternative API clients
- Custom Scripts: Automated testing suites
Example cURL Commands
Common Use Cases
Content Management System Integration
Ingest documents as they’re created/updated in your CMS:
Automated Ingestion Pipeline
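A batch pipeline can fan out async ingest calls and collect the build_ids to poll later. This sketch assumes a callable that POSTs one URL to /ingest-url and returns the parsed JSON response (shape beyond build_id is an assumption):

```python
def batch_ingest(ingest_url, urls):
    # Kick off async ingestion for every document and gather the
    # build_ids; poll GET /builds/{build_id} for each afterwards.
    return [ingest_url(url)["build_id"] for url in urls]
```

Because ingestion is asynchronous, the loop returns immediately; completion is tracked by polling each build_id.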
Batch ingest research documents (async; each returns a build_id to poll).
Migration and Versioning
API Versioning
The Graphor API follows semantic versioning principles:
- Current Version: v1 (stable)
- Endpoint Paths: Include version in URL structure where applicable
- Backward Compatibility: Breaking changes will increment major version
Migration Best Practices
- Monitor API Updates: Subscribe to API changelog notifications
- Version Pinning: Specify API versions in your integrations
- Gradual Migration: Test new versions in staging before production deployment
- Fallback Strategies: Implement graceful degradation for API changes
Support and Resources
Getting Help
Contact Support
Direct support for technical questions and issues
API Tokens Guide
Learn how to generate and manage authentication tokens
Data Ingestion
Best practices for document upload and processing
Flows Overview
Master comprehensive RAG pipeline and node management
Community and Updates
- Documentation Updates: This documentation is continuously updated with new features
- API Changelog: Monitor changes and new endpoint releases
- Best Practices: Learn from community implementations and use cases
Next Steps
Ready to start building with Graphor APIs? Choose your path:
For Beginners
Upload & ingest
Ingest documents from files, URLs, GitHub, or YouTube (async); poll build status for completion
Chat with Documents
Ask natural language questions about your documents with conversational memory and structured outputs
API Tokens
Set up authentication for API access
For Advanced Users
Flows Overview
Master comprehensive RAG pipeline and node management
LLM Integration
Advanced language model configuration and optimization
Advanced RAG
Explore Smart RAG, Graph RAG, and RAPTOR capabilities
Integration Patterns
Build production-ready integrations

