Endpoints
Upload a File
POST https://sources.graphorlm.com/upload
Upload local files (PDF, DOCX, images, audio, video, etc.)
Upload from URL
POST https://sources.graphorlm.com/upload-url-source
Scrape and ingest a public web page by URL
Upload from GitHub
POST https://sources.graphorlm.com/upload-github-source
Ingest content from a public GitHub repository
Upload from YouTube
POST https://sources.graphorlm.com/upload-youtube-source
Ingest content from a public YouTube video URL
Authentication
All endpoints on this page require authentication using an API token. You must include your API token as a Bearer token in the Authorization header.
Learn how to create and manage API tokens in the API Tokens guide.
Upload a File
Endpoint Overview
HTTP Method
POST
Endpoint URL
https://sources.graphorlm.com/upload
Authentication
Uses the same API token authentication described in Authentication.
Request Format
Headers
| Header | Value | Required |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_TOKEN` | ✅ Yes |
| `Content-Type` | `multipart/form-data` | ✅ Yes |
Request Body
The request must be sent as `multipart/form-data` with the following fields:
| Field | Type | Description | Required |
|---|---|---|---|
| `file` | File | The document file to upload | ✅ Yes |
| `partition_method` | string | Processing method to use (see Partition Methods below) | No |
Partition Methods
When provided, the `partition_method` parameter lets you parse the document immediately during upload. If it is not provided, the system uses the default method.
| Value | Name | Description |
|---|---|---|
| `basic` | Fast | Fast processing with heuristic classification. No OCR. |
| `hi_res` | Balanced | OCR-based extraction with AI-powered structure classification. |
| `hi_res_ft` | Accurate | Fine-tuned AI model for highest accuracy (Premium). |
| `mai` | VLM | Best text-first parsing for manuscripts and handwritten documents. |
| `graphorlm` | Agentic | Highest parsing setting for complex layouts, multi-page tables, and diagrams. |
For more details about processing methods, see the Process Source documentation.
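The accepted values can be checked client-side before a request is built. The following is a minimal sketch, assuming the five values in the table above are the complete set; the `PartitionMethod` enum and the `partition_field` helper are hypothetical names, not part of any Graphor SDK.

```python
from enum import Enum
from typing import Optional


class PartitionMethod(str, Enum):
    """Values copied from the Partition Methods table; the enum is illustrative."""
    FAST = "basic"
    BALANCED = "hi_res"
    ACCURATE = "hi_res_ft"
    VLM = "mai"
    AGENTIC = "graphorlm"


def partition_field(method: Optional[str]) -> dict:
    """Return the form field to send, or an empty dict to use the default method."""
    if method is None:
        return {}
    # Enum lookup raises ValueError for any value not listed in the table.
    return {"partition_method": PartitionMethod(method).value}
```

Omitting the field entirely, rather than sending an empty string, keeps the request aligned with the "not provided" behaviour described above.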
File Requirements
Supported File Types
Graphor supports a wide range of document formats:
- Documents: PDF, DOC, DOCX, TXT, TEXT, MD, HTML, HTM
- Presentations: PPT, PPTX
- Spreadsheets: CSV, TSV, XLS, XLSX
- Images: PNG, JPG, JPEG, TIFF, BMP, HEIC
- Audio: MP3, WAV, M4A, OGG, FLAC
- Video: MP4, MOV, AVI, MKV, WEBM
File Size Limits
Maximum file size: 100MB per file
For larger files, consider:
- Compressing the file if possible
- Splitting large documents into smaller sections
- Using file optimization tools before upload
File Name Requirements
- File must have a valid filename with extension
- Extension determines the processing method
- File names should be descriptive for easy identification
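The file requirements above can be checked before any bytes leave the machine. This is a sketch only: `check_upload` is a hypothetical helper, the extension list is copied from the supported file types, and reading "100MB" as 100 × 1024 × 1024 bytes is an assumption, since the page does not say whether the limit is decimal or binary.

```python
from pathlib import Path

# Assumption: "100MB" is interpreted as 100 MiB.
MAX_BYTES = 100 * 1024 * 1024

# Extensions copied from the supported file types list.
SUPPORTED_EXTENSIONS = {
    "pdf", "doc", "docx", "txt", "text", "md", "html", "htm",
    "ppt", "pptx",
    "csv", "tsv", "xls", "xlsx",
    "png", "jpg", "jpeg", "tiff", "bmp", "heic",
    "mp3", "wav", "m4a", "ogg", "flac",
    "mp4", "mov", "avi", "mkv", "webm",
}


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lstrip(".").lower()
    if not ext:
        problems.append("missing file extension")
    elif ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: .{ext}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 100MB limit")
    return problems
```

A caller would typically pass `path.name` and `path.stat().st_size` and skip the upload when the returned list is not empty.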
Response Format
Success Response (200 OK)
Response Fields
| Field | Type | Description |
|---|---|---|
| `status` | string | Processing status (New, Processing, Completed, Failed) |
| `message` | string | Human-readable success message |
| `file_id` | string | Unique identifier for the source (use this for subsequent API calls) |
| `file_name` | string | Name of the uploaded file |
| `file_size` | integer | Size of the file in bytes |
| `file_type` | string | File extension/type |
| `file_source` | string | Source type (always “local file” for uploads) |
| `project_id` | string | UUID of the target project |
| `project_name` | string | Name of the target project |
| `partition_method` | string | Document processing method used |
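The fields above map naturally onto a small typed record. The sketch below assumes the response body is a flat JSON object with exactly these keys; `UploadResult` and `parse_upload_result` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class UploadResult:
    """One attribute per row of the Response Fields table."""
    status: str
    message: str
    file_id: str
    file_name: str
    file_size: int
    file_type: str
    file_source: str
    project_id: str
    project_name: str
    partition_method: str


def parse_upload_result(body: str) -> UploadResult:
    """Parse a JSON response body, ignoring keys that are not in the table."""
    data = json.loads(body)
    known = {f.name for f in fields(UploadResult)}
    # A missing documented key raises TypeError, which surfaces schema drift early.
    return UploadResult(**{k: v for k, v in data.items() if k in known})
```

Keeping `file_id` on a typed object makes it harder to lose, which matters because later API calls are keyed on it.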
Code Examples
Python
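The request described above can be sketched with the Python standard library alone. This is an illustrative example, not an official client: `build_upload_request` is a hypothetical helper, the multipart body is encoded by hand, and the whole file is held in memory, which may not suit files near the size limit.

```python
import mimetypes
import urllib.request
import uuid
from typing import Optional

UPLOAD_URL = "https://sources.graphorlm.com/upload"


def build_upload_request(api_token: str, filename: str, content: bytes,
                         partition_method: Optional[str] = None) -> urllib.request.Request:
    """Encode a multipart/form-data body and wrap it in a POST request."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    safe_name = filename.replace('"', "%22")  # keep the header value well-formed

    parts = []
    if partition_method is not None:
        # Optional text field, sent before the file part.
        parts.append((
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="partition_method"\r\n\r\n'
            f"{partition_method}\r\n"
        ).encode("utf-8"))
    # Required file part: headers, blank line, raw bytes.
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{safe_name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8"))
    parts.append(content)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        UPLOAD_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen(req, timeout=300)` and decode the response body as JSON; the 300-second timeout follows the 5+ minute recommendation for large files in Troubleshooting.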
Error Responses
Common Error Codes
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid file type, missing filename, or malformed request |
| 401 | Unauthorized | Invalid or missing API token |
| 403 | Forbidden | Access denied to the specified project |
| 404 | Not Found | Project not found |
| 413 | Payload Too Large | File exceeds 100MB limit |
| 500 | Internal Server Error | Server-side processing error |
Error Response Format
Error Examples
Unsupported File Type (400)
Invalid API Token (401)
File Too Large (413)
Document Processing
After a successful upload, Graphor automatically processes your document.
Processing Stages
- Upload Complete - File is securely stored in your project
- Text Extraction - Content is extracted using advanced OCR and parsing
- Structure Recognition - Document elements are identified and classified
- Ready for Use - Document is available for chunking and retrieval
Processing Methods
The system automatically selects the optimal processing method based on file type:
| File Type | Default Method | Description |
|---|---|---|
| PDF, Documents | Basic | Fast processing with heuristic classification |
| Images | OCR | Optical character recognition for text extraction |
| Text files | Basic | Direct text processing |
| Spreadsheets | Basic | Table structure preservation |
You can reprocess documents with different methods using the Process Source endpoint after upload.
Best Practices
File Preparation
- Optimize file size: Compress large files when possible while maintaining quality
- Use descriptive names: Include relevant keywords in filenames for easy identification
- Check file integrity: Ensure files are not corrupted before upload
Error Handling
- Implement retry logic: Handle temporary network issues with exponential backoff
- Validate before upload: Check file types and sizes client-side before making requests
- Monitor upload status: Use the response to track processing progress
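The retry advice above can be sketched as a small wrapper. It is an illustration under stated assumptions: `with_backoff` is a hypothetical helper, the set of retryable statuses is a guess (the error table lists only 500 among server errors), and the call being retried is expected to raise `urllib.error` exceptions as `urllib.request.urlopen` does.

```python
import random
import time
import urllib.error
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: server-side errors are transient; 4xx errors indicate a problem
# with the request itself and are not retried.
RETRYABLE_STATUS = {500, 502, 503, 504}


def with_backoff(call: Callable[[], T], attempts: int = 4, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        last_try = attempt == attempts - 1
        try:
            return call()
        except urllib.error.HTTPError as err:
            # HTTPError must be handled before URLError, its parent class.
            if err.code not in RETRYABLE_STATUS or last_try:
                raise
        except urllib.error.URLError:
            if last_try:
                raise
        # Delays grow as 1s, 2s, 4s, ... with up to 10% random jitter.
        delay = base_delay * 2 ** attempt
        sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the wrapper can be exercised in tests without waiting; production callers can leave it at the default.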
Security
- Protect API tokens: Never expose tokens in client-side code or public repositories
- Use HTTPS only: All API requests are automatically secured with TLS encryption
- Rotate tokens regularly: Update API tokens periodically for enhanced security
Integration Examples
Batch Upload Script
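A batch upload can be structured as a directory scan followed by one upload per file. The sketch below is deliberately transport-agnostic: `find_uploadable` and `batch_upload` are hypothetical helpers, and the `upload` argument stands in for whatever function actually performs the POST request.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Set


def find_uploadable(folder: Path, extensions: Set[str]) -> list:
    """Return files directly inside `folder` with an allowed extension, sorted."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lstrip(".").lower() in extensions
    )


def batch_upload(paths: Iterable[Path],
                 upload: Callable[[Path], dict]) -> Dict[str, dict]:
    """Upload each path in turn and collect one result per file name."""
    results: Dict[str, dict] = {}
    for path in paths:
        try:
            results[path.name] = upload(path)
        except Exception as err:
            # Record the failure locally and continue with the remaining files.
            results[path.name] = {"error": str(err)}
    return results
```

Uploading sequentially keeps the sketch simple; whether parallel uploads are appropriate depends on rate limits that this page does not document.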
Upload with Progress Tracking
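One way to track progress is to wrap the request body in a reader that counts bytes as the HTTP client consumes them. This is a sketch with hypothetical names (`ProgressReader`, `wrap_body`, `print_percent`); it measures bytes read from the body, which approximates bytes sent.

```python
import io
from typing import BinaryIO, Callable


class ProgressReader:
    """Wrap a binary stream and report cumulative bytes after each read()."""

    def __init__(self, stream: BinaryIO, total: int,
                 on_progress: Callable[[int, int], None]) -> None:
        self._stream = stream
        self._total = total
        self._on_progress = on_progress
        self._seen = 0

    def read(self, size: int = -1) -> bytes:
        chunk = self._stream.read(size)
        if chunk:
            self._seen += len(chunk)
            self._on_progress(self._seen, self._total)
        return chunk


def wrap_body(body: bytes, on_progress: Callable[[int, int], None]) -> ProgressReader:
    """Turn an already-encoded request body into a progress-reporting reader."""
    return ProgressReader(io.BytesIO(body), len(body), on_progress)


def print_percent(seen: int, total: int) -> None:
    """Example callback: print whole-number percentage of the body consumed."""
    if total:
        print(f"{seen * 100 // total}% uploaded")
```

When a file-like body is passed to `urllib.request.Request`, set the `Content-Length` header explicitly; otherwise the standard library falls back to chunked transfer encoding.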
Troubleshooting
Upload timeouts
Causes: Large files, slow connection, or server load
Solutions:
- Increase request timeout (recommend 5+ minutes for large files)
- Retry failed uploads with exponential backoff
- Consider compressing large files before upload
File processing failures
Causes: Corrupted files, unsupported formats, or complex layouts
Solutions:
- Verify file integrity before upload
- Try converting to a more standard format
- Use the Process Source endpoint with different methods
Authentication issues
Causes: Invalid tokens, expired tokens, or incorrect headers
Solutions:
- Verify token format starts with “grlm_”
- Check token hasn’t been revoked in the dashboard
- Ensure correct Authorization header format
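The first and last checks above can be automated when the header is built. A minimal sketch, assuming the `grlm_` prefix applies to every token; `authorization_header` is a hypothetical helper.

```python
def authorization_header(api_token: str) -> dict:
    """Build the Authorization header, rejecting tokens without the expected prefix."""
    token = api_token.strip()  # stray whitespace or newlines break the header
    if not token.startswith("grlm_"):
        raise ValueError("API token should start with 'grlm_'")
    return {"Authorization": f"Bearer {token}"}
```

A prefix check cannot detect a revoked token; that still has to be confirmed in the dashboard.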
Network connectivity
Causes: DNS issues, firewall restrictions, or network timeouts
Solutions:
- Test connectivity to sources.graphorlm.com
- Check firewall allows outbound HTTPS traffic
- Use appropriate timeout values for your network
Upload from URL
Use this endpoint to ingest content by scraping a public web page. It fetches the page, extracts text, and creates a source in your project for downstream processing.
Endpoint Overview
HTTP Method
POST
Endpoint URL
https://sources.graphorlm.com/upload-url-source
Request Format
Headers
| Header | Value | Required |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_TOKEN` | ✅ Yes |
| `Content-Type` | `application/json` | ✅ Yes |
Request Body
Send a JSON body with the following fields:
| Field | Type | Description | Required |
|---|---|---|---|
| `url` | string | The URL of the web page to scrape | ✅ Yes |
| `crawlUrls` | boolean | Whether to crawl and ingest links from the given URL | No (default: false) |
| `partition_method` | string | Processing method to use (see Partition Methods above) | No |
URL Requirements
Supported URL types
- Public web pages
- Pages that render primary content server-side and are reachable without interaction
Access requirements
- The URL must be publicly reachable over HTTPS
- Authentication-protected pages are not supported by this endpoint
This endpoint scrapes web pages. To ingest files (PDF, DOCX, etc.), use Upload a File.
Response Format
Success Response (200 OK)
Response Fields
| Field | Type | Description |
|---|---|---|
| `status` | string | Processing status (New, Processing, Completed, Failed, etc.) |
| `message` | string | Human-readable status message |
| `file_id` | string | Unique identifier for the source (use this for subsequent API calls) |
| `file_name` | string | Name or URL for the ingested source |
| `file_size` | integer | Size in bytes (0 for URL-based initial record) |
| `file_type` | string | Detected type when applicable |
| `file_source` | string | Source type (`url`) |
| `project_id` | string | UUID of the project |
| `project_name` | string | Name of the project |
| `partition_method` | string | Document processing method used |
Code Examples
Python
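The JSON request described above can be sketched with the standard library. `build_url_source_request` is a hypothetical helper; the HTTPS check mirrors the access requirements on this page, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlsplit

URL_SOURCE_ENDPOINT = "https://sources.graphorlm.com/upload-url-source"


def build_url_source_request(api_token: str, page_url: str,
                             crawl_urls: bool = False,
                             partition_method: Optional[str] = None
                             ) -> urllib.request.Request:
    """Build the POST request that asks Graphor to scrape `page_url`."""
    parts = urlsplit(page_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("url must be an absolute HTTPS URL")

    payload = {"url": page_url, "crawlUrls": crawl_urls}
    if partition_method is not None:
        payload["partition_method"] = partition_method  # optional field

    return urllib.request.Request(
        URL_SOURCE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it; `crawlUrls` is left at `false` by default so that only the given page is ingested.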
Error Responses
Common Error Codes
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid or missing URL, malformed JSON |
| 401 | Unauthorized | Invalid or missing API token |
| 403 | Forbidden | Access denied to the specified project |
| 404 | Not Found | Project or source not found |
| 500 | Internal Server Error | Error during URL processing |
Error Response Format
Error Examples
Invalid URL (400)
Invalid API Token (401)
Document Processing
After a successful request, Graphor begins fetching and scraping the web page in the background.
Processing Stages
- URL Accepted - The request is validated and scheduled
- Content Retrieval - The page is fetched over HTTPS
- Text Extraction - Visible text is extracted and normalized
- Structure Recognition - Document elements are identified and classified
- Ready for Use - Document is available for chunking and retrieval
Processing Methods
The system selects the optimal processing method based on the detected content. You can reprocess sources with a different method using the Process Source endpoint after ingestion.
Best Practices
- Provide reachable URLs: Ensure the page is publicly accessible over HTTPS
- Disable crawling when unneeded: Set `crawlUrls` to `false` to ingest only the provided URL
- Respect site policies: Only scrape pages you are permitted to scrape, and consider website rate limits
- Retry logic: Implement retries for transient network issues
Upload from GitHub
Use this endpoint to ingest content directly from a public GitHub repository into your Graphor project.
Endpoint Overview
HTTP Method
POST
Endpoint URL
https://sources.graphorlm.com/upload-github-source
Request Format
Headers
| Header | Value | Required |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_TOKEN` | ✅ Yes |
| `Content-Type` | `application/json` | ✅ Yes |
Request Body
Send a JSON body with the following field:
| Field | Type | Description | Required |
|---|---|---|---|
| `url` | string | The GitHub repository URL to ingest (e.g., https://github.com/org/repo) | ✅ Yes |
Repository Requirements
Supported URLs
- Public GitHub repositories
- HTTPS URLs (https://github.com/...)
Access requirements
- Only public repositories are supported via this endpoint
- Private repository ingestion is not supported
Response Format
Success Response (200 OK)
Response Fields
| Field | Type | Description |
|---|---|---|
| `status` | string | Processing status (New, Processing, Completed, Failed, etc.) |
| `message` | string | Human-readable status message |
| `file_id` | string | Unique identifier for the source (use this for subsequent API calls) |
| `file_name` | string | The repository URL |
| `file_size` | integer | Size in bytes (0 for initial GitHub record) |
| `file_type` | string | Detected file type (when applicable) |
| `file_source` | string | Source type (`github`) |
| `project_id` | string | UUID of the project |
| `project_name` | string | Name of the project |
| `partition_method` | string | Document processing method used |
Code Examples
Python
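A request for this endpoint can be sketched as follows. `build_github_request` is a hypothetical helper, and the URL shape it enforces (`https://github.com/<owner>/<repo>`) is an assumption based on the example in the request body table; the request is constructed but not sent.

```python
import json
import urllib.request
from urllib.parse import urlsplit

GITHUB_SOURCE_ENDPOINT = "https://sources.graphorlm.com/upload-github-source"


def build_github_request(api_token: str, repo_url: str) -> urllib.request.Request:
    """Build the POST request that asks Graphor to ingest a public repository."""
    parts = urlsplit(repo_url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: a repository URL has at least an owner and a repo segment.
    if (parts.scheme != "https"
            or parts.netloc.lower() != "github.com"
            or len(segments) < 2):
        raise ValueError("expected a URL like https://github.com/org/repo")

    return urllib.request.Request(
        GITHUB_SOURCE_ENDPOINT,
        data=json.dumps({"url": repo_url}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

The local check only catches malformed URLs; whether a repository is actually public is decided by the server.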
Error Responses
Common Error Codes
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid or missing URL, malformed JSON |
| 401 | Unauthorized | Invalid or missing API token |
| 403 | Forbidden | Access denied to the specified project |
| 404 | Not Found | Project or source not found |
| 500 | Internal Server Error | Error during repository processing |
Error Response Format
Error Examples
Invalid URL (400)
Invalid API Token (401)
Document Processing
After a successful request, Graphor begins processing the GitHub source in the background.
Processing Stages
- Request Accepted - The request is validated and scheduled
- Repository Fetch - Repository content is retrieved
- Text Extraction - Content is extracted and normalized
- Structure Recognition - Document elements are identified and classified
- Ready for Use - Content is available for chunking and retrieval
You can reprocess sources using the Process Source endpoint after ingestion.
Best Practices
- Provide valid repository URLs: Use the canonical HTTPS GitHub URL
- Public repositories only: Private repositories are not supported
- Retry logic: Implement retries for transient network issues
Upload from YouTube
Use this endpoint to ingest content from a public YouTube video URL into your Graphor project.
Endpoint Overview
HTTP Method
POST
Endpoint URL
https://sources.graphorlm.com/upload-youtube-source
Request Format
Headers
| Header | Value | Required |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_TOKEN` | ✅ Yes |
| `Content-Type` | `application/json` | ✅ Yes |
Request Body
Send a JSON body with the following field:
| Field | Type | Description | Required |
|---|---|---|---|
| `url` | string | The YouTube video URL to ingest (e.g., https://www.youtube.com/watch?v=...) | ✅ Yes |
Video Requirements
Supported URLs
- Public YouTube video URLs (HTTPS)
Access requirements
- The video must be publicly accessible
- Private or access-restricted videos are not supported
Response Format
Success Response (200 OK)
Response Fields
| Field | Type | Description |
|---|---|---|
| `status` | string | Processing status (New, Processing, Completed, Failed, etc.) |
| `message` | string | Human-readable status message |
| `file_id` | string | Unique identifier for the source (use this for subsequent API calls) |
| `file_name` | string | The video URL (echoed back) |
| `file_size` | integer | Size in bytes (0 for URL-based initial record) |
| `file_type` | string | Detected type (when applicable) |
| `file_source` | string | Source type (`youtube`) |
| `project_id` | string | UUID of the project |
| `project_name` | string | Name of the project |
| `partition_method` | string | Document processing method used |
Code Examples
Python
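The sketch below sends the request and decodes the reply in one step. `ingest_youtube` is a hypothetical helper; it assumes the response body is a JSON object, and the `opener` parameter is an illustration device that lets the function be exercised without network access.

```python
import json
import urllib.request
from typing import Callable

YOUTUBE_SOURCE_ENDPOINT = "https://sources.graphorlm.com/upload-youtube-source"


def ingest_youtube(api_token: str, video_url: str,
                   opener: Callable = urllib.request.urlopen,
                   timeout: float = 60.0) -> dict:
    """POST the video URL and return the decoded JSON response."""
    req = urllib.request.Request(
        YOUTUBE_SOURCE_ENDPOINT,
        data=json.dumps({"url": video_url}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
    # With the default opener, HTTP error statuses raise urllib.error.HTTPError.
    with opener(req, timeout=timeout) as resp:
        return json.load(resp)
```

The returned dictionary should contain the `file_id` needed for later calls, provided the response follows the fields table above.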
Error Responses
Common Error Codes
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid or missing URL, malformed JSON |
| 401 | Unauthorized | Invalid or missing API token |
| 403 | Forbidden | Access denied to the specified project |
| 404 | Not Found | Project or source not found |
| 500 | Internal Server Error | Error during video processing |
Error Response Format
Document Processing
After a successful request, Graphor begins processing the YouTube source.
Processing Stages
- Request Accepted - The request is validated and scheduled
- Content Retrieval - The video is fetched
- Transcription / Text Extraction - Audio is transcribed and normalized
- Structure Recognition - Content is segmented and classified
- Ready for Use - Content is available for chunking and retrieval
You can reprocess sources using the Process Source endpoint after ingestion.
Best Practices
- Prefer clear audio: Better audio quality improves transcription accuracy
- Keep URLs stable: Use the canonical YouTube URL format when possible
- Retry logic: Implement retries for transient network issues
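The "keep URLs stable" advice can be applied by normalising links before submitting them. This sketch treats `https://www.youtube.com/watch?v=<id>` as the canonical form, which is an assumption taken from the example in the request body table; `canonical_youtube_url` is a hypothetical helper and handles only `watch` links and `youtu.be` short links.

```python
from urllib.parse import parse_qs, urlsplit

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def canonical_youtube_url(url: str) -> str:
    """Reduce a watch link or youtu.be short link to a bare watch?v=<id> URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "youtu.be":
        # Short links carry the video id as the path.
        video_id = parts.path.strip("/")
    elif host in YOUTUBE_HOSTS and parts.path == "/watch":
        # Watch links carry the id in the "v" query parameter.
        video_id = parse_qs(parts.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"not a recognised YouTube video URL: {url}")
    return f"https://www.youtube.com/watch?v={video_id}"
```

Dropping extra query parameters such as timestamps or share tokens means the same video is always submitted under the same URL.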
Next Steps
After successfully uploading your documents:
Parse Source
Reprocess documents with different parsing methods for optimal results
List Sources
Retrieve information about all uploaded documents in your project
List Parse Results
Retrieve structured elements and partitions from processed documents
Delete Source
Remove documents that are no longer needed from your project

