Endpoint Overview
HTTP Method
POST
Endpoint URL
Authentication
This endpoint requires authentication using an API token. You must include your API token as a Bearer token in the Authorization header.
Learn how to create and manage API tokens in the API Tokens guide.
Request Format
Headers
| Header | Value | Required |
|---|---|---|
| Authorization | Bearer YOUR_API_TOKEN | ✅ Yes |
| Content-Type | application/json | ✅ Yes |
Request Body
The request must be sent as JSON with the following fields:
| Field | Type | Description | Required |
|---|---|---|---|
| file_name | string | Name of the previously uploaded file to reprocess | ✅ Yes |
| partition_method | string | Processing method to use (see available methods below) | ✅ Yes |
Available Processing Methods
Basic (Fast)
Best for: Simple text documents, quick processing
- Fast processing with heuristic classification
- No OCR processing
- Suitable for plain text files and well-structured documents
- Recommended for testing and development
OCR
Best for: Scanned documents, images with text
- Utilizes OCR for text extraction and parsing
- Heuristic-based document element classification
- Ideal for scanned PDFs and image files
- Balances processing speed and accuracy
Balanced
Best for: Complex documents with varied layouts
- OCR-based text extraction
- AI-powered document structure classification using Hi-Res model
- Better recognition of tables, figures, and document elements
- Enhanced accuracy for complex layouts
Accurate
Best for: Premium accuracy, specialized documents
- OCR-based text extraction
- Fine-tuned AI model for document classification
- Highest accuracy for document structure recognition
- Optimized for specialized and complex document types
- Note: Premium feature
Agentic
Best for: Complex layouts, multi-page tables, diagrams, and images
- Our highest parsing setting for complex layouts
- Rich annotations for images and complex elements
- Uses agentic processing for enhanced understanding
- Advanced document understanding capabilities
VLM
Best for: Text-first parsing, manuscripts, and handwritten documents
- Our best text-first parsing with high-quality output
- Does not output bounding boxes or page layout (no bbox)
- Best for MANUSCRIPT and HANDWRITTEN documents
- Performs page annotation (page-level labels and context)
- Performs document annotation (document-level labels and summaries)
- Performs image annotation when images are present in the document
- Best-in-class text parsing quality; element classification is limited
partition_method values
Use these values for the partition_method field when calling the endpoint:
| Method | partition_method |
|---|---|
| Fast | basic |
| OCR (deprecated) | ocr |
| Balanced | hi_res |
| Accurate | hi_res_ft |
| Agentic | graphorlm |
| VLM | mai |
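The mapping in the table can be kept in code so that callers may pass either the display name or the raw value. This is a minimal sketch: the dictionary is copied from the table, while `PARTITION_METHODS` and `resolve_partition_method` are hypothetical helper names, not part of the API.

```python
# Display names and partition_method values, copied from the table above.
PARTITION_METHODS = {
    "Fast": "basic",
    "OCR": "ocr",  # listed as deprecated in the table
    "Balanced": "hi_res",
    "Accurate": "hi_res_ft",
    "Agentic": "graphorlm",
    "VLM": "mai",
}


def resolve_partition_method(name: str) -> str:
    """Return the partition_method value for a display name or a raw value.

    Hypothetical helper: it validates input on the client side only.
    """
    key = name.strip().lower()
    by_display_name = {k.lower(): v for k, v in PARTITION_METHODS.items()}
    if key in by_display_name:
        return by_display_name[key]
    if key in PARTITION_METHODS.values():
        return key
    raise ValueError(f"unknown processing method: {name!r}")
```

Validating the value before sending the request avoids a round trip that would otherwise end in a 400 response.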
Processing Method Selection Guide
Method Comparison
| Method | Speed | Text Parsing | Element Classification | Bounding Boxes | Best Use Cases | OCR |
|---|---|---|---|---|---|---|
| Fast | ⚡⚡⚡ | ⭐⭐ | ⭐⭐ | ✅ (limited) | Simple text files, testing | ❌ |
| Balanced | ⚡ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ✅ | Complex layouts, mixed content | ✅ |
| Accurate | ⚡ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ✅ | Premium accuracy needed | ✅ |
| VLM | ⚡⚡⚡ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ❌ | Manuscripts, handwritten documents | ✅ |
| Agentic | ⚡ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ✅ | Complex layouts, multi-page tables, diagrams | ✅ |
Request Example
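A request with the headers and body described above could be assembled roughly as follows. This is a sketch using only the Python standard library. The endpoint URL is not given in this section, so `ENDPOINT_URL` is a placeholder that must be replaced, and `build_reprocess_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder only: this section does not state the endpoint URL.
ENDPOINT_URL = "https://api.example.com/reprocess"


def build_reprocess_request(
    api_token: str,
    file_name: str,
    partition_method: str,
    url: str = ENDPOINT_URL,
) -> urllib.request.Request:
    """Build (but do not send) the POST request described in Request Format."""
    body = json.dumps(
        {"file_name": file_name, "partition_method": partition_method}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# Usage (sending is commented out because the URL above is a placeholder):
# req = build_reprocess_request("YOUR_API_TOKEN", "report.pdf", "hi_res")
# with urllib.request.urlopen(req, timeout=300) as resp:
#     print(json.load(resp))
```

The 300-second timeout in the usage comment follows the 5+ minute recommendation in the Troubleshooting section.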
Response Format
Success Response (200 OK)
Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | Processing result (typically “success”) |
| message | string | Human-readable success message |
| file_name | string | Name of the processed file |
| file_size | integer | Size of the file in bytes |
| file_type | string | File extension/type |
| file_source | string | Source type of the original file |
| project_id | string | UUID of the project containing the file |
| project_name | string | Name of the project |
| partition_method | string | Processing method that was applied |
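The response fields can be loaded into a typed structure so that a missing field fails early. The sketch below assumes the success body is a flat JSON object with exactly the fields in the table. `ReprocessResult` and `parse_reprocess_response` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ReprocessResult:
    # Field names and types follow the Response Fields table.
    status: str
    message: str
    file_name: str
    file_size: int
    file_type: str
    file_source: str
    project_id: str
    project_name: str
    partition_method: str


def parse_reprocess_response(raw: str) -> ReprocessResult:
    """Parse a success response body, raising ValueError on missing fields."""
    data = json.loads(raw)
    names = [f.name for f in fields(ReprocessResult)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Extra keys are ignored so that added fields do not break the parser.
    return ReprocessResult(**{n: data[n] for n in names})
```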
Code Examples
JavaScript/Node.js
Python
cURL
PHP
Error Responses
Common Error Codes
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request format or missing required fields |
| 401 | Unauthorized | Invalid or missing API token |
| 403 | Forbidden | Access denied to the specified project |
| 404 | Not Found | File not found in the project |
| 500 | Internal Server Error | Processing failure or server error |
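A client can translate these status codes into an error type and a retry decision. In this sketch the code-to-type mapping comes from the table. Treating only 500 as worth retrying is an assumption, not documented behavior, and `classify_error` is a hypothetical helper name.

```python
from typing import Tuple

# Status codes and error types, copied from the table above.
ERROR_TYPES = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    500: "Internal Server Error",
}

# Assumption: only server-side failures may be transient. The 4xx codes
# describe problems with the request itself, so retrying them unchanged
# is unlikely to help.
RETRYABLE_STATUS_CODES = {500}


def classify_error(status_code: int) -> Tuple[str, bool]:
    """Return (error type, whether a retry might succeed)."""
    error_type = ERROR_TYPES.get(status_code, "Unknown")
    return error_type, status_code in RETRYABLE_STATUS_CODES
```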
Error Response Format
Error Examples
File Not Found (404)
Invalid API Token (401)
Processing Failed (500)
Invalid Method (400)
When to Reprocess
Poor text extraction
Symptoms: Missing text, garbled characters, incomplete content
Recommended methods:
- Balanced or Accurate for complex layouts
- VLM for text-only documents when bounding boxes are not required
Table detection issues
Symptoms: Tables not properly recognized, merged cells, structure lost
Recommended methods:
- Balanced for better table detection
- Accurate for complex table structures
- Agentic for multi-page tables
Image and figure handling
Symptoms: Missing captions, poor figure recognition
Recommended methods:
- Balanced for figure detection
- Accurate for comprehensive image analysis
- Agentic for rich image annotations
Document structure problems
Symptoms: Headers/footers mixed with content, poor section detection
Recommended methods:
- Balanced for structure recognition
- Accurate for complex document hierarchies
- Agentic for enhanced semantic structure and relationships
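The recommendations in this section can be encoded as a lookup from symptom to candidate `partition_method` values. This is an illustrative sketch: the symptom keys and the `candidate_methods` helper are hypothetical, and the method values are taken from the partition_method table.

```python
from typing import Iterable, List

# Candidate methods per symptom, in the order listed in this section.
# The symptom keys are made-up labels for illustration.
RECOMMENDATIONS = {
    "poor_text_extraction": ["hi_res", "hi_res_ft", "mai"],
    "table_detection": ["hi_res", "hi_res_ft", "graphorlm"],
    "image_and_figure_handling": ["hi_res", "hi_res_ft", "graphorlm"],
    "document_structure": ["hi_res", "hi_res_ft", "graphorlm"],
}


def candidate_methods(symptom: str, already_tried: Iterable[str] = ()) -> List[str]:
    """Return recommended methods for a symptom, skipping ones already tried."""
    tried = set(already_tried)
    return [m for m in RECOMMENDATIONS[symptom] if m not in tried]
```

Note that `mai` (VLM) appears only for text extraction problems, because it does not output bounding boxes.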
Best Practices
Processing Strategy
- Start with Fast: For testing and simple documents
- Upgrade gradually: Move to Balanced → Accurate → VLM → Agentic based on needs
- Monitor results: Use document preview to evaluate processing quality
- Consider efficiency vs. quality: Advanced methods take longer but provide better results
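The gradual upgrade strategy can be sketched as a loop that stops at the first method whose result is judged acceptable. Everything here except the method values is an assumption: `reprocess` stands for any function that calls the endpoint, and `is_good_enough` for whatever quality check fits your pipeline.

```python
from typing import Callable, Optional, Tuple

# Upgrade order from the strategy above: Fast, Balanced, Accurate, VLM, Agentic.
UPGRADE_PATH = ["basic", "hi_res", "hi_res_ft", "mai", "graphorlm"]


def escalate(
    file_name: str,
    reprocess: Callable[[str, str], object],
    is_good_enough: Callable[[object], bool],
    start: str = "basic",
) -> Optional[Tuple[str, object]]:
    """Try methods in upgrade order, starting at `start`.

    Returns (method, result) for the first acceptable result, or None if
    every method on the path was tried without success.
    """
    for method in UPGRADE_PATH[UPGRADE_PATH.index(start):]:
        result = reprocess(file_name, method)
        if is_good_enough(result):
            return method, result
    return None
```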
Performance Optimization
- Batch processing: Process multiple files sequentially rather than simultaneously
- Method selection: Choose the appropriate method for your document types
- Timeout handling: Allow sufficient time for complex processing methods
- Error recovery: Implement retry logic for transient failures
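Sequential processing and retry logic can be combined in one small loop. The sketch below is one possible shape, not a prescribed implementation: `send` is any caller-supplied function that reprocesses a single file, `TransientError` is a hypothetical exception that `send` raises for failures worth retrying, and the exponential backoff values are arbitrary defaults.

```python
import time
from typing import Callable, Dict, Iterable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a 500 response)."""


def reprocess_sequentially(
    file_names: Iterable[str],
    send: Callable[[str], object],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, object]:
    """Process files one at a time, retrying transient failures with backoff.

    Each file maps to its result, or to the last TransientError if all
    attempts failed. Other exceptions propagate immediately.
    """
    results: Dict[str, object] = {}
    for name in file_names:
        for attempt in range(1, max_attempts + 1):
            try:
                results[name] = send(name)
                break
            except TransientError as exc:
                if attempt == max_attempts:
                    results[name] = exc
                else:
                    # Wait 1x, 2x, 4x ... base_delay between attempts.
                    sleep(base_delay * 2 ** (attempt - 1))
    return results
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.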
Quality Assessment
After processing, evaluate the results by:
- Checking text extraction completeness
- Verifying table and figure recognition
- Reviewing document structure classification
- Testing retrieval quality in your RAG pipeline
Integration Examples
Automatic Quality Improvement
Batch Reprocessing
Processing with Progress Tracking
Troubleshooting
Processing timeouts
Causes: Large files, complex documents, or heavy server load
Solutions:
- Increase request timeout (5+ minutes recommended)
- Try a simpler processing method first
- Process during off-peak hours
- Contact support for very large documents
File not found errors
Causes: Incorrect file name, file deleted, or wrong project
Solutions:
- Verify exact file name (case-sensitive)
- Use the List Sources endpoint to check available files
- Ensure you’re using the correct API token for the project
Processing failures
Causes: Corrupted files, unsupported content, or method incompatibility
Solutions:
- Try a different processing method
- Check file integrity
- Re-upload the file if necessary
- Contact support for persistent issues
Poor processing quality
Causes: Method not suitable for document type, or complex layout
Solutions:
- Upgrade to Balanced or Accurate method
- Use VLM for manuscripts and handwritten documents
- Use Agentic for complex layouts with tables and diagrams
- Ensure document quality is good
- Review processing results in the dashboard

