Endpoint overview
HTTP Method
POST
Endpoint URL
Authentication
This endpoint requires authentication using an API token. Include your API token as a Bearer token in the `Authorization` header.
Learn how to create and manage API tokens in the API Tokens guide.
Async flow
- POST `/reprocess` with `file_id` and optional `partition_method`. The response returns immediately with a `build_id`.
- Poll Get build status: GET `https://sources.graphorlm.com/builds/{build_id}` until `status` is `Completed` or indicates failure.
- Use the `file_id` from the build status response (unchanged) for subsequent API calls.
Request format
Headers
| Header | Value | Required |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_TOKEN` | Yes |
| `Content-Type` | `application/json` | Yes |
Request body
Send a JSON body with the following fields:
| Field | Type | Description | Required |
|---|---|---|---|
| `file_id` | string | Unique identifier of the source to re-process | Yes |
| `partition_method` | string | Partitioning strategy. One of: `fast`, `balanced`, `accurate`, `vlm`, `agentic`. Default: `fast` | No |
Partition method values (v2)
Use these values for the `partition_method` field:
| Value | Name | Description |
|---|---|---|
| `fast` | Fast | Fast processing with heuristic classification. No OCR. |
| `balanced` | Balanced | OCR-based extraction with structure classification. |
| `accurate` | Accurate | Fine-tuned model for highest accuracy (Premium). |
| `vlm` | VLM | Best for manuscripts and handwritten content. |
| `agentic` | Agentic | Highest accuracy for complex layouts, tables, and diagrams. |
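As a minimal sketch, the values in the table above can be checked on the client before a request is sent. The helper name `resolve_partition_method` is hypothetical, and trimming and lowercasing the input is a convenience assumed here, not documented API behavior.

```python
from typing import Optional

# Values listed in the "Partition method values (v2)" table.
PARTITION_METHODS = ("fast", "balanced", "accurate", "vlm", "agentic")
# Documented default when the field is omitted.
DEFAULT_PARTITION_METHOD = "fast"


def resolve_partition_method(value: Optional[str] = None) -> str:
    """Return a valid partition method, falling back to the documented default."""
    if value is None:
        return DEFAULT_PARTITION_METHOD
    normalized = value.strip().lower()
    if normalized not in PARTITION_METHODS:
        raise ValueError(
            f"unknown partition method {value!r}; "
            f"expected one of: {', '.join(PARTITION_METHODS)}"
        )
    return normalized
```

Failing early with a `ValueError` avoids spending a request on a value the table does not list.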
Available processing methods
Fast
Best for: Simple text documents, quick processing
- Fast processing with heuristic classification
- No OCR processing
- Suitable for plain text files and well-structured documents
- Recommended for testing and development
Balanced
Best for: Complex documents with varied layouts
- OCR-based text extraction
- AI-powered document structure classification
- Better recognition of tables, figures, and document elements
- Enhanced accuracy for complex layouts
Accurate
Best for: Premium accuracy, specialized documents
- OCR-based text extraction
- Fine-tuned AI model for document classification
- Highest accuracy for document structure recognition
- Note: Premium feature
VLM
Best for: Text-first parsing, manuscripts, and handwritten documents
- Best text-first parsing; no bounding boxes or page layout
- Best for manuscript and handwritten documents
- Performs page and document annotation
- Best-in-class text parsing quality
Agentic
Best for: Complex layouts, multi-page tables, diagrams, and images
- Highest parsing setting for complex layouts
- Rich annotations for images and complex elements
- Agentic processing for enhanced understanding
Method comparison
| Method | Speed | Text parsing | Element classification | Bounding boxes | Best use cases | OCR |
|---|---|---|---|---|---|---|
| Fast | High | Good | Good | Yes (limited) | Simple text files, testing | No |
| Balanced | Medium | Very good | Very good | Yes | Complex layouts, mixed content | Yes |
| Accurate | Medium | Excellent | Excellent | Yes | Premium accuracy needed | Yes |
| VLM | High | Excellent | Good | No | Manuscripts, handwritten | Yes |
| Agentic | Medium | Excellent | Excellent | Yes | Complex layouts, multi-page tables, diagrams | Yes |
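One way to read the comparison table is as a small decision rule. The sketch below is an illustrative heuristic only: the function name, its parameters, and the priority order are assumptions drawn from the "Best use cases" and "Bounding boxes" columns, not an official recommendation.

```python
def suggest_partition_method(
    handwritten: bool = False,
    complex_layout: bool = False,
    needs_bounding_boxes: bool = False,
    premium_available: bool = False,
) -> str:
    """Pick a partition method from coarse document traits (heuristic)."""
    # VLM targets manuscripts and handwriting but provides no bounding boxes.
    if handwritten and not needs_bounding_boxes:
        return "vlm"
    # Agentic is listed for complex layouts, multi-page tables, and diagrams.
    if complex_layout:
        return "agentic"
    # Handwritten input that still needs boxes: use an OCR method that has them.
    if handwritten:
        return "accurate" if premium_available else "balanced"
    # Fast skips OCR and is listed for simple text files and testing.
    return "fast"
```

For example, `suggest_partition_method(handwritten=True)` returns `"vlm"`, while adding `needs_bounding_boxes=True` switches the suggestion to an OCR-based method.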
Request example
Response format
Success response (200 OK)
The endpoint returns immediately with a build identifier. It does not wait for processing to finish.
Response fields
| Field | Type | Description |
|---|---|---|
| `build_id` | string | Use this ID to poll Get build status |
| `success` | boolean | Whether the re-process job was successfully scheduled |
| `error` | string or null | Error message if the job was not scheduled successfully |
For full build details (`file_id`, `file_name`, `status`, etc.) and optional parsed elements, call `GET /builds/{build_id}` (see Upload sources – Get build status).
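A minimal sketch of handling the immediate response, using only the three fields listed in the table (`build_id`, `success`, `error`). The function name `extract_build_id` and the exception type are hypothetical.

```python
import json


class ReprocessNotScheduled(RuntimeError):
    """Raised when the response reports that the job was not scheduled."""


def extract_build_id(response_text: str) -> str:
    """Return build_id from a reprocess response, or raise if scheduling failed."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        # Surface the API's own error message when one is present.
        raise ReprocessNotScheduled(payload.get("error") or "job was not scheduled")
    build_id = payload.get("build_id")
    if not build_id:
        raise ReprocessNotScheduled("response did not include a build_id")
    return build_id
```

Checking `success` before reading `build_id` keeps a failed scheduling attempt from being mistaken for a running build.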
Code examples
JavaScript/Node.js
Python
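A minimal sketch using only the Python standard library. The full URL `https://sources.graphorlm.com/reprocess` is an assumption that combines the `/reprocess` path with the host shown for the build status endpoint, so confirm it against the endpoint URL for your project. The function name is hypothetical, and the token and file ID are placeholders.

```python
import json
import urllib.request

# Assumed URL: the /reprocess path on the host used by the build status endpoint.
REPROCESS_URL = "https://sources.graphorlm.com/reprocess"


def make_reprocess_request(
    api_token: str, file_id: str, partition_method: str = "fast"
) -> urllib.request.Request:
    """Build (but do not send) the POST request that re-processes a source."""
    body = json.dumps({"file_id": file_id, "partition_method": partition_method})
    return urllib.request.Request(
        REPROCESS_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send the request and read the JSON response:
#     with urllib.request.urlopen(make_reprocess_request(token, file_id)) as resp:
#         result = json.load(resp)
```

Separating request construction from sending makes it easy to inspect the headers and body before any network call is made.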
cURL
Reprocess and poll until complete
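A sketch of the polling step, with the HTTP call injected as a function so the loop can be exercised without network access. `Completed` is the success status named in this guide. The strings in `FAILURE_STATUSES` are placeholders (assumptions), so replace them with the values your build status responses actually return. Sending the Bearer token on the status request is also assumed, and the 3-second interval and 10-minute timeout are illustrative defaults.

```python
import json
import time
import urllib.request
from typing import Callable

BUILD_STATUS_URL = "https://sources.graphorlm.com/builds/{build_id}"
# Assumed failure values; this guide only names "Completed" explicitly.
FAILURE_STATUSES = {"Failed", "Error"}


def fetch_build_status(api_token: str, build_id: str) -> dict:
    """GET the build status document for build_id."""
    request = urllib.request.Request(
        BUILD_STATUS_URL.format(build_id=build_id),
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_build(
    get_status: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status until the build completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        build = get_status()
        status = build.get("status")
        if status == "Completed":
            return build
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"build ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"build still {status!r} after {timeout} seconds")
        sleep(interval)
```

A typical call would be `build = wait_for_build(lambda: fetch_build_status(token, build_id))`, after which `build["file_id"]` can be used for subsequent API calls.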
Error responses
Common error codes
| Status code | Description |
|---|---|
| 404 | Source not found for the given `file_id` |
| 500 | Processing or unexpected internal error |
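A sketch of mapping the two documented status codes to a suggested next step. It relies on `urllib.error.HTTPError`, which `urllib.request.urlopen` raises for 4xx and 5xx responses; the function name and the message wording are illustrative.

```python
import urllib.error


def explain_reprocess_error(error: urllib.error.HTTPError) -> str:
    """Translate a documented error status into a suggested next step."""
    if error.code == 404:
        return "Source not found: verify the file_id (e.g. via List sources)."
    if error.code == 500:
        return (
            "Processing failed: retry later, try a different partition_method, "
            "or check file integrity."
        )
    # Any other status is outside the documented table; report it as-is.
    return f"Unexpected HTTP {error.code}: {error.reason}"
```

In practice this would be called from an `except urllib.error.HTTPError as exc:` block wrapped around the request.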
Error response format
Error examples
Source not found (404)
The `file_id` does not exist in your project.
Solution: Verify the `file_id` (e.g. from List sources or a previous upload/build status).
Processing failed (500)
Solution: Retry later or try a different `partition_method`; check file integrity.
When to reprocess
Poor text extraction
Symptoms: Missing text, garbled characters, incomplete content
Recommended: `balanced` or `accurate` for complex layouts; `vlm` for text-only when bounding boxes are not needed.
Table detection issues
Symptoms: Tables not recognized, merged cells, structure lost
Recommended: `balanced`, `accurate`, or `agentic` for multi-page tables.
Image and figure handling
Symptoms: Missing captions, poor figure recognition
Recommended: `balanced`, `accurate`, or `agentic` for rich image annotations.
Document structure problems
Symptoms: Headers/footers mixed with content, poor section detection
Recommended: `balanced`, `accurate`, or `agentic` for better structure and semantics.
Best practices
- Use `file_id`: Always use the source’s `file_id` (from list sources or build status); do not rely on file name.
- Poll build status: After calling reprocess, poll Get build status with a reasonable interval (e.g. 2–5 seconds) and timeout.
- Choose method by need: Start with `fast` for testing; use `balanced` or `accurate` for better quality; use `vlm` for manuscripts; use `agentic` for complex layouts and tables.
- Timeout: Allow sufficient time for large documents and heavier methods when polling.
Next steps
After re-processing completes (build status `Completed`):
Get build status
Poll status and optionally retrieve parsed elements for a build
List sources
View all sources and their status in your project
Upload sources
Upload new files, URLs, GitHub repos, or YouTube videos (async)
Delete source
Remove a source from your project

