Overview

The Prebuilt RAG API allows you to retrieve relevant document chunks from your ingested documents using semantic search. It returns the most relevant content for your query, making it ideal for building custom RAG (Retrieval-Augmented Generation) applications.

Endpoint

POST https://sources.graphorlm.com/prebuilt-rag

Authentication

Include your API token in the Authorization header:
Authorization: Bearer YOUR_API_TOKEN

Request

Headers

Header         Value                  Required
Authorization  Bearer YOUR_API_TOKEN  Yes
Content-Type   application/json       Yes

Body Parameters

Parameter   Type      Required  Description
query       string    Yes       The search query to retrieve relevant chunks
file_names  string[]  No        Restrict retrieval to specific documents by file name

Example Request

curl -X POST "https://sources.graphorlm.com/prebuilt-rag" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the payment terms?"
  }'

Example with Specific Documents

curl -X POST "https://sources.graphorlm.com/prebuilt-rag" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the total amount due?",
    "file_names": ["invoice-2024.pdf", "invoice-2023.pdf"]
  }'

Response

Success Response (200 OK)

Field   Type     Description
query   string   The original search query
chunks  array    List of retrieved document chunks
total   integer  Total number of chunks retrieved

Chunk Object

Each chunk in the chunks array contains:
Field        Type     Description
text         string   The text content of the chunk
file_name    string   The source file name
page_number  integer  The page number where the chunk was found
score        float    The relevance score of the chunk (higher is more relevant)
metadata     object   Additional metadata for the chunk

Example Response

{
  "query": "What are the payment terms?",
  "chunks": [
    {
      "text": "Payment Terms: Net 30 days from invoice date. Late payments will incur a 1.5% monthly interest charge. All payments must be made in USD via wire transfer or check.",
      "file_name": "contract-2024.pdf",
      "page_number": 5,
      "score": 0.95,
      "metadata": {
        "file_name": "contract-2024.pdf",
        "page_number": 5
      }
    },
    {
      "text": "The Client agrees to pay all invoices within thirty (30) days of receipt. Failure to pay within the specified period may result in suspension of services.",
      "file_name": "contract-2024.pdf",
      "page_number": 12,
      "score": 0.87,
      "metadata": {
        "file_name": "contract-2024.pdf",
        "page_number": 12
      }
    }
  ],
  "total": 2
}
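
If you prefer typed objects over raw dictionaries, the response maps naturally onto a pair of dataclasses. A minimal Python sketch (the class names are illustrative, not part of the API; the fields follow the tables above):

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    file_name: str
    page_number: int
    score: float
    metadata: dict

@dataclass
class RetrievalResponse:
    query: str
    chunks: list[Chunk]
    total: int

def parse_response(data: dict) -> RetrievalResponse:
    # Convert the raw JSON payload into typed objects
    return RetrievalResponse(
        query=data["query"],
        chunks=[Chunk(**c) for c in data["chunks"]],
        total=data["total"],
    )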

Error Responses

Status Code  Description
400          Bad Request - Invalid parameters
401          Unauthorized - Invalid or missing API token
404          Not Found - Specified file not found
500          Internal Server Error
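
One way to surface these errors in Python, mapping status codes to exceptions (a sketch; the exception types chosen here are illustrative):

import requests

def retrieve_chunks(query: str) -> dict:
    response = requests.post(
        "https://sources.graphorlm.com/prebuilt-rag",
        headers={
            "Authorization": "Bearer YOUR_API_TOKEN",
            "Content-Type": "application/json",
        },
        json={"query": query},
    )
    if response.status_code == 401:
        raise PermissionError("Unauthorized: invalid or missing API token")
    if response.status_code == 404:
        raise FileNotFoundError("Specified file not found")
    response.raise_for_status()  # raises on 400, 500, and other error codes
    return response.json()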

Usage Examples

Python

import requests

url = "https://sources.graphorlm.com/prebuilt-rag"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json"
}

# Basic retrieval
response = requests.post(url, headers=headers, json={
    "query": "What are the key contract terms?"
})
response.raise_for_status()  # surface HTTP errors before parsing

data = response.json()
print(f"Found {data['total']} relevant chunks")

for chunk in data["chunks"]:
    print(f"\n--- {chunk['file_name']} (page {chunk['page_number']}) ---")
    print(chunk["text"])
    print(f"Relevance: {chunk['score']:.2f}")

Python with Custom LLM Integration

import requests
from openai import OpenAI

# Step 1: Retrieve relevant chunks
def retrieve_chunks(query, file_names=None):
    url = "https://sources.graphorlm.com/prebuilt-rag"
    headers = {
        "Authorization": "Bearer YOUR_API_TOKEN",
        "Content-Type": "application/json"
    }
    payload = {"query": query}
    if file_names:
        payload["file_names"] = file_names
    
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()  # surface HTTP errors before parsing
    return response.json()

# Step 2: Build context from chunks
def build_context(chunks):
    context = ""
    for chunk in chunks["chunks"]:
        context += f"\n[Source: {chunk['file_name']}, Page {chunk['page_number']}]\n"
        context += chunk["text"] + "\n"
    return context

# Step 3: Generate answer with your preferred LLM
def generate_answer(question, context):
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer questions based on the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
        ]
    )
    return response.choices[0].message.content

# Usage
question = "What are the payment terms?"
chunks = retrieve_chunks(question)
context = build_context(chunks)
answer = generate_answer(question, context)
print(answer)

JavaScript

const API_URL = "https://sources.graphorlm.com/prebuilt-rag";
const API_TOKEN = "YOUR_API_TOKEN";

async function retrieveChunks(query, fileNames = null) {
  const payload = { query };
  if (fileNames && fileNames.length) {
    payload.file_names = fileNames;
  }

  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify(payload)
  });
  
  return response.json();
}

// Usage
const result = await retrieveChunks("What products are mentioned?");
console.log(`Found ${result.total} relevant chunks`);

result.chunks.forEach(chunk => {
  console.log(`\n--- ${chunk.file_name} (page ${chunk.page_number}) ---`);
  console.log(chunk.text);
  console.log(`Relevance: ${chunk.score}`);
});

JavaScript with Custom LLM Integration

const API_URL = "https://sources.graphorlm.com/prebuilt-rag";
const API_TOKEN = "YOUR_API_TOKEN";
const OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";  // key for your chosen LLM provider

async function retrieveChunks(query, fileNames = null) {
  const payload = { query };
  if (fileNames && fileNames.length) payload.file_names = fileNames;

  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify(payload)
  });
  
  return response.json();
}

function buildContext(chunks) {
  return chunks.chunks.map(chunk => 
    `[Source: ${chunk.file_name}, Page ${chunk.page_number}]\n${chunk.text}`
  ).join("\n\n");
}

async function generateAnswer(question, context) {
  // Use your preferred LLM API (OpenAI, Anthropic, etc.)
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${OPENAI_API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [
        { role: "system", content: "Answer questions based on the provided context." },
        { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` }
      ]
    })
  });
  
  const data = await response.json();
  return data.choices[0].message.content;
}

// Full RAG pipeline
async function askQuestion(question, fileNames = null) {
  const chunks = await retrieveChunks(question, fileNames);
  const context = buildContext(chunks);
  const answer = await generateAnswer(question, context);
  return { answer, sources: chunks.chunks };
}

// Usage
const result = await askQuestion("What are the payment terms?");
console.log(result.answer);
console.log("Sources:", result.sources.map(s => s.file_name));

Use Cases

Custom RAG Applications

Use this API to build custom RAG pipelines with your preferred LLM:
  1. Retrieve relevant chunks using semantic search
  2. Build context from the retrieved chunks
  3. Generate answers using any LLM (OpenAI, Anthropic, Google, etc.)
Document Search

Build document search interfaces that show relevant excerpts (a rendering sketch follows this list):
  1. Query for relevant content
  2. Display chunks with file names and page numbers
  3. Allow users to navigate to source documents
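
A minimal rendering sketch for such an interface (the /documents/... deep-link format is hypothetical; substitute however your application links back to source pages):

def render_search_results(data: dict) -> None:
    # Show each chunk as a search result with its source location and an excerpt
    for i, chunk in enumerate(data["chunks"], start=1):
        # Hypothetical deep link into a document viewer
        link = f"/documents/{chunk['file_name']}#page={chunk['page_number']}"
        print(f"{i}. {chunk['file_name']}, page {chunk['page_number']} ({link})")
        print(f"   {chunk['text'][:200]}")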

Knowledge Base Q&A

Create custom Q&A systems with full control over:
  • Prompt engineering
  • Response formatting
  • Source citations (see the helper sketched below)
  • Multi-step reasoning
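
For example, a small helper that appends a deduplicated source list to a generated answer (a sketch; pair it with the retrieve and generate functions shown earlier):

def answer_with_citations(answer: str, chunks: list[dict]) -> str:
    # Collect unique (file, page) pairs in retrieval order
    sources = []
    for chunk in chunks:
        source = (chunk["file_name"], chunk["page_number"])
        if source not in sources:
            sources.append(source)
    citations = "; ".join(f"{name}, p. {page}" for name, page in sources)
    return f"{answer}\n\nSources: {citations}"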

Best Practices

  1. Be specific in queries — Clear, specific queries return more relevant chunks
  2. Use file_names for focused search — Restrict to specific documents when you know the source
  3. Check relevance scores — Higher scores indicate better matches; consider filtering low-score chunks (see the sketch after this list)
  4. Include source citations — Use file_name and page_number to cite sources in your responses
  5. Combine with LLM — Use retrieved chunks as context for LLM-generated answers
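
A short sketch of the score filtering mentioned in point 3, given a response dictionary data from the retrieval call (the 0.7 threshold is an arbitrary starting point; tune it for your corpus):

def filter_chunks(data: dict, min_score: float = 0.7) -> list[dict]:
    # Keep only chunks whose relevance score clears the threshold
    return [chunk for chunk in data["chunks"] if chunk["score"] >= min_score]

strong_chunks = filter_chunks(data)
print(f"Kept {len(strong_chunks)} of {data['total']} chunks")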

Comparison with Chat API

Feature              Prebuilt RAG API      Chat API
Returns              Raw document chunks   Generated answer
LLM Integration      Bring your own        Built-in
Conversation Memory  No                    Yes
Customization        Full control          Limited
Use Case             Custom RAG pipelines  Quick Q&A