RAPTOR RAG nodes are hierarchical RAG components that build multi-level tree structures from documents by clustering related chunks and recursively summarizing each cluster. They implement the RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) methodology.

Benefits of RAPTOR RAG Nodes

Hierarchical Document Abstraction

RAPTOR RAG nodes build multi-level semantic hierarchies from document collections: Gaussian Mixture Model clustering groups semantically similar content, and LLM-powered summarization produces an abstract representation of each group at every tree level.

Advanced Tree Construction

Unlike flat retrieval systems, RAPTOR nodes build recursive tree structures in which each level represents a different granularity of abstraction, so a query can match anything from specific details to high-level concepts.
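The recursive construction described above can be sketched as a loop that clusters the current level and summarizes each cluster to form the next one. This is a minimal illustration, not the node's actual implementation: `buildRaptorTree`, `clusterFn`, and `summarizeFn` are hypothetical names, the clustering and summarization steps are injected as plain synchronous functions (a real LLM call would be asynchronous), and `maxLevel` is assumed to count the leaf level.

```javascript
// Sketch of recursive abstraction under the assumptions stated above.
// clusterFn(nodes)   -> array of node groups (hypothetical clustering step)
// summarizeFn(texts) -> summary string (hypothetical LLM summarization step)
function buildRaptorTree(chunks, maxLevel, clusterFn, summarizeFn) {
  // Level 0 holds the original chunks as leaf nodes.
  const levels = [chunks.map((text) => ({ text, level: 0, children: [] }))];

  for (let level = 1; level < maxLevel; level++) {
    const previous = levels[level - 1];
    // Stop early once a level has collapsed to a single node.
    if (previous.length <= 1) break;

    // Each cluster of lower-level nodes becomes one summary node.
    const summaries = clusterFn(previous).map((members) => ({
      text: summarizeFn(members.map((node) => node.text)),
      level,
      children: members,
    }));
    levels.push(summaries);
  }
  return levels;
}
```

Under these assumptions, the number of levels actually built can be smaller than `maxLevel` when the collection collapses to a single summary early.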

Multi-Level Retrieval Strategy

RAPTOR trees support traversal strategies that retrieve relevant content from multiple abstraction levels at once, returning both detailed passages and the higher-level summaries that give them context.
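One way to picture multi-level retrieval is to pool nodes from every tree level and rank them together by similarity to the query. The sketch below assumes each node carries a precomputed `embedding` array; `cosineSimilarity` and `retrieveAcrossLevels` are hypothetical helpers, and the node's actual traversal strategy may differ.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank nodes from all levels together; topK === null returns every node.
function retrieveAcrossLevels(nodes, queryEmbedding, topK) {
  const ranked = nodes
    .map((node) => ({
      ...node,
      score: cosineSimilarity(node.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score);
  return topK === null ? ranked : ranked.slice(0, topK);
}
```

Because leaf chunks and summary nodes compete in the same ranking, a single result set can mix detailed passages with higher-level summaries.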

Core Concepts

RAPTOR RAG Node Structure

{
  "id": "raptor-rag-1748287628685",
  "type": "raptor-rag",
  "data": {
    "name": "Hierarchical RAPTOR RAG",
    "config": {
      "topK": 25,
      "max_level": 4
    },
    "result": {
      "updated": true,
      "tree_levels": 4,
      "total_clusters": 65,
      "total_summaries": 45,
      "total_processed": 1850,
      "total_chunks": 520,
      "total_retrieved": 80
    }
  }
}

Configuration Parameters

| Parameter | Type | Range | Description |
| --- | --- | --- | --- |
| topK | integer or null | 1-100 or null | Number of top results to retrieve from hierarchical tree traversal |
| max_level | integer | 2-8 | Maximum depth of tree hierarchy for recursive abstraction |
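The documented ranges can be checked on the client before sending a configuration update. The helper below is a minimal sketch; `validateRaptorConfig` is a hypothetical name and not part of the API.

```javascript
// Returns a list of validation errors; an empty list means the config
// fits the documented ranges (topK: 1-100 or null, max_level: 2-8).
function validateRaptorConfig(config) {
  const errors = [];
  const { topK, max_level: maxLevel } = config;

  if (topK !== null && !(Number.isInteger(topK) && topK >= 1 && topK <= 100)) {
    errors.push("topK must be an integer from 1 to 100, or null");
  }
  if (!(Number.isInteger(maxLevel) && maxLevel >= 2 && maxLevel <= 8)) {
    errors.push("max_level must be an integer from 2 to 8");
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets a caller report every problem in one pass.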

Hierarchical Tree Metrics

| Metric | Description | Optimization Impact |
| --- | --- | --- |
| tree_levels | Actual levels built in the hierarchy | Higher levels = richer abstractions |
| total_clusters | Clusters created across all tree levels | More clusters = finer granularity |
| total_summaries | Summary nodes generated through abstraction | More summaries = better hierarchy quality |
| clustering_ratio | clusters/chunks ratio | Optimal range: 0.5-0.8 for balanced structure |
| summarization_ratio | summaries/clusters ratio | Higher ratios indicate effective abstraction |
| tree_density | summaries per level | Balanced density ensures traversal efficiency |
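The three derived metrics follow directly from the counts in a node's `result` object. The sketch below computes them from the definitions in the table; `computeTreeMetrics` is a hypothetical helper, and it assumes the `result` fields shown in the node structure above.

```javascript
// Derive ratio metrics from a RAPTOR node's result counts.
// A zero or missing denominator yields null instead of Infinity or NaN.
function computeTreeMetrics(result) {
  const ratio = (numerator, denominator) =>
    denominator > 0 ? numerator / denominator : null;

  return {
    clustering_ratio: ratio(result.total_clusters, result.total_chunks),
    summarization_ratio: ratio(result.total_summaries, result.total_clusters),
    tree_density: ratio(result.total_summaries, result.tree_levels),
  };
}
```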

RAPTOR Tree Strategies

1. Precision-Focused Strategy

Optimal for: High-accuracy applications requiring focused hierarchical retrieval
{
  "config": {
    "topK": 10,
    "max_level": 3
  }
}
Characteristics:
  • Tree Depth: Standard 3-level hierarchy
  • Retrieval Scope: Highly selective with 10 top results
  • Processing Speed: Fast tree construction and traversal
  • Memory Usage: Low (~240MB estimated)
Best Use Cases:
  • Legal document analysis with precise precedent matching
  • Medical research requiring accurate diagnostic information
  • Technical specification lookup with exact parameter matching

2. Balanced Hierarchy Strategy

Optimal for: General-purpose applications requiring comprehensive coverage
{
  "config": {
    "topK": 25,
    "max_level": 4
  }
}
Characteristics:
  • Tree Depth: Extended 4-level hierarchy for richer abstractions
  • Retrieval Scope: Balanced coverage with 25 results
  • Processing Speed: Moderate construction time with good traversal efficiency
  • Memory Usage: Medium (~385MB estimated)
Best Use Cases:
  • Knowledge management systems with diverse content types
  • Research paper analysis across multiple domains
  • Documentation systems requiring hierarchical navigation

3. Comprehensive Coverage Strategy

Optimal for: Exploratory analysis requiring extensive hierarchical insights
{
  "config": {
    "topK": 50,
    "max_level": 5
  }
}
Characteristics:
  • Tree Depth: Deep 5-level hierarchy with additional abstraction layers
  • Retrieval Scope: Extensive coverage with 50 results
  • Processing Speed: Longer construction time with comprehensive traversal
  • Memory Usage: High (~620MB estimated)
Best Use Cases:
  • Literature review systems requiring exhaustive topic coverage
  • Discovery research with broad conceptual exploration
  • Comprehensive content analysis across large corpora

4. Unlimited Exploration Strategy

Optimal for: Research applications requiring complete hierarchical coverage
{
  "config": {
    "topK": null,
    "max_level": 6
  }
}
Characteristics:
  • Tree Depth: 6-level hierarchy, the deepest of the four strategies
  • Retrieval Scope: Unlimited results from complete tree traversal
  • Processing Speed: Resource-intensive with comprehensive coverage
  • Memory Usage: Very High (~1000MB+ estimated)
Best Use Cases:
  • Academic research requiring exhaustive literature analysis
  • Comprehensive surveys across multiple research domains
  • Advanced knowledge exploration systems

Strategy Selection Matrix

| Use Case Type | Document Count | Complexity | Recommended Strategy | Top K | Max Level |
| --- | --- | --- | --- | --- | --- |
| Legal Analysis | 100-500 | High | Precision-Focused | 10 | 3 |
| Medical Research | 200-800 | High | Precision-Focused | 15 | 3 |
| Knowledge Base | 500-2000 | Medium | Balanced Hierarchy | 25 | 4 |
| Research Papers | 800-3000 | Medium | Balanced Hierarchy | 30 | 4 |
| Literature Review | 1000-5000 | High | Comprehensive Coverage | 50 | 5 |
| Discovery Research | 2000+ | Very High | Comprehensive Coverage | 60 | 5 |
| Academic Survey | 3000+ | Very High | Unlimited Exploration | null | 6 |
| Multi-Domain Analysis | 5000+ | Very High | Unlimited Exploration | null | 6 |
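The matrix can be encoded as a lookup table so that application code picks a starting configuration by use case. This is only a sketch: `STRATEGY_MATRIX` and `recommendConfig` are hypothetical names, and the values are copied from the rows above.

```javascript
// Use case -> recommended strategy and config, taken from the selection matrix.
const STRATEGY_MATRIX = new Map([
  ["Legal Analysis", { strategy: "Precision-Focused", topK: 10, max_level: 3 }],
  ["Medical Research", { strategy: "Precision-Focused", topK: 15, max_level: 3 }],
  ["Knowledge Base", { strategy: "Balanced Hierarchy", topK: 25, max_level: 4 }],
  ["Research Papers", { strategy: "Balanced Hierarchy", topK: 30, max_level: 4 }],
  ["Literature Review", { strategy: "Comprehensive Coverage", topK: 50, max_level: 5 }],
  ["Discovery Research", { strategy: "Comprehensive Coverage", topK: 60, max_level: 5 }],
  ["Academic Survey", { strategy: "Unlimited Exploration", topK: null, max_level: 6 }],
  ["Multi-Domain Analysis", { strategy: "Unlimited Exploration", topK: null, max_level: 6 }],
]);

// Returns a config object in the shape used by the node's "config" field.
function recommendConfig(useCase) {
  const entry = STRATEGY_MATRIX.get(useCase);
  if (!entry) {
    throw new Error(`No recommendation for use case: ${useCase}`);
  }
  return { topK: entry.topK, max_level: entry.max_level };
}
```

The returned object has the same shape as the `config` blocks shown for each strategy, so it can be passed to a configuration update as a starting point and tuned from there.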

Basic Integration Example

JavaScript RAPTOR RAG Configuration

class RaptorRagManager {
  constructor(flowName, apiToken) {
    this.flowName = flowName;
    this.apiToken = apiToken;
    this.baseUrl = `https://${flowName}.flows.graphorlm.com`;
  }

  async updateRaptorConfiguration(nodeId, config) {
    const response = await fetch(`${this.baseUrl}/raptor-rag/${nodeId}`, {
      method: "PATCH",
      headers: {
        Authorization: `Bearer ${this.apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ config }),
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }

    return await response.json();
  }

  async listRaptorNodes() {
    const response = await fetch(`${this.baseUrl}/raptor-rag`, {
      headers: { Authorization: `Bearer ${this.apiToken}` },
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }

    return await response.json();
  }
}

// Usage Example
const raptorManager = new RaptorRagManager("my-rag-pipeline", "YOUR_API_TOKEN");

// Configure RAPTOR RAG node
raptorManager
  .updateRaptorConfiguration("raptor-rag-1748287628685", {
    topK: 25,
    max_level: 4,
  })
  .then((result) => {
    console.log("✅ RAPTOR configuration updated successfully");
    console.log(`Node ID: ${result.node_id}`);
  })
  .catch((error) => console.error("❌ Configuration update failed:", error));

Best Practices

Hierarchical Tree Design

  • Document Collection Analysis: Always analyze document characteristics before selecting RAPTOR configuration
  • Strategy Selection: Choose strategies based on use case requirements, not arbitrary preferences
  • Tree Depth Optimization: Balance abstraction richness with processing performance for optimal results
  • Clustering Quality: Monitor clustering ratios to ensure effective semantic grouping across tree levels

Performance Optimization

  • Memory Management: Plan memory allocation for deep hierarchical trees, especially with large document collections
  • Processing Efficiency: Use document-aware configuration to optimize tree construction time
  • Retrieval Strategy: Balance Top K values with traversal efficiency for optimal query performance
  • Resource Monitoring: Continuously monitor tree construction and retrieval performance metrics

Configuration Management

  • Dynamic Optimization: Adjust configurations based on actual performance metrics and user feedback
  • Strategy Evolution: Evolve from precision-focused to comprehensive strategies as document collections grow
  • Quality Assessment: Regularly evaluate clustering and summarization quality across tree levels
  • Performance Tracking: Maintain historical performance data to identify optimization trends

Next Steps

Explore advanced RAPTOR RAG capabilities and integration patterns: