
Overview
The Reranking node enhances your RAG pipeline by:

- Receiving retrieved documents — Takes results from Retrieval or RAG nodes
- Scoring relevance — Uses LLM to evaluate how relevant each document is to the query
- Reordering results — Sorts documents by relevance score (highest first)
- Filtering top results — Returns only the top-K most relevant documents
Reranking is particularly useful when initial retrieval returns many results of varying quality. The LLM-based scoring provides a more nuanced relevance assessment than vector similarity alone.
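Conceptually, the node's behavior can be sketched in a few lines of Python. This is an illustrative sketch only, not the node's implementation; `llm_score` is a placeholder for whatever LLM call assigns a 0.0–1.0 relevance score to a (query, document) pair.

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    documents: List[str],
    llm_score: Callable[[str, str], float],  # placeholder for an LLM relevance call
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score, reorder, and filter retrieved documents (conceptual sketch)."""
    # Score relevance of each document to the query (0.0 to 1.0).
    scored = [(doc, llm_score(query, doc)) for doc in documents]
    # Reorder results: highest score first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Filter to the top-K most relevant documents.
    return scored[:top_k]
```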
When to Use Reranking
| Scenario | Recommendation |
|---|---|
| High Top-K retrieval (10+) | ✅ Recommended — Filter down to most relevant |
| Complex queries | ✅ Recommended — LLM better understands nuanced relevance |
| Domain-specific content | ✅ Recommended — LLM can assess semantic relevance |
| Simple keyword queries | ⚠️ Optional — Vector similarity may be sufficient |
| Low latency requirements | ⚠️ Consider tradeoffs — Adds LLM calls |
Using the Reranking Node
Adding the Reranking Node
- Open your flow in the Flow Builder
- Drag the Reranking node from the sidebar onto the canvas
- Connect a retrieval node to the Reranking input
- Double-click the Reranking node to configure
Input Connections
The Reranking node accepts input from:

| Source Node | Use Case |
|---|---|
| Retrieval | Rerank results from standard vector retrieval |
| Smart RAG | Rerank results from Smart RAG |
| Agentic RAG | Rerank results from Agentic RAG |
| Graph RAG | Rerank results from Graph RAG |
| Raptor RAG | Rerank results from Raptor RAG |
Output Connections
The Reranking node can connect to:

| Target Node | Use Case |
|---|---|
| LLM | Generate responses using reranked context |
| Analysis | Evaluate pipeline performance |
| Response | Output reranked results directly |
Configuring the Reranking Node
Double-click the Reranking node to open the configuration panel:
Top K
The number of documents to return after reranking:

| Value | Use Case |
|---|---|
| 1-3 | When you need only the most relevant result |
| 4-6 | Balanced approach for most Q&A applications |
| 7-10 | When broader context is needed |
| 10+ | Comprehensive coverage, higher token usage |
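For example, if the upstream Retrieval node returns 10 documents and Top K is set to 4, all 10 documents are scored but only the 4 highest-scoring ones are passed downstream.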
How Reranking Works
Scoring Process
- Document Preparation — Each retrieved document is formatted for LLM evaluation
- Relevance Assessment — LLM scores each document’s relevance to the query (0.0 to 1.0)
- Token Management — Large documents are intelligently truncated to fit model context
- Parallel Processing — Documents are scored in parallel for efficiency
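As a rough illustration of how these steps might fit together (a sketch under assumptions: the character-based truncation limit and the `llm_relevance` callable are placeholders, not the node's actual internals):

```python
from concurrent.futures import ThreadPoolExecutor

MAX_DOC_CHARS = 4000  # illustrative truncation limit, not the node's real value

def truncate(text: str, limit: int = MAX_DOC_CHARS) -> str:
    """Token management: trim oversized documents to fit the model context."""
    return text if len(text) <= limit else text[:limit]

def score_documents(query: str, documents: list[str], llm_relevance) -> list[float]:
    """Relevance assessment: one 0.0-1.0 score per document, computed in parallel."""
    prepared = [truncate(doc) for doc in documents]  # document preparation
    with ThreadPoolExecutor() as pool:               # parallel processing
        return list(pool.map(lambda doc: llm_relevance(query, doc), prepared))
```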
Metadata Added
After reranking, each document includes additional metadata:

| Field | Description |
|---|---|
| rerank_score | Relevance score from 0.0 (irrelevant) to 1.0 (highly relevant) |
| rerank_position | New position after reranking (1 = most relevant) |
| original_score | Original retrieval score for comparison |
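For illustration, a reranked document might look roughly like the following. The content text and the metadata key names (file_name, page_number) are made up for this example; the three rerank fields correspond to the table above.

```json
{
  "content": "Quarterly revenue grew 12% year over year...",
  "rerank_score": 0.87,
  "rerank_position": 1,
  "original_score": 0.74,
  "metadata": {
    "file_name": "q3-report.pdf",
    "page_number": 4
  }
}
```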
Pipeline Examples
Standard Reranking Pipeline
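A typical wiring connects Retrieval → Reranking → LLM → Response, so the LLM generates its answer from only the top-ranked documents.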
Evaluation Pipeline with Reranking
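Here the pipeline also includes an Analysis node (for example Retrieval → Reranking → LLM → Analysis) so you can evaluate pipeline performance with and without reranking.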
Smart RAG with Reranking
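Smart RAG output can feed the Reranking node in the same way: Smart RAG → Reranking → LLM → Response.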
Viewing Results
After running the pipeline (click Update Results):

- Results show documents grouped by question
- Each document displays:
  - Question — The query being answered
  - Content — Document text
  - Rerank Score — LLM-assigned relevance score
  - Rerank Position — New ranking position
  - Original metadata — File name, page number, etc.
JSON View
Toggle JSON to see the raw result structure.

Performance Considerations
Latency Impact
Reranking adds LLM calls to your pipeline:

| Documents | Approximate Additional Time |
|---|---|
| 5 | ~1-2 seconds |
| 10 | ~2-4 seconds |
| 20 | ~4-8 seconds |
Documents are scored in parallel batches, so the relationship isn’t strictly linear. Actual times depend on document size and LLM response time.
Token Usage
Each document requires tokens for:

- Query text
- Document content (truncated if needed)
- Scoring prompt template
Optimization Tips
- Reduce retrieval Top-K — Retrieve fewer documents to rerank
- Use efficient chunking — Smaller chunks = faster scoring
- Balance quality vs. speed — Not all pipelines need reranking
Best Practices
- Use with high Top-K retrieval — Reranking adds most value when filtering many results
- Position before LLM — Rerank first, then generate responses with better context
- Monitor scores — Low rerank scores across the board may indicate retrieval issues
- Compare with/without — Use Analysis node to measure the impact of reranking
Troubleshooting
Reranking is slow
To improve performance:
- Reduce Top-K in the upstream Retrieval node
- Use smaller chunk sizes in Chunking
- Consider if reranking is necessary for your use case
All documents have similar scores
If scores cluster together:
- Query may be too broad or vague
- Documents may all be equally relevant
- Check if retrieval is returning appropriate content
Good documents getting low scores
If relevant documents score poorly:
- Verify chunking preserves semantic meaning
- Check if documents are being truncated too aggressively
- Review the query phrasing
Reranking fails with errors
If seeing errors:
- Check LLM API connectivity
- Verify API tokens are valid
- The node has built-in retries; persistent failures indicate infrastructure issues

