ai · 2 min read
RAG Pipelines: Beyond Naive Retrieval
Most RAG implementations leave accuracy on the table. Here's how to improve retrieval with re-ranking, hybrid search, and contextual chunking.
The Problem with Naive RAG
Most tutorials show you the basics: chunk documents, embed them, retrieve the top-K chunks, and stuff them into a prompt. This works for demos but tends to fail in production because:
- Chunks lose context from surrounding text
- Embedding similarity doesn't equal relevance
- Top-K retrieval misses nuanced matches
- No feedback loop for improvement
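For reference, the naive pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the chunk size, function names, and prompt template are arbitrary choices, not taken from any particular tutorial.

```python
import math
from collections import Counter


def fixed_size_chunks(text: str, size: int = 50) -> list[str]:
    """Split text into consecutive chunks of `size` words (no overlap)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def naive_retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = toy_embed(query)
    return sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Stuff the retrieved chunks into a single prompt string."""
    context = "\n\n".join(chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {query}"
```

Every step here is a single similarity lookup with no notion of document structure, which is where the weaknesses listed above come from.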
Better Chunking Strategies
Contextual Chunking
Instead of fixed-size chunks, use document structure:
def contextual_chunk(document: str) -> list[str]:
    """Chunk by headers, preserving parent context."""
    # split_by_headers is assumed to return sections that expose
    # `header_chain` (list of header titles) and `content` (section text).
    sections = split_by_headers(document)
    chunks = []
    for section in sections:
        # Prepend parent header chain for context
        context = " > ".join(section.header_chain)
        chunks.append(f"{context}\n\n{section.content}")
    return chunks
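The snippet above relies on a `split_by_headers` helper that the post does not define. One possible version is sketched below, assuming the documents are Markdown with `#`-style headers; the `Section` dataclass and the regex are hypothetical choices for illustration, and this simple version does not skip `#` lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass


@dataclass
class Section:
    header_chain: list[str]  # e.g. ["Guide", "Install"]
    content: str


# One to six '#' characters, whitespace, then the header title
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_by_headers(document: str) -> list[Section]:
    """Split a Markdown document into sections, tracking parent headers."""
    sections: list[Section] = []
    stack: list[tuple[int, str]] = []  # (level, title) of currently open headers
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:  # skip headers that have no body text of their own
            sections.append(Section([title for _, title in stack], text))
        body.clear()

    for line in document.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close headers at the same or deeper level before opening this one
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return sections
```

With this helper, a paragraph under "Guide", then "Install", would be emitted with the prefix `Guide > Install`, so the chunk still says where it came from after it is separated from the rest of the page.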
Overlapping Windows
Add overlap between chunks so important passages at boundaries aren't lost:
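def overlapping_chunks(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    chunks = []
    for i in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[i:i + size]))
        if i + size >= len(words):
            break  # this window reached the end; skip a redundant tail chunk
    return chunks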
Hybrid Search
Combine dense (embedding) and sparse (BM25) retrieval: run both retrievers on the query, then merge their ranked lists into a single candidate set.
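The post does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method behind the numbers in this post. The function name and the `top_n` parameter are illustrative; `k = 60` is a frequently used default for the RRF constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60, top_n: int = 10
) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return fused[:top_n]


# Hypothetical document IDs returned by each retriever, best first
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF only uses rank positions, so it avoids having to normalize BM25 scores and cosine similarities onto a common scale. Documents that appear in both lists ("a" and "c" here) rise above those found by only one retriever.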
Re-ranking
After initial retrieval, use a cross-encoder to re-score results:
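from sentence_transformers import CrossEncoder

# A cross-encoder scores each (query, passage) pair jointly
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, passages: list[str], top_k: int = 5) -> list[str]:
    """Return the top_k passages ordered by cross-encoder score."""
    pairs = [(query, p) for p in passages]
    scores = reranker.predict(pairs)
    # Sort on the score only, highest first
    ranked = sorted(zip(scores, passages), key=lambda pair: pair[0], reverse=True)
    return [p for _, p in ranked[:top_k]]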
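Re-ranking pays off when the first stage returns more candidates than the prompt will eventually hold. A minimal sketch of that two-stage flow follows, with the retriever and scorer passed in as plain callables so it does not depend on any particular library. The names `retrieve_then_rerank`, `retrieve`, and `score_pairs`, and the default of 50 candidates, are assumptions for illustration.

```python
from typing import Callable, Sequence


def retrieve_then_rerank(
    query: str,
    retrieve: Callable[[str, int], list[str]],
    score_pairs: Callable[[list[tuple[str, str]]], Sequence[float]],
    n_candidates: int = 50,
    top_k: int = 5,
) -> list[str]:
    """Fetch a wide candidate pool cheaply, then keep the best top_k after re-scoring."""
    candidates = retrieve(query, n_candidates)
    if not candidates:
        return []
    scores = score_pairs([(query, passage) for passage in candidates])
    # Sort on the score only, highest first
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in ranked[:top_k]]
```

In practice `score_pairs` could be the `predict` method of the cross-encoder loaded above, and `retrieve` whatever first-stage search the pipeline already uses. The cross-encoder is the expensive step, so `n_candidates` sets the trade-off between latency and how many borderline passages get a second look.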
Results
On our internal documentation corpus (10K pages), these improvements yielded:
- Contextual chunking: +18% relevance vs fixed-size
- Hybrid search: +12% recall vs embedding-only
- Re-ranking: +22% precision in top-5 results
The combination of all three brought our RAG pipeline from "sometimes useful" to "reliably accurate."
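The post does not describe how these percentages were measured. For readers who want to run a similar comparison on their own corpus, here is one conventional way to compute precision and recall over the top-k results, assuming each test query comes with a hand-labeled set of relevant document IDs. The function names are illustrative.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k slots filled with relevant IDs."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of all relevant IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)
```

Averaging these over a fixed set of labeled queries, before and after each change, gives the kind of feedback loop that naive pipelines lack.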